■ Shows how to use ".*" in a pattern built with the compile function to extract a string by greedy matching.
▶ Example code (PY)
import re
import urllib.request

# Fetch the page; the with-block closes the response automatically.
with urllib.request.urlopen("http://www.example.com") as httpResponse:
    htmlBytes = httpResponse.read()

# Decode the raw bytes as UTF-8 (the charset the page declares).
# str(htmlBytes) would yield the "b'...'" repr with escaped newlines instead of the real text.
html = htmlBytes.decode("utf-8")

# ".*" is greedy, and re.S lets "." match newlines as well,
# so the single match runs from the first "<" to the last ">".
pattern = re.compile(r"<.*>", re.I | re.S)
list1 = pattern.findall(html)
print(list1)

"""
['<!doctype html>\n<html>\n<head>\n <title>Example Domain</title>\n\n <meta charset="utf-8" />\n <meta http-equiv="Content-type" content="text/html; charset=utf-8" />\n <meta name="viewport" content="width=device-width, initial-scale=1" />\n <style type="text/css">\n body {\n background-color: #f0f0f2;\n margin: 0;\n padding: 0;\n font-family: -apple-system, system-ui, BlinkMacSystemFont, "Segoe UI", "Open Sans", "Helvetica Neue", Helvetica, Arial, sans-serif;\n \n }\n div {\n width: 600px;\n margin: 5em auto;\n padding: 2em;\n background-color: #fdfdff;\n border-radius: 0.5em;\n box-shadow: 2px 3px 7px 2px rgba(0,0,0,0.02);\n }\n a:link, a:visited {\n color: #38488f;\n text-decoration: none;\n }\n @media (max-width: 700px) {\n div {\n margin: 0 auto;\n width: auto;\n }\n }\n </style> \n</head>\n\n<body>\n<div>\n <h1>Example Domain</h1>\n <p>This domain is for use in illustrative examples in documents. You may use this\n domain in literature without prior coordination or asking for permission.</p>\n <p><a href="https://www.iana.org/domains/example">More information...</a></p>\n</div>\n</body>\n</html>']
"""
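The whole page comes back as a single list element because ".*" consumes as much text as it can. A minimal sketch of that behavior, offline and on a small string, is given here for comparison with the non-greedy form ".*?". The `sample` text and the names `greedy`, `lazy`, `greedy_matches`, and `lazy_matches` are assumptions made for illustration and do not come from the original example.

```python
import re

# Hypothetical sample text, standing in for a downloaded page.
sample = "<p>first</p>\n<p>second</p>"

# Greedy: ".*" takes as many characters as possible, newlines included (re.S).
greedy = re.compile(r"<.*>", re.I | re.S)

# Non-greedy: ".*?" stops at the first ">" that completes a match.
lazy = re.compile(r"<.*?>", re.I | re.S)

greedy_matches = greedy.findall(sample)
lazy_matches = lazy.findall(sample)

print(greedy_matches)  # one match covering the whole string
print(lazy_matches)    # one match per tag
```

With this input, the greedy pattern returns one element spanning the first "<" through the last ">", while the non-greedy pattern returns each tag separately. Which form fits depends on whether the goal is the outermost span or the individual tags.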