How to use re.findall to get the URL string?
"foldGroup.registerImage({ domId: 'listimg7', srcUrl: 'https://ec.yimg.com/ec/?url=https%3A%2F%2Fd3vv6xw699rjh3.cloudfront.net%2F9f689b-1904037587_1_160.jpg&t=1460964135&ttl=43200&maxWidth=160&maxHeight=160&sig=QSY1BP0sCebMxqEN6irjXQ--~C' });"
This is part of the HTML from a Yahoo Shopping page, for example:
https://shopping.yahoo.com/womens-intimate-apparel/?b=3937
My question is: how can I use Python's re.findall() to find all the img URLs?
re.findall(r"'https://.*?'", part_of_html)
re.findall(pattern, string, flags=0) Return all non-overlapping matches of pattern in string, as a list of strings. The string is scanned left-to-right, and matches are returned in the order found. If one or more groups are present in the pattern, return a list of groups; this will be a list of tuples if the pattern has more than one group. Empty matches are included in the result unless they touch the beginning of another match.
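As a rough sketch of how this plays out on the sample string from the question: the pattern in the answer returns each match with its surrounding single quotes, while wrapping the URL in one capture group makes re.findall return only the group, as the documentation excerpt describes. The names with_quotes, img_urls and original_urls are made up for illustration, and the last step assumes every matched URL carries the real image address in a "url" query parameter, which holds for this sample but may not hold for the whole page.

```python
import re
from urllib.parse import urlsplit, parse_qs

# Sample string from the question, split across lines only for readability.
part_of_html = (
    "foldGroup.registerImage({ domId: 'listimg7', srcUrl: "
    "'https://ec.yimg.com/ec/?url=https%3A%2F%2Fd3vv6xw699rjh3.cloudfront.net"
    "%2F9f689b-1904037587_1_160.jpg&t=1460964135&ttl=43200&maxWidth=160"
    "&maxHeight=160&sig=QSY1BP0sCebMxqEN6irjXQ--~C' });"
)

# Pattern from the answer: no group, so each match keeps its quotes.
with_quotes = re.findall(r"'https://.*?'", part_of_html)

# One capture group: findall returns only the text inside the parentheses.
img_urls = re.findall(r"'(https://[^']*)'", part_of_html)

# Assumption: the underlying image address is percent-encoded in the
# "url" query parameter; parse_qs decodes it.
original_urls = [parse_qs(urlsplit(u).query)["url"][0] for u in img_urls]

print(with_quotes)
print(img_urls)
print(original_urls)
```

Using [^']* instead of .*? is a matter of taste here; both stop at the closing quote, but the negated class cannot run past it if the text contains newlines or re.DOTALL is set.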