How to match elements containing a string from a BeautifulSoup list?
I have the Input.html below:
https://jsfiddle.net/f86q7ubm/
I am trying to match all the elements in the list allList that have size=5, but when I run the following code, matching ends up empty.
from bs4 import BeautifulSoup
fp = open("file.html", "rb")
soup = BeautifulSoup(fp,"html5lib")
allList = soup.find_all(True)
matching = [s for s in allList if 'size="5"' in s]
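What am I doing wrong?
There is probably (there should be) a better way, but you can use str(s). You were testing membership against a Tag object rather than a string; for a Tag, the in operator looks through the tag's children, not its markup: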
from bs4 import BeautifulSoup
fp = open("file.html", "rb")
soup = BeautifulSoup(fp,"html5lib")
allList = soup.find_all(True)
matching = [s for s in allList if 'size="5"' in str(s)]
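To make the difference concrete, here is a minimal sketch. The markup is a hypothetical stand-in for the linked Input.html, and html.parser is used only so the snippet has no extra dependency. It also shows a side effect of the str(s) approach: every ancestor of a matching element matches too, because its serialized markup contains the child.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for the linked Input.html
html = '<div><font size="5">A</font><font size="3">B</font></div>'
soup = BeautifulSoup(html, "html.parser")
all_tags = soup.find_all(True)

# `in` on a Tag looks through the tag's direct children, not its markup,
# so a fragment of attribute text is never found
by_contents = [t for t in all_tags if 'size="5"' in t]

# Substring test on the serialized markup; the wrapping <div> matches as well,
# because its markup includes the <font size="5"> child
by_markup = [t.name for t in all_tags if 'size="5"' in str(t)]

print(by_contents)
print(by_markup)
```

If only the elements that carry the attribute themselves are wanted, the substring test is too broad and an attribute search is the safer choice.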
Not sure if this is what you want, but a better approach might be:
allList = soup.find_all("font", {"size": "5"}) # you already have the matching elements here
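As a self-contained sketch of the attribute search, assuming a small made-up document in place of the linked Input.html: passing the attribute value as the string "5" matches it as it appears in the markup, and leaving out the tag name matches any element with that attribute.

```python
from bs4 import BeautifulSoup

# Hypothetical markup; the real Input.html is not reproduced here
html = (
    '<div>'
    '<font size="5">first</font>'
    '<font size="3">skip</font>'
    '<font size="5">second</font>'
    '</div>'
)
soup = BeautifulSoup(html, "html.parser")

# Only <font> tags whose size attribute is exactly "5"
matches = soup.find_all("font", attrs={"size": "5"})
texts = [tag.get_text() for tag in matches]

# Any tag, whatever its name, whose size attribute is "5"
any_tag = soup.find_all(attrs={"size": "5"})

print(texts)
print(len(any_tag))
```

Unlike the substring test on str(s), this returns only the elements that carry the attribute, not their ancestors.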
from bs4 import BeautifulSoup
# html is assumed to hold the page markup as a string
soup = BeautifulSoup(html, 'html.parser')
for item in soup.find_all("font", {'size': '5'}):
    print(item.text)
Output:
TEXT S 5 MORE TEXT
TEXT S 5 MORE TEXT
TEXT S 5 MORE TEXT