Python - How do I target a class inside another class using BeautifulSoup?
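I am learning to build a crawler with BeautifulSoup and Python 3, and I have run into a problem: the data I want to get from a website is marked up with more than one class. Here is an example: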
<tr class="phone">
<a href="..." class="number"></a>
</tr>
<tr class="mobile">
<a href="..." class="number"></a>
</tr>
This is what I want to do in Python:
for num in soup.findAll('a', {'class':'mobile -> number'}):
    print(num.string)
How should I target the class .mobile .number?
Use find_all() to get the elements with class "number", then iterate over the list and print those whose parent has the class "mobile".
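for dom in soup.find_all("a", "number"):
    # dom.parent is an attribute, not a method; its "class" value is a list of class names
    # ("class" is a reserved word in Python, so the loop variable is named cls)
    for cls in dom.parent.get("class", []):
        if cls == "mobile":
            print(dom.string)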
Or use select() with a CSS-style selector:
for dom in soup.select("tr.mobile a.number"):
    print(dom.string)
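To compare the two approaches side by side, here is a minimal, self-contained sketch. It assumes the built-in "html.parser" backend (other parsers may restructure bare <tr> tags that sit outside a <table>), and the link text is made up so that there is something to print.

```python
from bs4 import BeautifulSoup

# Markup modelled on the question; the link text is an illustrative assumption.
html_doc = """
<tr class="phone"><a href="#" class="number">555-0100</a></tr>
<tr class="mobile"><a href="#" class="number">555-0199</a></tr>
"""

soup = BeautifulSoup(html_doc, "html.parser")

# Approach 1: find every <a class="number">, then keep those whose
# direct parent carries the class "mobile".
via_parent = [
    a.string
    for a in soup.find_all("a", class_="number")
    if "mobile" in a.parent.get("class", [])
]

# Approach 2: let a CSS descendant selector do the filtering.
via_select = [a.string for a in soup.select("tr.mobile a.number")]

print(via_parent)
print(via_select)
```

Both lists should contain the same single entry. The select() version is shorter and also matches anchors nested deeper inside the row, whereas the parent check only looks one level up.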
You can use soup.select to find items according to a CSS selector.
from bs4 import BeautifulSoup
html_doc = '''<tr class="phone">
<a href="tel:+18005551212" class="number"></a>
</tr>
<tr class="mobile">
<a href="+13034997111" class="number"></a>
</tr> '''
# Name the parser explicitly to avoid a warning and parser-dependent results
soup = BeautifulSoup(html_doc, "html.parser")
# Find any tag with a class of "number"
# that is a descendant of a tag with
# a class of "mobile"
mobiles = soup.select(".mobile .number")
print(mobiles)
# Find a tag with a class of "number"
# that is a direct child
# of a tag with a class of "mobile"
mobiles = soup.select(".mobile > .number")
print(mobiles)
# Find an <a class="number"> tag that is a direct
# child of a <tr class="mobile"> tag.
mobiles = soup.select("tr.mobile > a.number")
print(mobiles)
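Note that the anchors in this sample are empty, so .string is None for each of them. If the number is carried by the href attribute instead, a sketch along the following lines reads it from there. Stripping a leading "tel:" is an assumption about how such links are written, not something the question states.

```python
from bs4 import BeautifulSoup

html_doc = """
<tr class="phone"><a href="tel:+18005551212" class="number"></a></tr>
<tr class="mobile"><a href="+13034997111" class="number"></a></tr>
"""

soup = BeautifulSoup(html_doc, "html.parser")

numbers = []
for a in soup.select("tr.mobile > a.number"):
    # The tag has no text, so a.string is None; read the attribute instead.
    href = a.get("href", "")
    # Assumption: drop an optional "tel:" scheme prefix if present.
    if href.startswith("tel:"):
        href = href[len("tel:"):]
    numbers.append(href)

print(numbers)
```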