Accessing text in HTML using BeautifulSoup

Question

I'm trying to access the string "Out of stock" using BeautifulSoup, but I can't find a way to do it:

<span style="color: #727272; font-size: 14px; font-weight: normal;">
    <strong>Price: 0</strong>
     (Out of stock)
</span>

Can anyone give me a hint on how to do this?

Answer 1

Use the .next_sibling attribute to get the node that follows the <strong> tag:

span.strong.next_sibling

The string may have extra whitespace around it, so you can use str.strip() to clean it up.

Demo:

>>> from bs4 import BeautifulSoup
>>> soup = BeautifulSoup('''\
... <span style="color: #727272; font-size: 14px; font-weight: normal;">
...     <strong>Price: 0</strong>
...      (Out of stock)
... </span>
... ''', 'html.parser')
>>> soup.span.strong
<strong>Price: 0</strong>
>>> soup.span.strong.next_sibling
u'\n     (Out of stock)\n'
>>> soup.span.strong.next_sibling.strip()
u'(Out of stock)'
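
The same idea can be wrapped in a small Python 3 helper that does not fail when the <strong> tag is missing or is followed by another tag instead of text. This is only a sketch: the function name text_after_strong and the HTML constant are made up for illustration, and the parser is set to "html.parser" as an assumption.

```python
from bs4 import BeautifulSoup
from bs4.element import NavigableString

# Markup copied from the question; HTML is an illustrative name.
HTML = """\
<span style="color: #727272; font-size: 14px; font-weight: normal;">
    <strong>Price: 0</strong>
     (Out of stock)
</span>
"""


def text_after_strong(html):
    """Return the stripped text node following the first <strong>, or None."""
    soup = BeautifulSoup(html, "html.parser")
    strong = soup.find("strong")
    if strong is None:
        return None
    sibling = strong.next_sibling
    # next_sibling can be None or another tag rather than a text node.
    if isinstance(sibling, NavigableString):
        return sibling.strip()
    return None


stock_status = text_after_strong(HTML)
print(stock_status)
```

The isinstance check matters because .next_sibling returns whatever node comes next, which is not guaranteed to be a string.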

Answer 2

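import bs4

# html_text is assumed to hold the <span> markup from the question
soup = bs4.BeautifulSoup(html_text, 'html.parser')
# Index 2 of the newline-split text is the line that follows <strong>
soup.get_text().split('\n')[2].strip()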
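
Indexing into the split text depends on the exact line breaks in the markup. A possibly less fragile variant, sketched here under the assumption that the wanted text is the last non-empty string inside the <span>, uses the .stripped_strings generator; the names SPAN_HTML, parts and status are illustrative.

```python
from bs4 import BeautifulSoup

# Markup copied from the question; SPAN_HTML is an illustrative name.
SPAN_HTML = """\
<span style="color: #727272; font-size: 14px; font-weight: normal;">
    <strong>Price: 0</strong>
     (Out of stock)
</span>
"""

soup = BeautifulSoup(SPAN_HTML, "html.parser")

# stripped_strings yields each text node with surrounding whitespace
# removed and skips nodes that contain only whitespace.
parts = list(soup.span.stripped_strings)

# Assumption: the stock status is the last text fragment in the span.
status = parts[-1]
print(status)
```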


python

beautifulsoup