How to extract br text from span element?

Using Beautiful Soup v4, I have a span like this:

<span style="color: grey;">32.44 MB<br/>10454 Downloads<br/>35:25 Mins<br/>128kbps Stereo</span>

I want to extract each piece of text separated by the br elements individually. How can I do that?

This may not be the right approach, but if you treat the span as a plain string, you can pull the pieces out like this:

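html = '<span style="color: grey;">32.44 MB<br/>10454 Downloads<br/>35:25 Mins<br/>128kbps Stereo</span>'

word_list = []
# Split on the <br/> separators, then trim the span tags off both ends
for word in html.split("<br/>"):
    if word.startswith("<"):
        # Drop the opening <span ...> tag from the first piece
        word = word[word.index(">") + 1:]
    if word.endswith("</span>"):
        # Drop the closing tag from the last piece
        word = word[:-len("</span>")]
    if word:
        word_list.append(word)
print(word_list)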

Try this:

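from bs4 import BeautifulSoup

txt = '''<span style="color: grey;">32.44 MB<br/>10454 Downloads<br/>35:25 Mins<br/>128kbps Stereo</span>'''

soup = BeautifulSoup(txt, 'html.parser')

# Select every <br> inside a <span> and print the text node that follows it.
# The text before the first <br> ("32.44 MB") is not printed by this loop.
for tag in soup.select('span br'):
    print(tag.next_sibling)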

Output:

10454 Downloads
35:25 Mins
128kbps Stereo
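
Note that this output only covers the text following each br, so the leading "32.44 MB" is missing. If all four pieces are wanted, one possible alternative (a sketch, not necessarily what the answer above intended) is to iterate over the span's `stripped_strings`, which yields every text node inside the tag with surrounding whitespace removed. The variable names `html`, `span`, and `parts` are only illustrative.

```python
from bs4 import BeautifulSoup

html = '<span style="color: grey;">32.44 MB<br/>10454 Downloads<br/>35:25 Mins<br/>128kbps Stereo</span>'

soup = BeautifulSoup(html, "html.parser")
span = soup.find("span")

# stripped_strings yields each text node inside the span, whitespace-trimmed,
# so the <br/> tags act as natural separators between the pieces.
parts = list(span.stripped_strings)
print(parts)
```

Each element of `parts` can then be handled separately, for example `parts[0]` for the file size and `parts[1]` for the download count.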