How to extract the br-separated text from a span element?
Using Beautiful Soup v4, I have a span like this:
<span style="color: grey;">32.44 MB<br/>10454 Downloads<br/>35:25 Mins<br/>128kbps Stereo</span>
I want to extract each piece of text separated by the br elements individually. How can I do that?
This may not be the proper way to do it, but if you treat the span as a plain string, you can extract the pieces like this:
import re

html = '<span style="color: grey;">32.44 MB<br/>10454 Downloads<br/>35:25 Mins<br/>128kbps Stereo</span>'
word_list = []
for piece in html.split("<br/>"):
    # Strip any leftover tags (the opening <span ...> and the closing </span>)
    text = re.sub(r"<[^>]+>", "", piece).strip()
    if text:
        word_list.append(text)
print(word_list)
Try this:
from bs4 import BeautifulSoup

txt = '''<span style="color: grey;">32.44 MB<br/>10454 Downloads<br/>35:25 Mins<br/>128kbps Stereo</span>'''
soup = BeautifulSoup(txt, 'html.parser')
# For each <br/> inside the span, print the text node that follows it
for tag in soup.select('span br'):
    print(tag.next_sibling)
Output:
10454 Downloads
35:25 Mins
128kbps Stereo
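Note that the output above starts at the second value, because "32.44 MB" comes before the first br rather than after one. If you need all four values, one option is to iterate over the span's text nodes instead of its br tags. This is a minimal sketch, assuming the span is the first one in the document; the variable names `html`, `span` and `parts` are only illustrative.

```python
from bs4 import BeautifulSoup

html = '<span style="color: grey;">32.44 MB<br/>10454 Downloads<br/>35:25 Mins<br/>128kbps Stereo</span>'
soup = BeautifulSoup(html, "html.parser")
span = soup.find("span")  # assumption: the first span is the one we want

# stripped_strings yields every text node inside the span with surrounding
# whitespace trimmed, so the text before the first <br/> is included too.
parts = list(span.stripped_strings)
print(parts)
# ['32.44 MB', '10454 Downloads', '35:25 Mins', '128kbps Stereo']
```

If a single delimited string is more convenient, `span.get_text(separator="\n", strip=True)` should give the same four pieces joined by newlines.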