BeautifulSoup returns a list for the attribute "class" but a plain value for other attributes
BeautifulSoup makes parsing HTML in Python easy, but the result of the code below confuses me.
from bs4 import BeautifulSoup

tr = """
<table>
<tr class="passed" id="row1"><td>t1</td></tr>
<tr class="failed" id="row2"><td>t2</td></tr>
</table>
"""
table = BeautifulSoup(tr, "html.parser")
for row in table.find_all("tr"):
    print(row["class"])  # multi-valued attribute: printed as a list
    print(row["id"])     # ordinary attribute: printed as a string
Result:
[u'passed']
row1
[u'failed']
row2
Why does the attribute class return a list, while id returns a plain value?
I am using beautifulsoup4-4.5.0 with Python 2.7.
class is a special multi-valued attribute in BeautifulSoup:
HTML 4 defines a few attributes that can have multiple values. HTML 5 removes a couple of them, but defines a few more. The most common multi-valued attribute is class (that is, a tag can have more than one CSS class).
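As a minimal sketch of what this means in practice (the HTML snippet and the variable names here are made up for illustration), the class attribute comes back as a list while id stays a string, and the list can be joined if a single string is needed:

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup with two CSS classes on one row.
html = '<table><tr class="passed highlighted" id="row1"><td>t1</td></tr></table>'
row = BeautifulSoup(html, "html.parser").find("tr")

classes = row["class"]            # multi-valued attribute: a list of strings
row_id = row["id"]                # ordinary attribute: a single string
class_string = " ".join(classes)  # rebuild one space-separated string

print(classes)
print(row_id)
print(class_string)
```

Joining with a single space does not necessarily reproduce the original whitespace of the attribute, only the class names in order.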
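Sometimes this is awkward to deal with, for example when you want to apply a regular expression to the whole class attribute value.
It is possible to turn this behavior off, but I would not recommend it.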
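For completeness, here is a hedged sketch of one way to get the class value as a single string so that a regular expression can be applied to it. It assumes a newer Beautiful Soup release than the 4.5.0 mentioned in the question: as far as I know, the multi_valued_attributes constructor argument only exists from 4.8.0 onward. The markup and variable names are illustrative.

```python
import re
from bs4 import BeautifulSoup

html = '<table><tr class="passed a b c" id="row1"><td>t1</td></tr></table>'

# Assumption: multi_valued_attributes=None is accepted only by newer
# Beautiful Soup releases; it makes every attribute a plain string.
plain_soup = BeautifulSoup(html, "html.parser", multi_valued_attributes=None)
class_value = plain_soup.find("tr")["class"]  # one string, not a list

# Apply a regular expression to the whole attribute value.
starts_with_passed = re.search(r"^passed\b", class_value) is not None

print(class_value)
print(starts_with_passed)
```

On older versions, joining the list with " ".join(row["class"]) before matching gives a similar result without changing how the document is parsed.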
Because an element may have more than one class.
Consider this example:
from bs4 import BeautifulSoup

tr = """
<table>
<tr class="passed a b c" id="row1"><td>t1</td></tr>
<tr class="failed" id="row2"><td>t2</td></tr>
</table>
"""
table = BeautifulSoup(tr, "html.parser")
for row in table.find_all("tr"):
    print(row["class"])  # one list entry per CSS class
    print(row["id"])
['passed', 'a', 'b', 'c']
row1
['failed']
row2
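Because the value is a list, checking for one particular class is straightforward. The following is a small sketch, not the only way to do it: it reuses the table from the example above, and the names passed_ids and failed_ids are just illustrative.

```python
from bs4 import BeautifulSoup

html = """
<table>
<tr class="passed a b c" id="row1"><td>t1</td></tr>
<tr class="failed" id="row2"><td>t2</td></tr>
</table>
"""
soup = BeautifulSoup(html, "html.parser")

# class_ matches a tag if any one of its CSS classes equals the given name.
passed_ids = [row["id"] for row in soup.find_all("tr", class_="passed")]

# Plain list membership works too; get() avoids a KeyError on rows without class.
failed_ids = [row["id"] for row in soup.find_all("tr")
              if "failed" in row.get("class", [])]

print(passed_ids)
print(failed_ids)
```

Here row1 is found by class_="passed" even though it carries three other classes, which is the main benefit of treating class as multi-valued.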