Python: Get data with BeautifulSoup
I need help with BeautifulSoup. I am trying to extract this data:
<font face="arial" font-size="16px" color="navy">001970000521</font>
There are many of them, and I need to get the value inside each "font" tag.
<div id="accounts" class="elementoOculto">
<table align="center" border="0" cellspacing=0 width="90%"> <tr><th align="left" colspan=2> permisos </th></tr><tr>
<td colspan=2>
<table width=100% align=center border=0 cellspacing=1>
<tr>
<th align=center width="20%">cuen</th>
<th align=center>Mods</th>
</tr>
</table>
</td>
</tr>
</table>
<table align="center" border="0" cellspacing=1 width="90%">
<tr bgcolor="whitesmoke" height="08">
<td align="left" width="20%">
<font face="arial" font-size="16px" color="navy">001970000521</font>
</td>
<td>......
<table align="center" border="0" cellspacing=1 width="90%">
<tr bgcolor="whitesmoke" height="08">
<td align="left" width="20%">
<font face="arial" font-size="16px" color="navy">001970000521</font>
</td>
I hope you can help me, thanks.
How about using a CSS selector that starts from the div with id="accounts"? Note that each font tag is a child of a td, not of a tr:
soup.select("div#accounts table tr > td > font")
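A minimal sketch of the selector approach, in case it helps. The markup below is a trimmed-down, properly closed version of the HTML from the question (that trimming is my assumption, not the real page), and `account_numbers` is just an illustrative name. `select` returns the matching tags, and `get_text(strip=True)` pulls out the text inside each one.

```python
from bs4 import BeautifulSoup

# Trimmed-down copy of the markup from the question (assumed for brevity).
html = """
<div id="accounts" class="elementoOculto">
<table align="center" border="0" cellspacing=1 width="90%">
<tr bgcolor="whitesmoke" height="08">
<td align="left" width="20%">
<font face="arial" font-size="16px" color="navy">001970000521</font>
</td>
</tr>
</table>
<table align="center" border="0" cellspacing=1 width="90%">
<tr bgcolor="whitesmoke" height="08">
<td align="left" width="20%">
<font face="arial" font-size="16px" color="navy">001970000521</font>
</td>
</tr>
</table>
</div>
"""

soup = BeautifulSoup(html, "html.parser")

# Every <font> that is a direct child of a <td> somewhere inside div#accounts.
fonts = soup.select("div#accounts table tr > td > font")

# Keep only the text content of each matched tag.
account_numbers = [font.get_text(strip=True) for font in fonts]
print(account_numbers)  # ['001970000521', '001970000521']
```

The descendant combinator (a space) between `table` and `tr` is deliberate: some parsers insert a `tbody` element, which would break a strict `table > tr` chain.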
You should use the bs4.Tag.find_all method or something similar.
soup.find_all(attrs={"face":"arial","font-size":"16px","color":"navy"})
Example:
>>> import bs4
>>> html = '''<div id="accounts" class="elementoOculto"> <table align="center" border="0" cellspacing=0 width="90%"> <tr><th align="left" colspan=2> permisos </th></tr><tr> <td colspan=2> <table width=100% align=center border=0 cellspacing=1> <tr> <th align=center width="20%">cuen</th> <th align=center>Mods</th> </tr> </table> </td> </tr> </table> <table align="center" border="0" cellspacing=1 width="90%"> <tr bgcolor="whitesmoke" height="08"> <td align="left" width="20%"> <font face="arial" font-size="16px" color="navy">001970000521</font> </td> <td>...... <table align="center" border="0" cellspacing=1 width="90%"> <tr bgcolor="whitesmoke" height="08"> <td align="left" width="20%"> <font face="arial" font-size="16px" color="navy">001970000521</font> </td> '''
>>> print(bs4.BeautifulSoup(html, "html.parser").find_all(attrs={"face":"arial","font-size":"16px","color":"navy"}))
[<font color="navy" face="arial" font-size="16px">001970000521</font>, <font color="navy" face="arial" font-size="16px">001970000521</font>]
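Note that find_all gives back the tags themselves, not the numbers. A rough sketch of going one step further and reading the text out of each match is below. The two-cell snippet is made up for illustration (the second cell with color="red" is hypothetical, only there to show that the attribute filter excludes it), and `values` is my own name.

```python
from bs4 import BeautifulSoup

# Small illustrative snippet: the second <font> is hypothetical and should be skipped.
html = '''<td><font face="arial" font-size="16px" color="navy">001970000521</font></td>
<td><font face="arial" font-size="16px" color="red">ignored</font></td>'''

soup = BeautifulSoup(html, "html.parser")

# Restrict the search to <font> tags carrying all three attribute values.
matches = soup.find_all(
    "font",
    attrs={"face": "arial", "font-size": "16px", "color": "navy"},
)

# Pull the text out of each matching tag.
values = [tag.get_text(strip=True) for tag in matches]
print(values)  # ['001970000521']
```

Passing the tag name as the first argument is optional, but it avoids matching some other element that happens to share the same attributes.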
How about this?
from bs4 import BeautifulSoup
# Named html rather than str so the built-in str type is not shadowed.
html = '''<div id="accounts" class="elementoOculto">
<table align="center" border="0" cellspacing=0 width="90%"> <tr><th align="left" colspan=2> permisos </th></tr><tr>
<td colspan=2>
<table width=100% align=center border=0 cellspacing=1>
<tr>
<th align=center width="20%">cuen</th>
<th align=center>Mods</th>
</tr>
</table>
</td>
</tr>
</table>
<table align="center" border="0" cellspacing=1 width="90%">
<tr bgcolor="whitesmoke" height="08">
<td align="left" width="20%">
<font face="arial" font-size="16px" color="navy">001970000521</font>
</td>
<td>......
<table align="center" border="0" cellspacing=1 width="90%">
<tr bgcolor="whitesmoke" height="08">
<td align="left" width="20%">
<font face="arial" font-size="16px" color="navy">001970000521</font>
</td>'''
# Name the parser explicitly so the result does not depend on what is installed.
bs = BeautifulSoup(html, "html.parser")
# bs.font is the first <font> tag only; .string is its text.
print(bs.font.string)
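One caveat with the attribute shortcut: `bs.font` only ever returns the first font tag, and `.string` is None when a tag has more than one child. Since the question says there are many values, here is a small sketch of collecting all of them. The second and third numbers and the nested `<b>` tag are invented for illustration, not taken from the real page.

```python
from bs4 import BeautifulSoup

# Hypothetical cells: only the first number comes from the question.
html = (
    "<td><font>001970000521</font></td>"
    "<td><font>001970000522</font></td>"
    "<td><font><b>0019</b>70000523</font></td>"
)
bs = BeautifulSoup(html, "html.parser")

# Shortcut access: first <font> only.
first = bs.font.string

# .string is None for the third tag because it has two children.
all_strings = [font.string for font in bs.find_all("font")]

# get_text() joins all nested text, so it works for every tag.
all_text = [font.get_text() for font in bs.find_all("font")]

print(first)
print(all_strings)
print(all_text)
```

If every font tag on the page holds plain text only, `.string` is enough; `get_text()` is just the safer choice when the markup might vary.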