Why does find_all raise an error when find alone does not? (Python Beautiful Soup)
I am trying to get the song titles from the Billboard Hot 100.
The image shows the page's HTML.
I wrote this code:
from bs4 import BeautifulSoup
import urllib.request
url= 'http://www.billboard.com/charts/year-end/2015/hot-100-songs'
page = urllib.request.urlopen(url)
soup = BeautifulSoup(page.read(), "html.parser")
songtitle = soup.find("div", {"class": "row-title"}).h2.contents
print(songtitle)
It retrieves the first title, "UPTOWN FUNK!"
When I use find_all instead, I get an error:
line 6, in <module>
songtitle = soup.find_all("div", {"class": "row-title"}).h2.contents
AttributeError: 'ResultSet' object has no attribute 'h2'
Why does it give me an error instead of all the titles? The full HTML can be viewed by pressing Ctrl+Shift+J in Chrome on this page: http://www.billboard.com/charts/year-end/2015/hot-100-songs
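.find_all() returns a ResultSet object, which is basically a list of Tag instances; it does not have Tag attributes such as .h2 or the find() method. You need to loop over the results of find_all() and look up the h2 on each tag: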
for item in soup.find_all("div", {"class": "row-title"}):
    songtitle = item.h2.contents
    print(songtitle)
Or, use a CSS selector:
for title in soup.select("div.row-title h2"):
    print(title.get_text())
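To compare the two approaches without hitting the live site, here is a minimal sketch that runs both against a small inline snippet. The `SAMPLE_HTML` markup is an assumption that imitates the structure described in the question, not the actual page source. It also shows that `.contents` yields a list of child nodes, while `get_text()` yields a plain string.

```python
from bs4 import BeautifulSoup

# Hypothetical markup imitating the chart page's structure (an assumption).
SAMPLE_HTML = (
    '<div class="row-title"><h2>UPTOWN FUNK!</h2></div>'
    '<div class="row-title"><h2>THINKING OUT LOUD</h2></div>'
)

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# Approach 1: loop over find_all() and read .contents of each <h2>.
# .contents is a list of child nodes, so each title arrives wrapped in a list.
contents_based = [
    [str(child) for child in item.h2.contents]
    for item in soup.find_all("div", {"class": "row-title"})
]

# Approach 2: a CSS selector that matches the <h2> elements directly.
# get_text() returns the text as a plain string.
selector_based = [title.get_text() for title in soup.select("div.row-title h2")]

print(contents_based)
print(selector_based)
```

If only the text is needed, the selector version is usually the shorter of the two, since it skips the intermediate `div` lookup.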
By the way, this problem is covered in the documentation:
AttributeError: 'ResultSet' object has no attribute 'foo' - This usually happens because you expected find_all() to return a single tag or string. But find_all() returns a list of tags and strings, a ResultSet object. You need to iterate over the list and look at the .foo of each one. Or, if you really only want one result, you need to use find() instead of find_all().
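The situation the documentation describes can be reproduced in a few lines. This is a sketch under the assumption that the page's rows look like the hypothetical `SAMPLE_HTML` below; it captures the exception instead of crashing so that the working and failing calls can sit side by side.

```python
from bs4 import BeautifulSoup

# Hypothetical markup imitating the chart page's structure (an assumption).
SAMPLE_HTML = (
    '<div class="row-title"><h2>UPTOWN FUNK!</h2></div>'
    '<div class="row-title"><h2>THINKING OUT LOUD</h2></div>'
)

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# find() returns a single Tag, so .h2 works on it.
first_title = soup.find("div", {"class": "row-title"}).h2.get_text(strip=True)

# find_all() returns a ResultSet (a list of Tags); the list itself has no .h2.
rows = soup.find_all("div", {"class": "row-title"})
try:
    rows.h2
    error_message = None
except AttributeError as exc:
    error_message = str(exc)

# Each element of the ResultSet is a Tag, so .h2 works per item.
all_titles = [row.h2.get_text(strip=True) for row in rows]

print(first_title)
print(all_titles)
```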
find_all always returns a list, so you can use ordinary list operations on the result.
For example,
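# Index into the list to pick one row, then read the text of its <h2>.
# (Tag.get() reads HTML attributes, not child tags, so it is not used here.)
songtitle = soup.find_all("div", {"class": "row-title"})[0]
print(songtitle.h2.get_text(strip=True))
songtitle = soup.find_all("div", {"class": "row-title"})[1]
print(songtitle.h2.get_text(strip=True))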
Output:
UPTOWN FUNK!
THINKING OUT LOUD
# Or loop over every match:
for item in soup.find_all("div", {"class": "row-title"}):
    songtitle = item.h2.get_text(strip=True)
    print(songtitle)
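Since the result behaves like a list, indexing, slicing, and `len()` all apply. The sketch below illustrates this on hypothetical inline markup (`SAMPLE_HTML` is an assumption, and the third title is a placeholder rather than real chart data).

```python
from bs4 import BeautifulSoup

# Hypothetical markup; the third entry is a placeholder, not real chart data.
SAMPLE_HTML = (
    '<div class="row-title"><h2>UPTOWN FUNK!</h2></div>'
    '<div class="row-title"><h2>THINKING OUT LOUD</h2></div>'
    '<div class="row-title"><h2>PLACEHOLDER SONG</h2></div>'
)

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")
rows = soup.find_all("div", {"class": "row-title"})

# len() reports how many rows matched.
row_count = len(rows)

# Slicing keeps only the first two rows.
top_two = [row.h2.get_text(strip=True) for row in rows[:2]]

# Negative indexing picks the last row.
last_title = rows[-1].h2.get_text(strip=True)

print(row_count, top_two, last_title)
```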