Why does find() work without an error, but find_all() raises one? (Python Beautiful Soup)

Question

I am trying to get the song titles from the Billboard Hot 100 year-end chart. The screenshot shows the page's HTML.

I wrote this code:

from bs4 import BeautifulSoup
import urllib.request

url= 'http://www.billboard.com/charts/year-end/2015/hot-100-songs'
page = urllib.request.urlopen(url)
soup = BeautifulSoup(page.read(), "html.parser")
songtitle = soup.find("div", {"class": "row-title"}).h2.contents
print(songtitle)

It retrieves the first title, "UPTOWN FUNK!".
When I use find_all instead, I get an error:

line 6, in <module>
songtitle = soup.find_all("div", {"class": "row-title"}).h2.contents
AttributeError: 'ResultSet' object has no attribute 'h2'

Why does it give me an error instead of returning all the titles? The full HTML can be viewed in Chrome's developer tools (Control+Shift+J) on this page: http://www.billboard.com/charts/year-end/2015/hot-100-songs

Answer 1

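.find_all() returns a ResultSet object, which is essentially a list of Tag instances. A ResultSet has no find() method and does not support tag-name attribute access such as .h2. You need to iterate over the results of find_all() and access .h2 (or call find()) on each tag: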

for item in soup.find_all("div", {"class": "row-title"}):
    songtitle = item.h2.contents
    print(songtitle)

Alternatively, use a CSS selector:

for title in soup.select("div.row-title h2"):
    print(title.get_text())
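
The same find()/find_all() split exists for CSS selectors: select_one() returns a single Tag (or None), while select() returns a list of every match. Below is a minimal sketch that runs offline; the markup in the html variable is an assumption modeled on the "row-title" class from the question, not the real Billboard page.

```python
from bs4 import BeautifulSoup

# Hypothetical markup; only the "row-title" class comes from the question.
html = """
<div class="row-title"><h2>UPTOWN FUNK!</h2></div>
<div class="row-artist"><h2>Not a title</h2></div>
<div class="row-title"><h2>THINKING OUT LOUD</h2></div>
"""

soup = BeautifulSoup(html, "html.parser")

# select_one() gives the first matching Tag, so Tag methods work on it directly.
first_title = soup.select_one("div.row-title h2").get_text(strip=True)

# select() gives a list of Tags, so it has to be iterated.
selected_titles = [h2.get_text(strip=True) for h2 in soup.select("div.row-title h2")]

print(first_title)
print(selected_titles)
```

The selector only matches h2 elements nested inside div.row-title, so the unrelated div is skipped.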

By the way, this problem is covered in the documentation:

AttributeError: 'ResultSet' object has no attribute 'foo' - This usually happens because you expected find_all() to return a single tag or string. But find_all() returns a list of tags and strings–a ResultSet object. You need to iterate over the list and look at the .foo of each one. Or, if you really only want one result, you need to use find() instead of find_all().
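
To reproduce the behavior without hitting the network, here is a minimal sketch that runs against a small hand-written snippet. SAMPLE_HTML is an assumption modeled on the class name in the question; the real page's markup may differ.

```python
from bs4 import BeautifulSoup

# Hypothetical markup; only the "row-title" class comes from the question.
SAMPLE_HTML = """
<div class="row-title"><h2>UPTOWN FUNK!</h2></div>
<div class="row-title"><h2>THINKING OUT LOUD</h2></div>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# find() returns a single Tag (or None), so .h2 resolves to the child tag.
first_title = soup.find("div", {"class": "row-title"}).h2.get_text(strip=True)

# find_all() returns a ResultSet; asking it for .h2 raises AttributeError.
results = soup.find_all("div", {"class": "row-title"})
try:
    results.h2
    raised = False
except AttributeError:
    raised = True

# Iterating over the ResultSet and using .h2 on each Tag works as expected.
all_titles = [div.h2.get_text(strip=True) for div in results]

print(first_title, raised, all_titles)
```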

Answer 2

find_all always returns a list, so you can use ordinary list operations such as indexing on it.

For example:

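# Index into the list, then read the child <h2> tag.
# (Tag.get('h2') would look up an HTML attribute named "h2", not the child tag.)
songtitle = soup.find_all("div", {"class": "row-title"})[0]
print(songtitle.h2.get_text(strip=True))
songtitle = soup.find_all("div", {"class": "row-title"})[1]
print(songtitle.h2.get_text(strip=True))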

Output:

UPTOWN FUNK!
THINKING OUT LOUD

To print every title, loop over the results:

for item in soup.find_all("div", {"class": "row-title"}):
    songtitle = item.h2.get_text(strip=True)
    print(songtitle)
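
If some matching div might not contain an h2, item.h2 evaluates to None and calling a method on it fails. A defensive variant of the loop is sketched below; extract_titles is a hypothetical helper name, and the sample markup is an assumption, not the real page.

```python
from bs4 import BeautifulSoup


def extract_titles(html):
    """Return the text of every <h2> found inside a div with class "row-title"."""
    soup = BeautifulSoup(html, "html.parser")
    titles = []
    for item in soup.find_all("div", class_="row-title"):
        heading = item.h2  # None when the div has no <h2> descendant
        if heading is None:
            continue
        titles.append(heading.get_text(strip=True))
    return titles


# Hypothetical markup: the middle div deliberately lacks an <h2>.
sample = """
<div class="row-title"><h2>UPTOWN FUNK!</h2></div>
<div class="row-title"><span>no heading here</span></div>
<div class="row-title"><h2>THINKING OUT LOUD</h2></div>
"""

print(extract_titles(sample))
```

Skipping rows without a heading is one possible policy; raising an error instead may be preferable if a missing title indicates that the page layout has changed.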
