AttributeError: 'NoneType' object has no attribute when I try to get values out of a table

Question

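I'm trying to scrape a website with Python and get the values out of a table. Everything went fine until I wanted only the values (that is, without the HTML).

I tried to get the values from the fields with the following code: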

from bs4 import BeautifulSoup
from urllib.request import Request, urlopen
import requests

req = Request('https://www.formula1.com/en/results.html/2022/drivers.html', headers={'User-Agent': 'Mozilla/5.0'})
webpage = urlopen(req).read()


soup = BeautifulSoup(webpage,'html.parser')


drivers = soup.find('table',class_='resultsarchive-table').find_all('tr')

for driver in drivers:
    rank = driver.find('td', class_='dark')
    first = driver.find('span',class_='hide-for-tablet')
    last = driver.find('span',class_='hide-for-mobile')
    print (rank)

When I use .text or .get_text(), I get the error AttributeError: 'NoneType' object has no attribute, even though the code above does print values.

What am I doing wrong?

Answer 1

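The issue here is that you are also grabbing the table header row, which contains no <td> elements, so find() returns None for that row and calling .text on it fails. You can simply slice it off: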

# Skip the first row (the header), which has no <td> cells.
for driver in drivers[1:]:
    rank = driver.find('td', class_='dark').text
    first = driver.find('span', class_='hide-for-tablet').text
    last = driver.find('span', class_='hide-for-mobile').text
    print(rank)

Or select more specifically, for example with CSS selectors:

# Only match rows that actually contain at least one <td>.
drivers = soup.select('table.resultsarchive-table tr:has(td)')

for driver in drivers:
    rank = driver.find('td', class_='dark').text
    first = driver.find('span', class_='hide-for-tablet').text
    last = driver.find('span', class_='hide-for-mobile').text
    print(rank)
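
If you would rather not depend on where the header row sits, a third option is to check for None before reading .text. The sketch below is only an illustration: the HTML string is made-up sample markup that reuses the class names from the question (the real page may be structured differently), and the names html, rows, header_rank and results are hypothetical.

```python
from bs4 import BeautifulSoup

# Made-up sample markup, assumed to resemble the real results table.
html = """
<table class="resultsarchive-table">
  <tr><th>Pos</th><th>Driver</th></tr>
  <tr>
    <td class="dark">1</td>
    <td><span class="hide-for-tablet">Alice</span> <span class="hide-for-mobile">Alpha</span></td>
  </tr>
  <tr>
    <td class="dark">2</td>
    <td><span class="hide-for-tablet">Bob</span> <span class="hide-for-mobile">Beta</span></td>
  </tr>
</table>
"""

soup = BeautifulSoup(html, "html.parser")
rows = soup.find("table", class_="resultsarchive-table").find_all("tr")

# The header row only has <th> cells, so this lookup returns None.
header_rank = rows[0].find("td", class_="dark")

results = []
for row in rows:
    rank = row.find("td", class_="dark")
    first = row.find("span", class_="hide-for-tablet")
    last = row.find("span", class_="hide-for-mobile")
    # Skip any row where a lookup failed instead of calling .text on None.
    if rank is None or first is None or last is None:
        continue
    results.append((rank.text.strip(), first.text.strip(), last.text.strip()))

print(header_rank)
print(results)
```

This also shows why printing rank without .text appeared to work in the question: print(None) is perfectly valid, and the error only shows up once an attribute is read from the None returned for the header row.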

python

beautifulsoup

web-scraping

python-requests