Trying to scrape a webpage with multiple data tables, but only the first table is extracted?

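I am trying to scrape basketball player data from Basketball-Reference for a project I am working on. On B-R, a player page has multiple tables of data, and I want to grab all of them. However, when I try to get the tables from the page, it only gives me the first instance of a table tag, i.e. only the first table.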

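I searched the HTML and found that, apart from the first instance of a table tag, every table tag sits inside a comment block. When I parse their parent tags and try to search for the child tags that contain the table information, it returns nothing. Here is a link to an example page, and this is my code: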

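import requests
from bs4 import BeautifulSoup

url = 'https://www.basketball-reference.com/players/j/jamesle01.html'
get = requests.get(url)
soup = BeautifulSoup(get.text, 'html.parser')

# Look up the wrapper element by id, then search inside it for a <table> tag
per_36 = soup.find(id='all_per_minute')
table = per_36.find('table')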

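This returns nothing; however, if I look for the first table, it returns its contents. I don't understand what is going on, but I think it may have something to do with those comment blocks?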
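
The suspicion about the comment blocks can be checked offline on a small sample. The sketch below uses a made-up miniature of the page layout (the `html` string and the `all_totals` / `totals` ids are assumptions for illustration, not the real page source) and shows that a table wrapped in an HTML comment is not parsed into tags, yet is still present as a `Comment` text node:

```python
from bs4 import BeautifulSoup, Comment

# Hypothetical miniature of the page layout: one visible table,
# and one table wrapped in an HTML comment.
html = """
<div id="all_totals"><table id="totals"><tr><td>1</td></tr></table></div>
<div id="all_per_minute">
  <div class="placeholder"></div>
  <!-- <table id="per_minute"><tr><td>2</td></tr></table> -->
</div>
"""

soup = BeautifulSoup(html, 'html.parser')
wrapper = soup.find(id='all_per_minute')

# The commented-out markup is not parsed into tags, so no <table> is found.
print(wrapper.find('table'))  # None

# The markup is still there, stored as a Comment text node inside the wrapper.
comment = wrapper.find(string=lambda text: isinstance(text, Comment))
print('<table' in comment)  # True
```

So the table markup has to be pulled out of the comment and parsed a second time before it can be searched.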

To scrape the contents of comments with BeautifulSoup, you can use this script:

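import requests
from bs4 import BeautifulSoup, Comment

url = 'https://www.basketball-reference.com/players/j/jamesle01.html'
get = requests.get(url)
soup = BeautifulSoup(get.text, 'html.parser')

# Find the placeholder inside the wrapper, then the first HTML comment after it
pl = soup.select_one('#all_per_minute .placeholder')
comments = pl.find_next(string=lambda text: isinstance(text, Comment))

# Parse the comment's text as HTML so the table inside it becomes searchable
soup = BeautifulSoup(comments, 'html.parser')

# Collect the text of every cell, one list per table row
rows = []
for tr in soup.select('tr'):
    rows.append([td.get_text(strip=True) for td in tr.select('td, th')])

# Print each row with every cell centered in a 7-character column
for row in rows:
    print(''.join('{: ^7}'.format(td) for td in row))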

Prints:

Season   Age    Tm     Lg     Pos     G     GS     MP     FG     FGA    FG%    3P     3PA    3P%    2P     2PA    2P%    FT     FTA    FT%    ORB    DRB    TRB    AST    STL    BLK    TOV    PF     PTS  
2003-04  19     CLE    NBA    SG     79     79    3122    7.2   17.2   .417    0.7    2.5   .290    6.4   14.7   .438    4.0    5.3   .754    1.1    3.8    5.0    5.4    1.5    0.7    3.1    1.7   19.1  
2004-05  20     CLE    NBA    SF     80     80    3388    8.4   17.9   .472    1.1    3.3   .351    7.3   14.6   .499    5.1    6.8   .750    1.2    5.1    6.2    6.1    1.9    0.6    2.8    1.6   23.1  
2005-06  21     CLE    NBA    SF     79     79    3361    9.4   19.5   .480    1.4    4.1   .335    8.0   15.5   .518    6.4    8.7   .738    0.8    5.2    6.0    5.6    1.3    0.7    2.8    1.9   26.5  
2006-07  22     CLE    NBA    SF     78     78    3190    8.7   18.3   .476    1.1    3.5   .319    7.6   14.8   .513    5.5    7.9   .698    0.9    5.0    5.9    5.3    1.4    0.6    2.8    1.9   24.1  
2007-08  23     CLE    NBA    SF     75     74    3027    9.4   19.5   .484    1.3    4.3   .315    8.1   15.3   .531    6.5    9.2   .712    1.6    5.5    7.0    6.4    1.6    1.0    3.0    2.0   26.8  
2008-09  24     CLE    NBA    SF     81     81    3054    9.3   19.0   .489    1.6    4.5   .344    7.7   14.5   .535    7.0    9.0   .780    1.2    6.0    7.2    6.9    1.6    1.1    2.8    1.6   27.2  
2009-10  25     CLE    NBA    SF     76     76    2966    9.3   18.5   .503    1.6    4.7   .333    7.8   13.8   .560    7.2    9.4   .767    0.9    5.9    6.7    7.9    1.5    0.9    3.2    1.4   27.4  
2010-11  26     MIA    NBA    SF     79     79    3063    8.9   17.5   .510    1.1    3.3   .330    7.8   14.2   .552    5.9    7.8   .759    0.9    6.0    6.9    6.5    1.5    0.6    3.3    1.9   24.8  
2011-12  27     MIA    NBA    SF     62     62    2326    9.6   18.1   .531    0.8    2.3   .362    8.8   15.8   .556    6.0    7.8   .771    1.5    6.2    7.6    6.0    1.8    0.8    3.3    1.5   26.0  

...and so on.
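
Since the question asks for every table on the page, the same idea can be extended to all comments at once. This is only a sketch: the helper name `extract_commented_tables` and the inline `sample` markup are assumptions made for illustration, and it runs on a small offline string rather than on the live page.

```python
from bs4 import BeautifulSoup, Comment


def extract_commented_tables(html):
    """Return {table id: rows} for every table found inside an HTML comment."""
    soup = BeautifulSoup(html, 'html.parser')
    tables = {}
    for comment in soup.find_all(string=lambda text: isinstance(text, Comment)):
        # Skip ordinary comments that do not carry table markup.
        if '<table' not in comment:
            continue
        # Re-parse the comment text so its markup becomes real tags.
        inner = BeautifulSoup(comment, 'html.parser')
        for table in inner.find_all('table'):
            rows = [
                [cell.get_text(strip=True) for cell in tr.find_all(['td', 'th'])]
                for tr in table.find_all('tr')
            ]
            # Fall back to a generated key when a table has no id attribute.
            key = table.get('id') or 'table_{}'.format(len(tables))
            tables[key] = rows
    return tables


# Hypothetical sample: one visible table, one commented-out table,
# and one comment that contains no table.
sample = """
<div id="all_per_game"><table id="per_game"><tr><th>Season</th><th>PTS</th></tr></table></div>
<div id="all_per_minute"><!--
<table id="per_minute">
<tr><th>Season</th><th>PTS</th></tr>
<tr><th>2003-04</th><td>19.1</td></tr>
</table>
--></div>
<!-- a plain comment without a table -->
"""

result = extract_commented_tables(sample)
print(result)  # {'per_minute': [['Season', 'PTS'], ['2003-04', '19.1']]}
```

Tables that are already visible (like `per_game` in the sample) are deliberately left out here; they can be read from the original soup with an ordinary `find_all('table')`.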