How to extract href content using BeautifulSoup in Python
import requests
from bs4 import BeautifulSoup
page = requests.get('http://espn.go.com/nba/team/roster/_/name/atl/atlanta-hawks')
soup = BeautifulSoup(page.content, "html.parser")
player_list = soup.find_all(class_="Image__Wrapper")
#player_list = soup.find_all("tr")
print(player_list[1])
The output I get is:
<div class="Image__Wrapper aspect-ratio--child"><img alt="https://a.espncdn.com/i/headshots/nba/players/full/3062667.png" class="" data-mptype="image" src="data:image/gif;base64,R0lGODlhAQABAIAAAAAAAP///yH5BAEAAAAALAAAAAABAAEAAAIBRAA7" title="DeAndre' Bembry"/></div>
I am only interested in DeAndre' Bembry. How can I extract it? I am also unsure how to get a list of all the player names.
You can try:
import requests
from bs4 import BeautifulSoup
page = requests.get('http://espn.go.com/nba/team/roster/_/name/atl/atlanta-hawks')
soup = BeautifulSoup(page.content, "html.parser")
player_list = soup.find_all(class_="Image__Wrapper")
#player_list = soup.find_all("tr")
print(player_list[1].img["title"])
Output:
DeAndre' Bembry
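To see why `.img["title"]` works without hitting the network, here is a minimal sketch that parses the markup from the question's output directly. The `src` data URI is omitted for brevity, and the variable names (`wrapper`, `img`, `name`, `missing`) are illustrative only.

```python
from bs4 import BeautifulSoup

# Markup copied from the question's output (src attribute omitted for brevity).
html = """<div class="Image__Wrapper aspect-ratio--child"><img alt="https://a.espncdn.com/i/headshots/nba/players/full/3062667.png" class="" data-mptype="image" title="DeAndre' Bembry"/></div>"""

soup = BeautifulSoup(html, "html.parser")
wrapper = soup.find(class_="Image__Wrapper")

img = wrapper.img            # first <img> inside the wrapper, or None if there is none
name = img["title"]          # subscript access raises KeyError if the attribute is absent
missing = img.get("href")    # .get() returns None instead of raising

print(name)
print(missing)
```

The same subscript or `.get()` access works for any attribute, including `href` on an `<a>` tag.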
And to print all the players:
print([i.img["title"] for i in player_list if 0 < i.img["title"].count(" ") <= 3])
Output:
["DeAndre' Bembry", 'Charlie Brown Jr.', 'Clint Capela', 'Vince Carter', 'John Collins', 'Dewayne Dedmon', 'Bruno Fernando', 'Brandon Goodwin', 'Treveon Graham', 'Kevin Huerter', "De'Andre Hunter", 'Damian Jones', 'Skal Labissiere', 'Cam Reddish', 'Jeff Teague', 'Trae Young']
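The space-count condition in the list comprehension appears to act as a heuristic that skips images whose title is not a person's name (for example a one-word title). A more defensive sketch of the same idea follows; the sample markup and the helper name `player_names` are hypothetical, and it also skips wrappers that have no `<img>` or no `title`, which would otherwise raise an error.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup standing in for the live roster page.
html = """
<div class="Image__Wrapper"><img title="Hawks"/></div>
<div class="Image__Wrapper"><img title="DeAndre' Bembry"/></div>
<div class="Image__Wrapper"><img title="Charlie Brown Jr."/></div>
<div class="Image__Wrapper"></div>
<div class="Image__Wrapper"><img alt="no title here"/></div>
"""

def player_names(markup):
    """Collect img titles that contain between one and three spaces."""
    soup = BeautifulSoup(markup, "html.parser")
    names = []
    for wrapper in soup.find_all(class_="Image__Wrapper"):
        img = wrapper.img
        if img is None:                  # wrapper without an <img>
            continue
        title = img.get("title", "")     # empty string when title is missing
        if 0 < title.count(" ") <= 3:    # same heuristic as the answer above
            names.append(title)
    return names

names = player_names(html)
print(names)
```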
Alternatively, use find_next:
player_list[1].find_next('img').get('title') # "DeAndre' Bembry"
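One difference worth knowing: `.img` only looks inside the tag, while `find_next('img')` keeps scanning forward through the rest of the document, so it can return an image that belongs to a later element. A small sketch with hypothetical markup shows the contrast:

```python
from bs4 import BeautifulSoup

# Hypothetical markup: the first wrapper has no <img> of its own.
html = """
<div class="Image__Wrapper"></div>
<div class="Image__Wrapper"><img title="Trae Young"/></div>
"""

soup = BeautifulSoup(html, "html.parser")
first = soup.find_all(class_="Image__Wrapper")[0]

child_img = first.img               # searches only inside the first wrapper
next_img = first.find_next("img")   # searches everything after the tag's start

print(child_img)
print(next_img.get("title"))
```

Here `child_img` is `None`, while `find_next` returns the image from the second wrapper, so `.img` is the safer choice when each result must come from its own wrapper.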