Using BeautifulSoup to Web Scrape Tags with CSS IDs
I am trying to web scrape this website to find the tag with id="advanced.2004" (which exists). Here are the three lines of code I tried.
webpage = requests.get('https://www.basketball-reference.com/players/j/jamesle01.html')
soup = BeautifulSoup(webpage.content, 'html.parser')
print(soup.find_all( attrs = {'id': 'advanced.2004'}))
Thanks in advance for your help!
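The problem is that the element you are looking for is inside an HTML comment, so BeautifulSoup treats it as comment text rather than as a tag. To work around this, iterate over every comment on the page, parse its content with BeautifulSoup, and search for the element you want: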
import requests
from bs4 import BeautifulSoup, Comment

url = 'https://www.basketball-reference.com/players/j/jamesle01.html'
webpage = requests.get(url)
soup = BeautifulSoup(webpage.content, 'html.parser')

# Parse each HTML comment as its own document and look for the id inside it.
el = None
for comment in soup.find_all(string=lambda node: isinstance(node, Comment)):
    comment_html = BeautifulSoup(comment, 'html.parser')
    el = comment_html.find(id='advanced.2004')
    if el is not None:
        break

print(el)
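If you want to check the idea without hitting the site, here is a minimal offline sketch. The `SAMPLE_HTML` string and the `find_in_comments` helper are made up for illustration and are not taken from the real page; they only assume that the target row sits inside an HTML comment, as described above.

```python
from bs4 import BeautifulSoup, Comment

# Hypothetical markup standing in for the real page: the table is commented out.
SAMPLE_HTML = """
<div id="all_advanced">
  <!--
  <table id="advanced">
    <tr id="advanced.2004"><td>2003-04</td></tr>
  </table>
  -->
</div>
"""


def find_in_comments(html, element_id):
    """Return the first element with this id found inside any HTML comment, or None."""
    soup = BeautifulSoup(html, 'html.parser')
    for comment in soup.find_all(string=lambda node: isinstance(node, Comment)):
        # Re-parse the comment text so its contents become real tags.
        found = BeautifulSoup(comment, 'html.parser').find(id=element_id)
        if found is not None:
            return found
    return None


# A direct search misses the row because it only exists as comment text.
direct = BeautifulSoup(SAMPLE_HTML, 'html.parser').find(id='advanced.2004')
print(direct)  # None

row = find_in_comments(SAMPLE_HTML, 'advanced.2004')
print(row)
```

Returning from a helper instead of breaking out of a loop avoids a leftover variable when the page has no comments at all, and it makes the "not found" case an explicit `None`.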