Web scraping YouTube pages
I am trying to scrape a YouTube channel's name from its link, but I get this error:
title = response.find_all('div', class_= "style-scope ytd-channel-name")
AttributeError: 'Response' object has no attribute 'find_all'
Link to the site: https://www.youtube.com/channel/UCHOgE8XeaCjlgvH0t01fVZg
Code:
url = 'https://www.youtube.com/channel/UCHOgE8XeaCjlgvH0t01fVZg'
response = requests.get(url)
title = response.find_all('div', class_= "style-scope ytd-channel-name")
soup = BeautifulSoup(title.text, 'lxml')
print(soup)
Thanks!
The following code returns a div:
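import requests
from bs4 import BeautifulSoup

url = "https://www.youtube.com/channel/UCHOgE8XeaCjlgvH0t01fVZg"
req = requests.get(url)
# parse the downloaded HTML first, then query the soup object
soup = BeautifulSoup(req.text, "html.parser")
print(soup.div)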
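You can change what is returned by changing the attribute after 'soup.' (for example, soup.title).
I have linked the documentation, since I think you may want to look at it too:
https://www.crummy.com/software/BeautifulSoup/bs4/doc/#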
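For context, the AttributeError in the question comes from calling find_all on the Response object that requests.get returns. find_all is a BeautifulSoup method, so the HTML has to be parsed before it is searched. Below is a minimal sketch of that order of operations. It uses a made-up HTML string (sample_html) in place of response.text so that it runs without network access; the markup is an assumption for illustration and is not what YouTube actually serves.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for response.text; not real YouTube markup.
sample_html = """
<html><head><title>Example Channel - YouTube</title></head>
<body>
  <div class="style-scope ytd-channel-name">Example Channel</div>
</body></html>
"""

# Step 1: build the soup from the HTML text.
soup = BeautifulSoup(sample_html, "html.parser")

# Step 2: search the soup (not the Response) for matching elements.
divs = soup.find_all("div", class_="style-scope ytd-channel-name")
names = [d.get_text(strip=True) for d in divs]

# The <title> tag is reachable the same way as soup.div.
page_title = soup.title.string

print(names)
print(page_title)
```

With a real page, sample_html would be replaced by response.text. Whether the div is present in the raw HTML depends on how much of the page is built by JavaScript.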
You can use this, which runs the page's JavaScript before parsing:
from requests_html import HTMLSession
from bs4 import BeautifulSoup as bs # importing BeautifulSoup
video_url = "https://www.youtube.com/channel/UCHOgE8XeaCjlgvH0t01fVZg"
# init an HTML Session
session = HTMLSession()
# get the html content
response = session.get(video_url)
# execute the page's JavaScript
response.html.render(sleep=1)
# create bs object to parse HTML
soup = bs(response.html.html, "html.parser")
name = soup.find('yt-formatted-string', class_='style-scope ytd-channel-name')
print(name.text)
Output:
TheTekkitRealm
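One caveat with the snippet above: soup.find returns None when nothing matches (for example, if the page did not finish rendering), and name.text then raises an AttributeError. A small guard avoids that. The sketch below is only an illustration: channel_name is a hypothetical helper, and both HTML strings are made-up samples rather than real YouTube output.

```python
from bs4 import BeautifulSoup

def channel_name(html):
    """Return the channel name text, or None if the element is absent."""
    soup = BeautifulSoup(html, "html.parser")
    # Matching on a single class finds any element that carries it.
    tag = soup.find("yt-formatted-string", class_="ytd-channel-name")
    return tag.get_text(strip=True) if tag is not None else None

# Hypothetical samples: one with the rendered element, one without it.
rendered = (
    '<yt-formatted-string class="style-scope ytd-channel-name">'
    "Example Channel</yt-formatted-string>"
)
unrendered = "<html><body><script>var data = {};</script></body></html>"

found = channel_name(rendered)
missing = channel_name(unrendered)
print(found)
print(missing)
```

In the real script, response.html.html would be passed to the helper, and a None result could be treated as a signal to retry the render with a longer sleep.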