Scraping a YouTube channel's videos from the last day - bs4
I'm trying to write a scraper that returns the videos a particular YouTube channel uploaded on a particular date, using bs4 and requests.
Here is the code:
import requests
from bs4 import BeautifulSoup as bs

# Fetch the channel's video listing page
all_videos = requests.get('https://www.youtube.com/channel/UC16niRr50-MSBwiO3YDb3RA/videos')
soup = bs(all_videos.text, 'html.parser')

# Each video title sits in an <h3 class="yt-lockup-title"> element
for video in soup.find_all('h3', 'yt-lockup-title'):
    print(video)
The output is:
<h3 class="yt-lockup-title"><a aria-describedby="description-id-721031" class="yt-uix-sessionlink yt-uix-tile-link spf-link yt-ui-ellipsis yt-ui-ellipsis-2" data-sessionlink="ei=d1R3Xu7bOfTysAKmsY2IDg&feature=c4-videos-u" dir="ltr" href="/watch?v=ejzQApmABdM" rel="nofollow" title="Coronavirus: People in Beijing begin to head outdoors - BBC News">Coronavirus: People in Beijing begin to head outdoors - BBC News</a><span class="accessible-description" id="description-id-721031"> - Duration: 3 minutes, 8 seconds.</span></h3>
How can I extract the title, link, and upload date from this?
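Try this inside the loop, where video is one of the h3 elements:
title = video.a['title']
link = video.a['href']
Or:
video.a.attrs
This gives you a dictionary of all the anchor's attributes, including title and href. Note that attrs is a plain dictionary, not a method, so it is accessed without parentheses. The h3 shown above does not contain the upload date, so that has to be read from another element on the page.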
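As a rough sketch of the attribute lookup described in this answer, the snippet below parses an abridged copy of the h3 markup from the question instead of fetching the live page, so it runs offline. The helper name extract_videos, the SAMPLE_HTML constant, and the base_url default are illustrative assumptions, not part of the original post, and the sketch assumes the page still uses the yt-lockup-title markup shown in the question.

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup

# Abridged copy of the <h3> element printed in the question (assumed markup).
SAMPLE_HTML = (
    '<h3 class="yt-lockup-title">'
    '<a class="yt-uix-tile-link" href="/watch?v=ejzQApmABdM" '
    'title="Coronavirus: People in Beijing begin to head outdoors - BBC News">'
    'Coronavirus: People in Beijing begin to head outdoors - BBC News</a>'
    '<span class="accessible-description"> - Duration: 3 minutes, 8 seconds.</span>'
    '</h3>'
)


def extract_videos(html, base_url="https://www.youtube.com"):
    """Return a list of {'title', 'link'} dicts, one per title heading.

    Hypothetical helper: the name and the base_url default are assumptions.
    """
    soup = BeautifulSoup(html, "html.parser")
    videos = []
    for heading in soup.find_all("h3", class_="yt-lockup-title"):
        # The first anchor carrying an href holds both the title and the link.
        anchor = heading.find("a", href=True)
        if anchor is None:
            continue  # skip headings without a link
        videos.append({
            # Prefer the title attribute, fall back to the visible link text.
            "title": anchor.get("title") or anchor.get_text(strip=True),
            # href is relative ("/watch?v=..."), so resolve it against the site root.
            "link": urljoin(base_url, anchor["href"]),
        })
    return videos


if __name__ == "__main__":
    for video in extract_videos(SAMPLE_HTML):
        print(video["title"], video["link"])
```

To use it on the real page, pass all_videos.text from the question's code to extract_videos. The upload date is not part of this h3, so filtering by day would need a separate lookup that this sketch does not attempt.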