Analysing YouTube comments using Python -- parameter has disabled comments
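I'm trying to do text analysis on YouTube comments. I've been using the code from the following website to scrape YouTube:
https://www.pingshiuanchua.com/blog/post/using-youtube-api-to-analyse-youtube-comments-on-python
The script starts running, but one section of the code raises an error when a video has comments disabled. I can't find a way to check whether comments are disabled or whether any comments exist, so that a video with nothing to scrape is skipped and the script moves on to the next video.
The relevant block of code that produces the error is: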
# =============================================================================
# Get Comments of Top Videos
# =============================================================================
video_id_pop = []
channel_pop = []
video_title_pop = []
video_desc_pop = []
comments_pop = []
comment_id_pop = []
reply_count_pop = []
like_count_pop = []
from tqdm import tqdm
for i, video in enumerate(tqdm(video_id, ncols = 100)):
    response = service.commentThreads().list(
        part = 'snippet',
        videoId = video,
        maxResults = 100, # Only take top 100 comments...
        order = 'relevance', #... ranked on relevance
        textFormat = 'plainText',
        ).execute()
    comments_temp = []
    comment_id_temp = []
    reply_count_temp = []
    like_count_temp = []
    for item in response['items']:
        comments_temp.append(item['snippet']['topLevelComment']['snippet']['textDisplay'])
        comment_id_temp.append(item['snippet']['topLevelComment']['id'])
        reply_count_temp.append(item['snippet']['totalReplyCount'])
        like_count_temp.append(item['snippet']['topLevelComment']['snippet']['likeCount'])
    comments_pop.extend(comments_temp)
    comment_id_pop.extend(comment_id_temp)
    reply_count_pop.extend(reply_count_temp)
    like_count_pop.extend(like_count_temp)
    video_id_pop.extend([video_id[i]]*len(comments_temp))
    channel_pop.extend([channel[i]]*len(comments_temp))
    video_title_pop.extend([video_title[i]]*len(comments_temp))
    video_desc_pop.extend([video_desc[i]]*len(comments_temp))
query_pop = [query] * len(video_id_pop)
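Edited to add:
The person who wrote the code left a comment on how to fix the error:
"You can wrap the query part of the code in a try...except statement, and if the try statement (the query part) fails, you can append a blank response or an 'error' string to the list in the except block."
If that makes sense to anyone else, I have NFI how to actually do this...
Note: this isn't necessarily "good" coding style, but it's what I would do if I ran into this problem while writing a short-term script for my own personal use.
Python (and many other languages) has a way to catch exceptions and handle them without crashing. Used well, this can be a very good way of dealing with bad data.
https://docs.python.org/3.8/tutorial/errors.html is a good overview of exceptions. In general, they take a form like: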
try:
    code_that_can_error()
except ExceptionThatWillBeThrown as ex:
    handle_exception()
    print(ex) # ex is an object that has information about what went wrong
finally:
    clean_up()
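(finally is especially useful if you have something that needs to be closed, such as a file. If an exception is thrown, you might otherwise never close it, but finally is guaranteed to be called, even when an exception is thrown.)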
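As a concrete illustration of that pattern, here is a minimal, self-contained sketch. The function name `parse_count` and the sample inputs are made up for this example; they are not part of the YouTube script.

```python
def parse_count(raw):
    """Convert raw text to an int, returning None for bad data."""
    try:
        return int(raw)
    except ValueError as ex:
        # ex carries the details of what went wrong
        print(f"Skipping bad value {raw!r}: {ex}")
        return None
    finally:
        # Runs whether or not an exception was raised
        print(f"Finished handling {raw!r}")


results = [parse_count(raw) for raw in ["12", "n/a", "7"]]
print(results)
```

The bad value is reported and replaced with None, and the loop carries on with the remaining inputs instead of crashing.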
In your case, we just need to ignore the error and move on to the next video.
for i, video in enumerate(tqdm(video_id, ncols = 100)):
    try:
        response = service.commentThreads().list(
            part = 'snippet',
            videoId = video,
            maxResults = 100, # Only take top 100 comments...
            order = 'relevance', #... ranked on relevance
            textFormat = 'plainText',
            ).execute()
        comments_temp = []
        [...]
        video_desc_pop.extend([video_desc[i]]*len(comments_temp))
    except Exception as ex:
        # Something threw an error. Skip that video and move on.
        # Catching Exception (rather than a bare except) still lets Ctrl+C stop the script.
        print(f"{video} has comments disabled, or something else went wrong: {ex}")
query_pop = [query] * len(video_id_pop)
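The code author's suggestion (record a blank or "error" entry rather than silently skipping the video) could look roughly like the sketch below. To keep it runnable without API credentials, `fake_fetch_comments`, `CommentsDisabledError` and `collect_comments` are hypothetical stand-ins, not part of the original script or of any library. With google-api-python-client the exception to catch would most likely be `googleapiclient.errors.HttpError`, but that is an assumption worth checking against the traceback you actually get.

```python
class CommentsDisabledError(Exception):
    """Hypothetical stand-in for the error the real API call raises."""


def fake_fetch_comments(video):
    """Hypothetical stand-in for service.commentThreads().list(...).execute()."""
    fake_data = {
        "vid_a": ["great video", "thanks"],
        "vid_c": ["first"],
    }
    if video not in fake_data:
        raise CommentsDisabledError(f"comments disabled for {video}")
    return fake_data[video]


def collect_comments(video_ids, fetch=fake_fetch_comments):
    """Return parallel lists of video ids and comments, marking failures."""
    video_id_pop = []
    comments_pop = []
    for video in video_ids:
        try:
            # Only the call that can fail sits inside the try block
            comments_temp = fetch(video)
        except CommentsDisabledError as ex:
            # Keep one placeholder row so the failed video stays visible
            print(f"Skipping {video}: {ex}")
            comments_temp = ["error"]
        comments_pop.extend(comments_temp)
        video_id_pop.extend([video] * len(comments_temp))
    return video_id_pop, comments_pop


ids, comments = collect_comments(["vid_a", "vid_b", "vid_c"])
print(list(zip(ids, comments)))
```

Keeping the try block narrow means a bug in the list-building code is not mistaken for disabled comments, and the "error" rows can be filtered out before the text analysis.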