How to check for image use in tweets in Tweepy
I have written code to pull tweets from a list of users [handles]. I am writing the information to a .txt file named "results".
with open("results", "w") as fp:
    for handle in handles:
        print("Analyzing tweets from " + handle + "...")
        user = api.get_user(id=handle)
        fp.write("Handle: " + handle + "\n")
        fp.write("Name: " + user.name + "\n")
        fp.write("Description: " + str(user.description.encode(sys.stdout.encoding, errors='replace')) + "\n")
        fp.write("Followers: " + str(user.followers_count) + "\n")
        fp.write("Following: " + str(user.friends_count) + "\n")
        tweet_counter = 0
        prosocial_tweets_count = 0
        regular_tweets_count = 0
        all_tweets = []
        social_tweets_len = []
        regular_tweets_len = []
        social_tweets_valence = []
        regular_tweets_valence = []
        regular_attachments = 0
        social_attachments = 0
        for tweet in tweepy.Cursor(api.user_timeline, id=user.id).items():
            #control for timeline
            dt = tweet.created_at
            if dt > date_until:
                continue
            if dt < date_from:
                break # XXX: I hope it's OK to break here
            if include_retweets == "no" and tweet.text.startswith("RT"):
                continue
            if include_replies == "no" and tweet.in_reply_to_user_id:
                continue
            tweet_counter += 1
            for word in vocabulary:
                if word in tweet.text.lower():
                    #increase count of pro social tweets
                    prosocial_tweets_count += 1
                    #clean the tweet for valence analysis
                    clean = TextBlob(tweet.text.lower())
                    #calculate valence
                    valence = clean.sentiment.polarity
                    #append the valence to a list
                    social_tweets_valence.append(valence)
                    #append the length of the tweet to a list
                    social_tweets_len.append(len(tweet.text))
                    #check if there is an attachment
                    counting = tweet.text.lower()
                    counting_attachments = counting.count(" https://t.co/")
                    social_attachments = social_attachments + counting_attachments
                    #write date
                    fp.write(" * " + str(dt) + "\n")
                    #write the tweet
                    fp.write(" " + str(tweet.text.encode(sys.stdout.encoding, errors='replace')) + "\n")
                    #write the length of the tweet
                    fp.write(" Length of tweet " + str(len(tweet.text)) + "\n")
                    #write the valence of the tweet
                    fp.write(" Tweet valance " + str(valence) + "\n")
                    #write the retweets of the tweet
                    fp.write(" Retweets count: " + str(tweet.retweet_count) + "\n")
                    #write the likes of the tweet
                    fp.write(" Likes count: " + str(tweet.favorite_count) + "\n")
                    # Report each tweet only once whenever it contains more than one prosocial words
                    break
            else:
                #this code runs if the tweet is not prosocial
                regular_tweets_count += 1
                clean = TextBlob(tweet.text.lower())
                valence = clean.sentiment.polarity
                counting = tweet.text.lower()
                counting_attachments = counting.count(" https://t.co/")
                regular_attachments = regular_attachments + counting_attachments
                regular_tweets_valence.append(valence)
                regular_tweets_len.append(len(tweet.text))
        attachments = regular_attachments + social_attachments
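I was wondering if anyone knows a good way to check whether a tweet contains an image or a video. I would also like to build a list of the average number of images and videos used per user.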
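Data from the Twitter API arrives as JSON, with everything about a given tweet id exposed as fields and values. So if you only want to check whether an image is present, a simple conditional statement is enough: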
# `image` is a placeholder for a flag you derive from the tweet's JSON
if image:
    print('yes')
else:
    print('no')
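The check above presumes some flag saying whether an image is present. Here is a minimal sketch of how that flag could be derived from the JSON payload, assuming the v1.1 layout in which attached media are listed under `entities` → `media`. The helper name `has_media` and both sample payloads are made up for illustration and are not real API output.

```python
def has_media(tweet_json):
    """Return True if the tweet payload lists at least one media item."""
    entities = tweet_json.get("entities", {})
    # The "media" key is simply missing when nothing is attached.
    return bool(entities.get("media"))


# Hand-written sample payloads (illustrative only, not real API output).
with_photo = {
    "text": "Look at this https://t.co/abc",
    "entities": {"media": [{"type": "photo"}]},
}
text_only = {"text": "Just words", "entities": {"hashtags": [], "urls": []}}

print("yes" if has_media(with_photo) else "no")
print("yes" if has_media(text_only) else "no")
```

With Tweepy, the raw dictionary for a `Status` object should be available as `tweet._json`, so `has_media(tweet._json)` would be the equivalent call.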
If you look at this thread, you will see that all media attached to a tweet are actually stored in tweet.entities['media'].
So, to find out whether a given tweet (a tweepy.models.Status object, the format Tweepy uses) contains a picture, you can try this:
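try:
    # True if at least one attached medium is a photo
    print(any(medium['type'] == 'photo' for medium in tweet.entities['media']))
except KeyError:
    # tweet.entities has no 'media' key when nothing is attached
    print("No picture in this tweet")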
Hope this helps.
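For the second part of the question (average image and video use per user), something along these lines might work. It is only a sketch under a few assumptions: as far as I know, in the v1.1 payload `entities['media']` reports photos only, while videos, GIFs and multi-photo tweets show up in full under `extended_entities`, which is present only when media is attached. The helper names `media_types` and `average_media_per_tweet` are made up, and the `SimpleNamespace` objects stand in for real `tweepy.models.Status` objects so the example runs without API credentials.

```python
from collections import Counter
from types import SimpleNamespace


def media_types(tweet):
    """Return the media types attached to a tweet, e.g. ['photo', 'video']."""
    # extended_entities may be absent entirely, hence getattr with a default.
    extended = getattr(tweet, "extended_entities", None) or {}
    media = extended.get("media")
    if media is None:
        # Fall back to entities when extended_entities is not there.
        media = (getattr(tweet, "entities", None) or {}).get("media", [])
    return [m["type"] for m in media]


def average_media_per_tweet(tweets):
    """Return the average number of photos and videos per tweet."""
    counts = Counter()
    total = 0
    for tweet in tweets:
        total += 1
        counts.update(media_types(tweet))
    if total == 0:
        return {"photo": 0.0, "video": 0.0}
    return {"photo": counts["photo"] / total, "video": counts["video"] / total}


# Stand-in objects for illustration; real code would pass Status objects.
sample = [
    SimpleNamespace(entities={"media": [{"type": "photo"}]},
                    extended_entities={"media": [{"type": "photo"}, {"type": "photo"}]}),
    SimpleNamespace(entities={}, extended_entities={"media": [{"type": "video"}]}),
    SimpleNamespace(entities={"hashtags": []}),
    SimpleNamespace(entities={}),
]
averages = average_media_per_tweet(sample)
print(averages)
```

In the question's loop you could collect the tweets that pass the date and retweet filters into a list for each handle and store the result in a dictionary keyed by handle. Note that counting " https://t.co/" in the text counts every shortened link, not only attached media, so the two numbers will generally differ.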