How do I create a CSV from a simple search with tweepy?
I'm trying to write a script that downloads Twitter search results as a .csv file, but my code has errors. Can anyone help?
import tweepy
import csv
import pandas as pd
####input your credentials here
consumer_key = ''
consumer_secret = ''
access_token = ''
access_token_secret = ''
auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token, access_token_secret)
api = tweepy.API(auth,wait_on_rate_limit=True)
#####United Airlines
# Open/Create a file to append data
csvFile = open('test.csv', 'a')
#Use csv Writer
csvWriter = csv.writer(csvFile)
for tweet in tweepy.Cursor(api.search,q="petya",count=100,
                           since="2017-04-03").items():
    print ("ID:", tweet.id)
    print ("User ID:", tweet.user.id)
    print ("Text:", tweet.text)
    print ("Created:", tweet.created_at)
    print ("Geo:", tweet.geo)
    print ("Contributors:", tweet.contributors)
    print ("Coordinates:", tweet.coordinates)
    print ("Favorited:", tweet.favorited)
    print ("In reply to screen name:", tweet.in_reply_to_screen_name)
    print ("In reply to status ID:", tweet.in_reply_to_status_id)
    print ("In reply to status ID str:", tweet.in_reply_to_status_id_str)
    print ("In reply to user ID:", tweet.in_reply_to_user_id)
    print ("In reply to user ID str:", tweet.in_reply_to_user_id_str)
    print ("Place:", tweet.place)
    print ("Retweeted:", tweet.retweeted)
    print ("Retweet count:", tweet.retweet_count)
    print ("Source:", tweet.source)
    print ("Truncated:", tweet.truncated)
    # Write a row to the CSV file. I use encode UTF-8
    csvWriter.writerow([tweet.created_at, tweet.user.id, tweet.id, tweet.geo, tweet.text, tweet.contributors, tweet.favorited, tweet.source, tweet.retweeted, tweet.in_reply_to_screen_name, eet.in_reply_to_status_id_str('utf-8')])
    print tweet.created_at, tweet.user.id, tweet.id, tweet.geo, tweet.text, tweet.contributors, tweet.favorited, tweet.source, tweet.retweeted, tweet.in_reply_to_screen_name, eet.in_reply_to_status_id_str
csvFile.close()
I think the problem is in the last part, where the csvWriter is. Maybe I'm putting too much text on one line? As I said, I'm new to this and need a lot of help.
I think the simplest solution is to use pandas (which, funnily enough, you imported but never used).
A working solution might look like this:
import tweepy
import pandas as pd
consumer_key = ''
consumer_secret = ''
access_token = ''
access_token_secret = ''
auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token, access_token_secret)
api = tweepy.API(auth,wait_on_rate_limit=True)
# create list to append tweets to
tweets = []
# append all tweet data to list
# note: tweepy 4.x renamed api.search to api.search_tweets
for tweet in tweepy.Cursor(api.search, q="petya", count=100,
                           since="2017-04-03").items():
    tweets.append(tweet)
# convert 'tweets' list to pandas.DataFrame (one row per tweet, one column per attribute)
tweets_df = pd.DataFrame([vars(tweet) for tweet in tweets])
# define file path (string) to save csv file to; placeholder, replace with your own path
FILE_PATH = '/path/to/file.csv'
# use pandas to save dataframe to csv
tweets_df.to_csv(FILE_PATH)
Boom, done!
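To see what the conversion step is doing without calling the Twitter API, here is a minimal sketch. The `fake_tweets` objects are hypothetical stand-ins that I made up for illustration; real tweepy results carry many more attributes.

```python
from types import SimpleNamespace

import pandas as pd

# Hypothetical stand-ins for the objects the Cursor loop would collect.
fake_tweets = [
    SimpleNamespace(id=1, text="first tweet", retweet_count=3),
    SimpleNamespace(id=2, text="second tweet", retweet_count=0),
]

# vars(obj) returns the object's attribute dictionary, so each object becomes one row
# and each attribute name becomes one column.
rows = [vars(t) for t in fake_tweets]
demo_df = pd.DataFrame(rows)

print(demo_df.columns.tolist())
print(len(demo_df))
```

With real tweepy objects, vars() will most likely also pick up internal and nested attributes (the user object, for example), so expect some columns you don't care about. That is what the subsetting step is for.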
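Note that if you only want a specific set of tweet attributes, you can put their names in a list and then subset the DataFrame with it.
For example (after the step that converts the tweets to a pandas.DataFrame):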
# define attributes you want
tweet_atts = [
    'text', 'created_at', 'favorite_count',
    'lang', 'retweet_count', 'source',
    'in_reply_to_user_id_str', 'retweeted',
    'id'
]
# subset dataframe (the list name must match the one defined above)
tweets_df = tweets_df[tweet_atts]
# save resulting df to csv
tweets_df.to_csv(FILE_PATH)
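One thing to watch: selecting with a list raises a KeyError if any of the listed columns is missing from the DataFrame. A minimal sketch of a more forgiving subset, assuming you would rather skip missing attributes than fail. `demo_df` and `wanted` are made-up names for illustration, and 'lang' is left out of the frame on purpose.

```python
import pandas as pd

# Hypothetical frame standing in for tweets_df; it has no 'lang' column.
demo_df = pd.DataFrame(
    {"id": [1, 2], "text": ["a", "b"], "retweet_count": [3, 0], "source": ["web", "app"]}
)

wanted = ["text", "lang", "retweet_count", "id"]

# Keep only the wanted columns that actually exist, preserving the wanted order.
present = [col for col in wanted if col in demo_df.columns]
subset_df = demo_df[present]

# With no path, to_csv returns the CSV text; index=False leaves out the row index.
csv_text = subset_df.to_csv(index=False)
print(csv_text)
```

Passing index=False is a choice, not a requirement: without it, pandas writes the row numbers as an extra first column.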
If you need more help, feel free to reply!
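If you would rather stay with the csv module from the question, the failing part looks fixable too. As far as I can tell, the writerow line refers to `eet` instead of `tweet` and then calls the string with 'utf-8' as if it were a function, and the bare `print` statement after it is Python 2 syntax. Below is a sketch under those assumptions. It uses made-up stand-in tweets and an in-memory buffer instead of the API and a real file, and only a few of the fields.

```python
import csv
import io
from types import SimpleNamespace

# Hypothetical stand-ins for the tweets the Cursor loop would yield.
fake_tweets = [
    SimpleNamespace(created_at="2017-06-28 10:00:00", id=1, text="petya, again",
                    in_reply_to_status_id_str=None),
    SimpleNamespace(created_at="2017-06-28 10:05:00", id=2, text='he said "patch now"',
                    in_reply_to_status_id_str="1"),
]

# With a real file this would be something like:
#     open('test.csv', 'a', newline='', encoding='utf-8')
buffer = io.StringIO(newline="")
writer = csv.writer(buffer)
for tweet in fake_tweets:
    # Pass the attribute values as they are; in Python 3 there is no need to encode
    # each string, because the encoding is set once when the file is opened.
    writer.writerow([tweet.created_at, tweet.id, tweet.text,
                     tweet.in_reply_to_status_id_str])

print(buffer.getvalue())
```

The amount of text on one line is not a problem: csv.writer quotes fields that contain commas or quotation marks, and writes None as an empty field.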