Format Tweets just like the Twitter Archive using R

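I'd like to know how to format a CSV file the way the Twitter archive does, so that R has no trouble reading it (when I ran it, I hit a bunch of problems with no solution). The Twitter archive is a user timeline, whereas my CSV (on which I will run sentiment analysis in R) is a set of search results containing tweets.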

Twitter archive sample

"tweet_id","in_reply_to_status_id","in_reply_to_user_id","timestamp","source","text","retweeted_status_id","retweeted_status_user_id","retweeted_status_timestamp","expanded_urls"
"81423594213695488","","","2016-12-29 14:18:08 +0000","<a href=""http://twitter.com/download/android"" rel=""nofollow"">Twitter for Android</a>","RT @SwiftOnSecurity: We're going to tell kids that laptops used to store data on tiny mirrors spinning @ 7200rpm and they're going to think…","814187405175570432","2436389418","2016-12-28 19:12:58 +0000",""
"876926582348550143","","","2016-12-22 13:29:16 +0000","<a href=""http://twitter.com/download/android"" rel=""nofollow"">Twitter for Android</a>","RT @MKBHD: Shout-out to everyone going home and becoming family tech support for the holidays","811910809521680384","29873662","2016-12-22 12:26:36 +0000",""

What I have so far

"text"
b'RT @notCORYGREGORY: when hillary uses a private email server asking how to print recipes vs when trump takes healthcare from 20+ million am\xe2\x80\xa6'
b'RT @Salon: Germany is giving up on President Trump'

How I produced it in Python:

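import csv
import tweepy

# `api` is assumed to be an authenticated tweepy.API instance created elsewhere.
csvFile = open('tweets.csv', 'a')
csvWriter = csv.writer(csvFile, delimiter=',')

for tweet in tweepy.Cursor(api.search,
                           q="trump",
                           rpp=100,
                           result_type="recent",
                           include_entities=True,
                           lang="en").items(5):
    print(tweet.text)
    # On Python 3, writing encoded bytes puts the literal b'...' text in the CSV.
    csvWriter.writerow([tweet.text.encode('utf-8')])

csvFile.close()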

I am open to solutions in R.

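I don't fully understand your question, but you may want to look at the twitteR library in R, especially the function "twListToDF". Combined with write.csv, it lets you save the collected tweets as a properly formatted CSV that R can read back in.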

write.csv(twListToDF(your_tweets), file="your_tweets.csv")
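If you would rather stay with the Python script, the following is a minimal sketch of the same idea, not a definitive fix. It assumes the column order from the archive sample in the question, quotes every field the way the archive does, and writes text as str rather than bytes. The helper names repair_bytes_literal and write_archive_style are hypothetical and only for illustration; repair_bytes_literal is meant for cells that were already saved as b'...' literals.

```python
import ast
import csv

# Column order copied from the Twitter archive sample in the question.
ARCHIVE_COLUMNS = [
    "tweet_id", "in_reply_to_status_id", "in_reply_to_user_id", "timestamp",
    "source", "text", "retweeted_status_id", "retweeted_status_user_id",
    "retweeted_status_timestamp", "expanded_urls",
]


def repair_bytes_literal(cell):
    """Turn a cell such as "b'RT ...\\xe2\\x80\\xa6'" back into readable text.

    Cells that do not look like a Python bytes literal are returned unchanged.
    """
    if cell[:2] in ("b'", 'b"'):
        try:
            value = ast.literal_eval(cell)
        except (ValueError, SyntaxError):
            return cell  # not a valid literal; leave it alone
        if isinstance(value, bytes):
            return value.decode("utf-8", errors="replace")
    return cell


def write_archive_style(rows, out):
    """Write dicts to `out` as a fully quoted CSV with the archive's header.

    Columns missing from a row are written as empty quoted fields.
    """
    writer = csv.DictWriter(
        out,
        fieldnames=ARCHIVE_COLUMNS,
        quoting=csv.QUOTE_ALL,   # quote every field, like the archive export
        restval="",              # fill missing columns with ""
        extrasaction="ignore",   # drop keys that are not archive columns
        lineterminator="\n",
    )
    writer.writeheader()
    for row in rows:
        writer.writerow(row)
```

To use it, open the output with open('tweets.csv', 'w', newline='', encoding='utf-8') and pass rows such as {"tweet_id": str(tweet.id), "text": tweet.text}; the dictionary keys here are an assumption about which fields you want to keep.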