Query Twitter Status by Using Python and Tweepy

I am trying to query a specific user's tweets that contain a given keyword in the tweet text. Here is my code:

# Import Tweepy, sleep, credentials.py
import tweepy
from time import sleep
from credentials import *

# Access and authorize our Twitter credentials from credentials.py
auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token, access_token_secret)
api = tweepy.API(auth)

SCREEN_NAME = "BachelorABC"
KEYWORD = "TheBachelor"

def twtr2():
    raw_tweets = tweepy.Cursor(api.search, q=KEYWORD, lang="en").items(50)
    for tweet in raw_tweets:
        if tweet['user']['screen_name'] == SCREEN_NAME:
            print tweet
twtr2()

I get the following error message:

Traceback (most recent call last):
  File "test2.py", line 19, in <module>
    twtr2()
  File "test2.py", line 17, in twtr2
    if tweet['user']['screen_name'] == SCREEN_NAME:
TypeError: 'Status' object has no attribute '__getitem__'

I searched Google a lot and thought that maybe I need to load Twitter's JSON into Python first, so I tried the following:

import tweepy, json
from time import sleep
from credentials import *

auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token, access_token_secret)
api = tweepy.API(auth)

SCREEN_NAME = "BachelorABC"
KEYWORD = "TheBachelor"

raw_tweets = tweepy.Cursor(api.search, q=KEYWORD, lang="en").items(50)
for tweet in raw_tweets:
    load_tweet = json.loads(tweet)
    if load_tweet['user']['screen_name'] == SCREEN_NAME:
        print tweet

However, the result was just as disappointing:

Traceback (most recent call last):
  File "test2.py", line 35, in <module>
    load_tweet = json.loads(tweet)
  File "C:\Python27\lib\json\__init__.py", line 339, in loads
    return _default_decoder.decode(s)
  File "C:\Python27\lib\json\decoder.py", line 364, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
TypeError: expected string or buffer

Does anyone know what is wrong with my code? Could you help me fix it?

Thanks in advance!

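The problem is with this line:

load_tweet = json.loads(tweet)

The "tweet" object is not a JSON string; it is a tweepy Status object, which is why json.loads() complains that it expected a string or buffer. If you want to work with JSON, see this post on how to use JSON objects with tweepy.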

To achieve your goal (printing each of the 50 tweets in your feed), I would follow the instructions in the getting started docs:

import tweepy

auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token, access_token_secret)

api = tweepy.API(auth)

public_tweets = api.home_timeline()
for tweet in public_tweets:
    print(tweet.text)
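
The loop above reads tweet.text as an attribute, and that is the key difference from the code in the question. Here is a minimal sketch of the idea that runs without Twitter credentials. FakeStatus and FakeUser are hypothetical stand-ins I made up for illustration, not tweepy classes; the _json attribute mirrors what tweepy, as far as I know, attaches to each Status (the already-decoded payload as a dict).

```python
import json


class FakeUser(object):
    """Hypothetical stand-in for the user object attached to a Status."""

    def __init__(self, screen_name):
        self.screen_name = screen_name


class FakeStatus(object):
    """Hypothetical stand-in for a tweepy Status: parsed fields live on
    attributes, and the decoded payload is kept as a dict in _json."""

    def __init__(self, payload):
        self._json = payload
        self.text = payload["text"]
        self.user = FakeUser(payload["user"]["screen_name"])


tweet = FakeStatus({
    "text": "Watching #TheBachelor",
    "user": {"screen_name": "BachelorABC"},
})

# Attribute access works on the object itself.
name_from_attr = tweet.user.screen_name

# Dictionary-style access works on the already-decoded payload.
name_from_dict = tweet._json["user"]["screen_name"]

# json.loads() is only needed for a JSON *string*, e.g. one made by json.dumps().
as_text = json.dumps(tweet._json)
name_from_text = json.loads(as_text)["user"]["screen_name"]

# Subscripting the object itself raises TypeError, as in the first traceback.
try:
    tweet["user"]
    subscript_failed = False
except TypeError:
    subscript_failed = True
```

So with a real Status you would either use tweet.user.screen_name directly, or index into tweet._json if you prefer dictionaries; calling json.loads() on the object itself is never needed.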

I figured it out. Here is the solution:

# Import Tweepy, sleep, credentials.py
import tweepy
from time import sleep
from credentials import *

# Access and authorize our Twitter credentials from credentials.py
auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token, access_token_secret)
api = tweepy.API(auth)

SCREEN_NAME = "BachelorABC"
KEYWORD = "TheBachelor"
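# Search by keyword, then keep only tweets sent by SCREEN_NAME.
for tweet in tweepy.Cursor(api.search, q=KEYWORD, lang="en").items(200):
    # Status fields are attributes, not dictionary keys.
    if tweet.user.screen_name == SCREEN_NAME:
        print(tweet.text)
        print(tweet.user.screen_name)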

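Note that this is not an efficient way to find tweets that satisfy both conditions (screen_name and keyword), because we query by keyword first and only then filter by screen_name. If the keyword is very popular, like the "TheBachelor" I use here, and the number of tweets is capped (200), we may find that none of the 200 tweets was sent by the specified screen_name. I suspect that querying by screen_name first and then filtering by keyword would give better results, but that is beyond the scope of this question.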
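
For what it's worth, here is a rough sketch of the "screen_name first" idea, written so it runs without credentials. build_query and filter_by_keyword are hypothetical helper names of my own, and the Tweet namedtuple is only a stand-in for a Status. The first helper relies on Twitter's from: search operator; the second assumes each item exposes a .text attribute. I have not run either against the live API.

```python
from collections import namedtuple


def build_query(keyword, screen_name):
    """Combine a keyword with Twitter's from: operator in one search string."""
    return "{0} from:{1}".format(keyword, screen_name)


def filter_by_keyword(tweets, keyword):
    """Keep only tweets whose text contains the keyword (case-insensitive)."""
    needle = keyword.lower()
    return [t for t in tweets if needle in t.text.lower()]


# Stand-in data so the sketch runs offline; real items would be Status objects.
Tweet = namedtuple("Tweet", ["text"])
timeline = [
    Tweet("Tonight on #TheBachelor: the finale"),
    Tweet("Behind the scenes photos"),
]

query = build_query("TheBachelor", "BachelorABC")
matches = filter_by_keyword(timeline, "TheBachelor")

# Assumed usage with tweepy (untested here):
#   tweepy.Cursor(api.search, q=query, lang="en").items(50)
#   filter_by_keyword(
#       tweepy.Cursor(api.user_timeline, screen_name=SCREEN_NAME).items(200),
#       KEYWORD)
```

Either way the filtering by user happens on Twitter's side, so the 200-tweet limit is spent only on tweets from the account you care about.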

I'll leave it at that.