Python: How to get all the replies to Tweets from a Twitter account?
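I am retrieving all the tweets I need from a Twitter account. That is more than 200 tweets; for example 500, 600, ...
I am using the Tweepy library to help me do this with Python, and I have created this object to do it.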
from rrss.twitter_connection import TwitterConnection
import tweepy


class Tweets:

    def __init__(self):
        self.all_tweets = []  # List of tweets
        self.__total_tweets = None
        self.__screen_name = None
        self.__replies = None

    def __del__(self):
        del self.all_tweets
        del self.screen_name
        del self.total_tweets
        del self.replies

    @property
    def screen_name(self):  # Screen name of the Twitter account whose tweets we are going to retrieve
        return self.__screen_name

    @screen_name.setter
    def screen_name(self, screen_name):
        self.__screen_name = screen_name

    @screen_name.deleter
    def screen_name(self):
        del self.__screen_name

    @property
    def total_tweets(self):  # Total number of tweets to be returned
        return self.__total_tweets

    @total_tweets.setter
    def total_tweets(self, total):
        self.__total_tweets = total

    @total_tweets.deleter
    def total_tweets(self):
        del self.__total_tweets

    @property
    def replies(self):
        return self.__replies

    @replies.setter
    def replies(self, replies):
        self.__replies = replies

    @replies.deleter
    def replies(self):
        del self.__replies

    @staticmethod
    def __get_tweets(total, screen_name, oldest_id=None):
        """
        :param total: Number of tweets to return
        :param screen_name: Twitter account
        :param oldest_id: The id of the oldest tweet retrieved so far
        :return: A list with at most `total` tweets from the Twitter account related to screen_name
        """
        api = TwitterConnection().api
        if oldest_id is None:
            tweets = api.user_timeline(screen_name=screen_name, count=total, include_rts=False, tweet_mode="extended")
        else:
            # max_id is inclusive, so subtract 1 to avoid fetching the oldest tweet again
            tweets = api.user_timeline(screen_name=screen_name, count=total, include_rts=False, max_id=oldest_id - 1, tweet_mode="extended")
        return tweets

    def get_tweets(self, total, screen_name):
        """
        Public method to get a total number of tweets from a screen name
        :param total: Total of tweets to retrieve from a screen name
        :param screen_name: Twitter account
        :return: Update self.all_tweets with all the tweets retrieved
        """
        self.screen_name = screen_name
        if total <= 200:
            self.all_tweets = Tweets.__get_tweets(total, screen_name)
        else:
            # user_timeline returns at most 200 tweets per call, so page backwards in blocks
            counter = 200
            self.all_tweets = Tweets.__get_tweets(counter, screen_name)
            oldest_id = self.all_tweets[-1].id
            while len(self.all_tweets) < total:
                total_block_tweets = 200 if total - counter > 200 else total - counter
                tweets = Tweets.__get_tweets(total_block_tweets, screen_name, oldest_id)
                if len(tweets) > 0:
                    self.all_tweets.extend(tweets)
                    oldest_id = self.all_tweets[-1].id
                    counter = len(self.all_tweets)
                else:
                    break

    def get_replies(self, tweet_id):
        api = TwitterConnection().api
        # Search for tweets addressed to the account that are newer than tweet_id
        self.replies = tweepy.Cursor(api.search, q='to:{}'.format(self.screen_name), since_id=tweet_id, tweet_mode='extended').items()

    def search_replies_to_tweet(self, tweet_id):
        while True:
            try:
                reply = next(self.replies)
                print(reply.in_reply_to_status_id)
                if reply.in_reply_to_status_id == tweet_id:
                    print("reply of tweet:{}".format(reply.full_text))
            except StopIteration:
                print("El cursor ha llegado a su final!!!")  # "The cursor has reached its end"
                break
With this code you can get all the tweets from the Twitter account "MovistarEstu":
def main():
    t = Tweets()
    t.get_tweets(200, "MovistarEstu")
    for i, info in enumerate(t.all_tweets):
        print(f"i: {i} - ID: {info.id} - created_at: {info.created_at}")
        print(f"text: {info.full_text}\n")
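You get all the tweets and then print some information about them. All of this works fine. My problem comes when I try to get all the replies to the tweets created by "MovistarEstu" since a given ID. I receive some replies, but not all of them.
For example, I get the replies to the tweet with ID 1403443418085265411 but not the replies to the tweet with ID 1391368878861824002, and I don't know why :(
With this code, I try to get all the tweets for "MovistarEstu" since ID 1391364490286047238: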
t.get_replies(1391364490286047238)
Now I try to get all the replies to the "MovistarEstu" tweet with this ID: 1391368878861824002
t.search_replies_to_tweet(1391368878861824002)
However, I get nothing, even though you can check on Twitter that there are replies: https://twitter.com/MovistarEstu/status/1391368878861824002
If you try to get all the replies for this ID instead: 1403443418085265411
t.search_replies_to_tweet(1403443418085265411)
Then I can find the replies!!!
reply of tweet:@MovistarEstu Victoria en el 4 partido de la final
reply of tweet:@MovistarEstu Momento que no volveremos a ver en la puta vida
reply of tweet:@MovistarEstu Es buenísimo porque el CM del @MovistarEstu está boicoteando constantemente a su directiva haciéndonos recordar que el pasado fue glorioso y que nos han llevado a la absoluta mediocridad.
reply of tweet:@MovistarEstu No me habéis pedido permiso para usar la foto
reply of tweet:@MovistarEstu Yo estaba ahí con mis compis de cantera
reply of tweet:@MovistarEstu Que salgan los toreros oh oh oh!!!!
reply of tweet:@MovistarEstu Entonces salían los toreros habitualmente, ahora sólo salen los torreznos
reply of tweet:@MovistarEstu Cualquier tiempo pasado fue mejor. Asensio ya estaba por aquel entonces mamando del frasco?
reply of tweet:@MovistarEstu Claro, cuando Nacho aprobó la selectividad a la 17a
reply of tweet:@MovistarEstu 17 años ya!!! Lo recuerdo como si fuera ayer. Se forzó quinto partido de la final ACB con el Farsa. Patterson, Nicola Loncar...
reply of tweet:@MovistarEstu Segundo partido en Vistalegre de la final de liga contra el FCBarcelona. Tremenda exhibición, ambientazo en las gradas y 2-2. Todo se decidirá en el Palau (cuando ya debía estar finiquitada la final tras algún arbitraje "ejem-ejem" en Barcelona)...
reply of tweet:@MovistarEstu Pase a la final ACB?
What am I doing wrong?
From the documentation for the Twitter standard search API, which Tweepy's API.search uses:
Keep in mind that the search index has a 7-day limit. In other words, no tweets will be found for a date older than one week.
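To see how this limit applies to the two tweet IDs in the question, one can estimate each tweet's age from the ID itself. The following is only a sketch under stated assumptions: it relies on Twitter's snowflake ID layout (a millisecond timestamp stored above the low 22 bits, offset by the epoch constant 1288834974657), the helper names tweet_created_at and in_search_window are hypothetical, and the "now" value is an illustrative date rather than anything taken from the question.

```python
from datetime import datetime, timedelta, timezone

# Assumption: snowflake IDs keep a millisecond timestamp in the bits above
# the low 22; this constant is the snowflake epoch in milliseconds.
TWITTER_EPOCH_MS = 1288834974657


def tweet_created_at(tweet_id: int) -> datetime:
    """Approximate creation time (UTC) encoded in a snowflake tweet ID."""
    ms = (tweet_id >> 22) + TWITTER_EPOCH_MS
    return datetime.fromtimestamp(ms / 1000, tz=timezone.utc)


def in_search_window(tweet_id: int, now: datetime, days: int = 7) -> bool:
    """True if the tweet is recent enough to fall inside a `days`-day window."""
    return now - tweet_created_at(tweet_id) <= timedelta(days=days)


# Hypothetical "now", chosen only for illustration.
now = datetime(2021, 6, 14, tzinfo=timezone.utc)
for tid in (1391368878861824002, 1403443418085265411):
    print(tid, tweet_created_at(tid).date(), in_search_window(tid, now))
```

With that assumed date, the first ID decodes to May 2021 and falls outside the window, while the second decodes to June 2021 and falls inside it, which is consistent with replies being found for only one of the two tweets.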
https://developer.twitter.com/en/docs/twitter-api/v1/tweets/search/guides/standard-operators also says:
The Search API is not a complete index of all Tweets, but instead an index of recent Tweets. The index includes between 6-9 days of Tweets.
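Since only the replies still present in that recent index come back, it may help to walk the search results once and bucket whatever is returned by parent tweet, rather than re-scanning a one-shot iterator for every ID. This is a minimal sketch, not Tweepy's API: group_replies_by_parent is a hypothetical helper, and SimpleNamespace objects stand in for the status objects a real cursor would yield. The only attribute assumed is in_reply_to_status_id, which the question's code already uses.

```python
from collections import defaultdict
from types import SimpleNamespace


def group_replies_by_parent(statuses):
    """Consume an iterable of statuses once; map parent tweet ID -> replies."""
    grouped = defaultdict(list)
    for status in statuses:
        parent_id = getattr(status, "in_reply_to_status_id", None)
        if parent_id is not None:  # skip plain mentions that are not replies
            grouped[parent_id].append(status)
    return dict(grouped)


# Stand-in data; real code would pass the iterator stored in self.replies.
fake_results = iter([
    SimpleNamespace(in_reply_to_status_id=111, full_text="first reply"),
    SimpleNamespace(in_reply_to_status_id=None, full_text="plain mention"),
    SimpleNamespace(in_reply_to_status_id=222, full_text="other reply"),
    SimpleNamespace(in_reply_to_status_id=111, full_text="second reply"),
])

replies = group_replies_by_parent(fake_results)
for reply in replies.get(111, []):
    print("reply of tweet:{}".format(reply.full_text))
```

Looking up several tweet IDs then becomes a dictionary access, and an ID with no entry simply means no reply to it was in the results that the search returned.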