pywren and twint - Tweet download
The following code takes a username and scrapes that user's Twitter history starting from a given date:
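import pandas as pd
import twint
import pywren

def scrape_user(username):
    # Configure a twint search for a single user's tweets
    c = twint.Config()
    c.Username = username
    c.Lang = 'es'            # language code for the search
    c.Since = '2021-04-28'   # start date (YYYY-MM-DD)
    c.Hide_output = True     # do not print tweets to stdout
    c.Pandas = True          # keep the results in a Pandas DataFrame
    twint.run.Search(c)
    return twint.storage.panda.Tweets_df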
When I run the function directly, e.g. scrape_user("DeLaCalleHum"), I get the expected result, a Pandas DataFrame. However, when I use pywren (even with a single username):
pwex = pywren.default_executor()
# map expects an iterable of inputs, so wrap the single username in a list
futures = pwex.map(scrape_user, ["DeLaCalleHum"])
tweet_list = pywren.get_all_results(futures)
I get this error:
---------------------------------------------------------------------------
OSError                                   Traceback (most recent call last)
<ipython-input-31-15f9e00ead75> in <module>
----> 1 wc_list = pywren.get_all_results(futures)

~/macs30123/lib/python3.7/site-packages/pywren/wren.py in get_all_results(fs)
    117     """
    118     wait(fs, return_when=ALL_COMPLETED)
--> 119     return [f.result() for f in fs]

~/macs30123/lib/python3.7/site-packages/pywren/wren.py in <listcomp>(.0)
    117     """
    118     wait(fs, return_when=ALL_COMPLETED)
--> 119     return [f.result() for f in fs]

~/macs30123/lib/python3.7/site-packages/pywren/future.py in result(self, timeout, check_only, throw_except, storage_handler)
    146         if self._state == JobState.error:
    147             if throw_except:
--> 148                 raise self._exception
    149             else:
    150                 return None

OSError: [Errno 28] No space left on device
What am I doing wrong? Any help would be appreciated.
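A side note on the map call, independent of the error above: as far as I understand, pywren's map treats its second argument as an iterable of inputs and runs one call per element. Under that assumption, a bare string such as "DeLaCalleHum" would be split into single characters, so a lone username is safer wrapped in a list. The sketch below only demonstrates the plain-Python behavior; `as_input_list` is a hypothetical helper, not part of pywren or twint.

```python
def as_input_list(usernames):
    """Return a list of usernames, wrapping a bare string in a one-item list.

    Hypothetical helper for illustration; not part of pywren or twint.
    """
    if isinstance(usernames, str):
        return [usernames]
    return list(usernames)


# A bare string is itself an iterable of characters
first_chars = list("DeLaCalleHum")[:3]
print(first_chars)

# Wrapping keeps the username as a single input
print(as_input_list("DeLaCalleHum"))
print(as_input_list(("user_a", "user_b")))
```

The result of `as_input_list(...)` could then be passed as the second argument to `pwex.map(scrape_user, ...)`.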
After a while I found the answer. Once I added the ComprehendFullAccess policy to my pywren_exec_role_1 role in IAM, I was able to automatically parallelize function calls like this with PyWren.
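For anyone who prefers to attach the policy from code rather than the IAM console, here is a minimal sketch using boto3's IAM client. The function name `attach_managed_policy` is hypothetical, and the policy ARN assumes the standard AWS-managed policy path. The role name is the one from my setup, so yours may differ. The client is passed in as a parameter so the function can be tried against a stub before touching a real account.

```python
# Assumed ARN of the AWS-managed policy mentioned above
COMPREHEND_FULL_ACCESS_ARN = "arn:aws:iam::aws:policy/ComprehendFullAccess"


def attach_managed_policy(iam_client,
                          role_name="pywren_exec_role_1",
                          policy_arn=COMPREHEND_FULL_ACCESS_ARN):
    """Attach a managed policy to a role and return the attached policy ARNs.

    iam_client is expected to behave like boto3.client("iam").
    Only the first page of attached policies is read here.
    """
    iam_client.attach_role_policy(RoleName=role_name, PolicyArn=policy_arn)
    response = iam_client.list_attached_role_policies(RoleName=role_name)
    return [p["PolicyArn"] for p in response["AttachedPolicies"]]


# Usage against a real account (requires boto3 and IAM permissions):
#   import boto3
#   print(attach_managed_policy(boto3.client("iam")))
```

After attaching the policy, rerunning the pywren code from the question is the way to check whether the error goes away in your environment.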