Making this pandas code as lean and speedy as possible? [iterating over large DataFrames and setting]
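For context, the primary dataset I am working with is a DataFrame of 24541 rows x 1830 columns that contains either NaN or floats (stock prices). I am processing this DataFrame 11 times, each time setting values in a casted DataFrame with an identical index and columns. Here are examples of both DataFrames: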
data = pd.DataFrame.from_csv(filepath)
data = pd.DataFrame(data=data, dtype=np.float64)
#dataset of daily prices
data.head()
Out[14]:
49154 65541 32791 65568 ... 24563 81910 24571 90110
DATE ...
1925-12-31 NaN NaN NaN NaN ... NaN NaN NaN NaN
1926-01-02 NaN NaN NaN NaN ... NaN NaN NaN NaN
1926-01-04 NaN NaN NaN NaN ... NaN NaN NaN NaN
1926-01-05 NaN NaN NaN NaN ... NaN NaN NaN NaN
1926-01-06 NaN NaN NaN NaN ... NaN NaN NaN NaN
[5 rows x 1830 columns]
MA_a_frame = pd.DataFrame(
    data=0,
    index=data.index,
    columns=data.columns)
#bool DataFrame
MA_a_frame.head()
Out[15]:
49154 65541 32791 65568 ... 24563 81910 24571 90110
DATE ...
1925-12-31 0 0 0 0 ... 0 0 0 0
1926-01-02 0 0 0 0 ... 0 0 0 0
1926-01-04 0 0 0 0 ... 0 0 0 0
1926-01-05 0 0 0 0 ... 0 0 0 0
1926-01-06 0 0 0 0 ... 0 0 0 0
[5 rows x 1830 columns]
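The values in MA_a_frame (and in 10 other identical DataFrames) are set to 1 if a certain condition in the DataFrame "data" is met: namely, if the price in "data" is within 1% (the parameter "j") of a calculated value in an entirely different DataFrame, which was generated by a previous function. So in total, up to 3 large DataFrames are handled in each iteration.
As for my iterators, I simply made two separate lists from data.index and data.columns ("dates" and "securities"), so I am effectively iterating over the index and columns of data indirectly. Without further ado, here is the basis of the code that runs 11 times in total in my program (the part I am trying to speed up!):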
def gen_a():
    for date in dates:
        for security in securities:
            try:
                if type(data.loc[date, security]) is not float:
                    pass
                    #lots of the data is NaN, so skip these altogether
                elif j > math.log(
                        MA_a_csv.loc[date, security]/
                        data.loc[date, security]) > -j:
                    MA_dict['a'].loc[date, security] = 1
                    print(f'Passed {date}, {security}')
            except:
                print(f'Failed {date}, {security}')
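Now, the problem is that one loop of this code takes about 8 hours, so I am expecting close to 90 hours per run. I have an academic paper due as a graduation requirement, and with the deadline approaching these numbers are really starting to scare me! Assuming my output is perfect, things should be fine, but I would be eternally grateful to anyone with suggestions that could cut down the run time. Otherwise, I may have to narrow the scope of my data, which would reduce the power of my statistical analysis.
P.S. I am running this through Spyder on Windows 10 with an Intel i7 3970X. I do not have access to any other computing power. I considered GPU acceleration, but my GPU is a GTX 670, which is not Pascal and is therefore not compatible with CuDF.
Edit:
Here are the last five rows of the data DataFrame: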
s.head()
Out[16]:
49154 65541 32791 65568 ... 24563 81910 24571 90110
DATE ...
2018-12-24 61.55 232.70000 NaN NaN ... NaN 15.71 NaN NaN
2018-12-26 65.11 244.59000 NaN NaN ... NaN 16.48 NaN NaN
2018-12-27 64.71 252.17999 NaN NaN ... NaN 16.71 NaN NaN
2018-12-28 64.96 249.64999 NaN NaN ... NaN 16.55 NaN NaN
2018-12-31 66.09 254.50000 NaN NaN ... NaN 16.74 NaN NaN
[5 rows x 1830 columns]
Here is an example of one of the comparison DataFrames:
Out[23]:
49154 65541 32791 65568 ... 24563 81910 24571 90110
DATE ...
2018-12-24 76.3430 258.376200 NaN NaN ... NaN 19.8672 NaN NaN
2018-12-26 75.9530 258.143600 NaN NaN ... NaN 19.7980 NaN NaN
2018-12-27 75.5552 258.127199 NaN NaN ... NaN 19.7238 NaN NaN
2018-12-28 75.1382 257.878799 NaN NaN ... NaN 19.6440 NaN NaN
2018-12-31 74.7716 257.683199 NaN NaN ... NaN 19.5600 NaN NaN
[5 rows x 1830 columns]
Edit 2:
As requested, here is data.head().to_dict():
Timestamp('1926-01-05 00:00:00'): nan,
Timestamp('1926-01-06 00:00:00'): nan},
'44792': {Timestamp('1925-12-31 00:00:00'): nan,
Timestamp('1926-01-02 00:00:00'): nan,
Timestamp('1926-01-04 00:00:00'): nan,
Timestamp('1926-01-05 00:00:00'): nan,
Timestamp('1926-01-06 00:00:00'): nan},
'85753': {Timestamp('1925-12-31 00:00:00'): nan,
Timestamp('1926-01-02 00:00:00'): nan,
Timestamp('1926-01-04 00:00:00'): nan,
Timestamp('1926-01-05 00:00:00'): nan,
Timestamp('1926-01-06 00:00:00'): nan},
'20220': {Timestamp('1925-12-31 00:00:00'): nan,
Timestamp('1926-01-02 00:00:00'): nan,
Timestamp('1926-01-04 00:00:00'): nan,
Timestamp('1926-01-05 00:00:00'): nan,
Timestamp('1926-01-06 00:00:00'): nan},
'12044': {Timestamp('1925-12-31 00:00:00'): nan,
Timestamp('1926-01-02 00:00:00'): nan,
Timestamp('1926-01-04 00:00:00'): nan,
Timestamp('1926-01-05 00:00:00'): nan,
Timestamp('1926-01-06 00:00:00'): nan},
'20239': {Timestamp('1925-12-31 00:00:00'): nan,
Timestamp('1926-01-02 00:00:00'): nan,
Timestamp('1926-01-04 00:00:00'): nan,
Timestamp('1926-01-05 00:00:00'): nan,
Timestamp('1926-01-06 00:00:00'): nan},
'28433': {Timestamp('1925-12-31 00:00:00'): nan,
Timestamp('1926-01-02 00:00:00'): nan,
Timestamp('1926-01-04 00:00:00'): nan,
Timestamp('1926-01-05 00:00:00'): nan,
Timestamp('1926-01-06 00:00:00'): nan},
'12052': {Timestamp('1925-12-31 00:00:00'): nan,
Timestamp('1926-01-02 00:00:00'): nan,
Timestamp('1926-01-04 00:00:00'): nan,
Timestamp('1926-01-05 00:00:00'): nan,
Timestamp('1926-01-06 00:00:00'): nan},
'12060': {Timestamp('1925-12-31 00:00:00'): 326.0,
Timestamp('1926-01-02 00:00:00'): 326.5,
Timestamp('1926-01-04 00:00:00'): 325.0,
Timestamp('1926-01-05 00:00:00'): 325.5,
Timestamp('1926-01-06 00:00:00'): 326.25},
'12062': {Timestamp('1925-12-31 00:00:00'): nan,
Timestamp('1926-01-02 00:00:00'): nan,
Timestamp('1926-01-04 00:00:00'): nan,
Timestamp('1926-01-05 00:00:00'): nan,
Timestamp('1926-01-06 00:00:00'): nan},
'85792': {Timestamp('1925-12-31 00:00:00'): nan,
Timestamp('1926-01-02 00:00:00'): nan,
Timestamp('1926-01-04 00:00:00'): nan,
Timestamp('1926-01-05 00:00:00'): nan,
Timestamp('1926-01-06 00:00:00'): nan},
'12067': {Timestamp('1925-12-31 00:00:00'): nan,
Timestamp('1926-01-02 00:00:00'): nan,
Timestamp('1926-01-04 00:00:00'): nan,
Timestamp('1926-01-05 00:00:00'): nan,
Timestamp('1926-01-06 00:00:00'): nan},
'77605': {Timestamp('1925-12-31 00:00:00'): nan,
Timestamp('1926-01-02 00:00:00'): nan,
Timestamp('1926-01-04 00:00:00'): nan,
Timestamp('1926-01-05 00:00:00'): nan,
Timestamp('1926-01-06 00:00:00'): nan},
'77606': {Timestamp('1925-12-31 00:00:00'): nan,
Timestamp('1926-01-02 00:00:00'): nan,
Timestamp('1926-01-04 00:00:00'): nan,
Timestamp('1926-01-05 00:00:00'): nan,
Timestamp('1926-01-06 00:00:00'): nan},
'20263': {Timestamp('1925-12-31 00:00:00'): nan,
Timestamp('1926-01-02 00:00:00'): nan,
Timestamp('1926-01-04 00:00:00'): nan,
Timestamp('1926-01-05 00:00:00'): nan,
Timestamp('1926-01-06 00:00:00'): nan},
'12073': {Timestamp('1925-12-31 00:00:00'): nan,
Timestamp('1926-01-02 00:00:00'): nan,
Timestamp('1926-01-04 00:00:00'): nan,
Timestamp('1926-01-05 00:00:00'): nan,
Timestamp('1926-01-06 00:00:00'): nan},
'12076': {Timestamp('1925-12-31 00:00:00'): nan,
Timestamp('1926-01-02 00:00:00'): nan,
Timestamp('1926-01-04 00:00:00'): nan,
Timestamp('1926-01-05 00:00:00'): nan,
Timestamp('1926-01-06 00:00:00'): nan},
'12079': {Timestamp('1925-12-31 00:00:00'): 117.5,
Timestamp('1926-01-02 00:00:00'): 124.25,
Timestamp('1926-01-04 00:00:00'): 127.125,
Timestamp('1926-01-05 00:00:00'): 123.75,
Timestamp('1926-01-06 00:00:00'): 124.5},
'61241': {Timestamp('1925-12-31 00:00:00'): nan,
Timestamp('1926-01-02 00:00:00'): nan,
Timestamp('1926-01-04 00:00:00'): nan,
Timestamp('1926-01-05 00:00:00'): nan,
Timestamp('1926-01-06 00:00:00'): nan},
'12095': {Timestamp('1925-12-31 00:00:00'): nan,
Timestamp('1926-01-02 00:00:00'): nan,
Timestamp('1926-01-04 00:00:00'): nan,
Timestamp('1926-01-05 00:00:00'): nan,
Timestamp('1926-01-06 00:00:00'): nan},
'28484': {Timestamp('1925-12-31 00:00:00'): nan,
Timestamp('1926-01-02 00:00:00'): nan,
Timestamp('1926-01-04 00:00:00'): nan,
Timestamp('1926-01-05 00:00:00'): nan,
Timestamp('1926-01-06 00:00:00'): nan},
'53065': {Timestamp('1925-12-31 00:00:00'): nan,
Timestamp('1926-01-02 00:00:00'): nan,
Timestamp('1926-01-04 00:00:00'): nan,
Timestamp('1926-01-05 00:00:00'): nan,
Timestamp('1926-01-06 00:00:00'): nan},
'20298': {Timestamp('1925-12-31 00:00:00'): nan,
Timestamp('1926-01-02 00:00:00'): nan,
Timestamp('1926-01-04 00:00:00'): nan,
Timestamp('1926-01-05 00:00:00'): nan,
Timestamp('1926-01-06 00:00:00'): nan},
'77644': {Timestamp('1925-12-31 00:00:00'): nan,
Timestamp('1926-01-02 00:00:00'): nan,
Timestamp('1926-01-04 00:00:00'): nan,
Timestamp('1926-01-05 00:00:00'): nan,
Timestamp('1926-01-06 00:00:00'): nan},
'28505': {Timestamp('1925-12-31 00:00:00'): nan,
Timestamp('1926-01-02 00:00:00'): nan,
Timestamp('1926-01-04 00:00:00'): nan,
Timestamp('1926-01-05 00:00:00'): nan,
Timestamp('1926-01-06 00:00:00'): nan},
'53081': {Timestamp('1925-12-31 00:00:00'): nan,
Timestamp('1926-01-02 00:00:00'): nan,
Timestamp('1926-01-04 00:00:00'): nan,
Timestamp('1926-01-05 00:00:00'): nan,
Timestamp('1926-01-06 00:00:00'): nan},
'77659': {Timestamp('1925-12-31 00:00:00'): nan,
Timestamp('1926-01-02 00:00:00'): nan,
Timestamp('1926-01-04 00:00:00'): nan,
Timestamp('1926-01-05 00:00:00'): nan,
Timestamp('1926-01-06 00:00:00'): nan},
'12124': {Timestamp('1925-12-31 00:00:00'): nan,
Timestamp('1926-01-02 00:00:00'): nan,
Timestamp('1926-01-04 00:00:00'): nan,
Timestamp('1926-01-05 00:00:00'): nan,
Timestamp('1926-01-06 00:00:00'): nan},
'77661': {Timestamp('1925-12-31 00:00:00'): nan,
Timestamp('1926-01-02 00:00:00'): nan,
Timestamp('1926-01-04 00:00:00'): nan,
Timestamp('1926-01-05 00:00:00'): nan,
Timestamp('1926-01-06 00:00:00'): nan},
'28513': {Timestamp('1925-12-31 00:00:00'): nan,
Timestamp('1926-01-02 00:00:00'): nan,
Timestamp('1926-01-04 00:00:00'): nan,
Timestamp('1926-01-05 00:00:00'): nan,
Timestamp('1926-01-06 00:00:00'): nan},
'61284': {Timestamp('1925-12-31 00:00:00'): nan,
Timestamp('1926-01-02 00:00:00'): nan,
Timestamp('1926-01-04 00:00:00'): nan,
Timestamp('1926-01-05 00:00:00'): nan,
Timestamp('1926-01-06 00:00:00'): nan},
'77668': {Timestamp('1925-12-31 00:00:00'): nan,
Timestamp('1926-01-02 00:00:00'): nan,
Timestamp('1926-01-04 00:00:00'): nan,
Timestamp('1926-01-05 00:00:00'): nan,
Timestamp('1926-01-06 00:00:00'): nan},
'12140': {Timestamp('1925-12-31 00:00:00'): nan,
Timestamp('1926-01-02 00:00:00'): nan,
Timestamp('1926-01-04 00:00:00'): nan,
Timestamp('1926-01-05 00:00:00'): nan,
Timestamp('1926-01-06 00:00:00'): nan},
'85869': {Timestamp('1925-12-31 00:00:00'): nan,
Timestamp('1926-01-02 00:00:00'): nan,
Timestamp('1926-01-04 00:00:00'): nan,
Timestamp('1926-01-05 00:00:00'): nan,
Timestamp('1926-01-06 00:00:00'): nan},
'20343': {Timestamp('1925-12-31 00:00:00'): nan,
Timestamp('1926-01-02 00:00:00'): nan,
Timestamp('1926-01-04 00:00:00'): nan,
Timestamp('1926-01-05 00:00:00'): nan,
Timestamp('1926-01-06 00:00:00'): nan},
'28548': {Timestamp('1925-12-31 00:00:00'): nan,
Timestamp('1926-01-02 00:00:00'): nan,
Timestamp('1926-01-04 00:00:00'): nan,
Timestamp('1926-01-05 00:00:00'): nan,
Timestamp('1926-01-06 00:00:00'): nan},
'77702': {Timestamp('1925-12-31 00:00:00'): nan,
Timestamp('1926-01-02 00:00:00'): nan,
Timestamp('1926-01-04 00:00:00'): nan,
Timestamp('1926-01-05 00:00:00'): nan,
Timestamp('1926-01-06 00:00:00'): nan},
'12167': {Timestamp('1925-12-31 00:00:00'): nan,
Timestamp('1926-01-02 00:00:00'): nan,
Timestamp('1926-01-04 00:00:00'): nan,
Timestamp('1926-01-05 00:00:00'): nan,
Timestamp('1926-01-06 00:00:00'): nan},
'85908': {Timestamp('1925-12-31 00:00:00'): nan,
Timestamp('1926-01-02 00:00:00'): nan,
Timestamp('1926-01-04 00:00:00'): nan,
Timestamp('1926-01-05 00:00:00'): nan,
Timestamp('1926-01-06 00:00:00'): nan},
'12183': {Timestamp('1925-12-31 00:00:00'): 78.5,
Timestamp('1926-01-02 00:00:00'): 78.0,
Timestamp('1926-01-04 00:00:00'): 77.5,
Timestamp('1926-01-05 00:00:00'): 76.875,
Timestamp('1926-01-06 00:00:00'): 76.5},
'44951': {Timestamp('1925-12-31 00:00:00'): nan,
Timestamp('1926-01-02 00:00:00'): nan,
Timestamp('1926-01-04 00:00:00'): nan,
Timestamp('1926-01-05 00:00:00'): nan,
Timestamp('1926-01-06 00:00:00'): nan},
'85913': {Timestamp('1925-12-31 00:00:00'): nan,
Timestamp('1926-01-02 00:00:00'): nan,
Timestamp('1926-01-04 00:00:00'): nan,
Timestamp('1926-01-05 00:00:00'): nan,
Timestamp('1926-01-06 00:00:00'): nan},
'85914': {Timestamp('1925-12-31 00:00:00'): nan,
Timestamp('1926-01-02 00:00:00'): nan,
Timestamp('1926-01-04 00:00:00'): nan,
Timestamp('1926-01-05 00:00:00'): nan,
Timestamp('1926-01-06 00:00:00'): nan},
'12191': {Timestamp('1925-12-31 00:00:00'): nan,
Timestamp('1926-01-02 00:00:00'): nan,
Timestamp('1926-01-04 00:00:00'): nan,
Timestamp('1926-01-05 00:00:00'): nan,
Timestamp('1926-01-06 00:00:00'): nan},
'20386': {Timestamp('1925-12-31 00:00:00'): nan,
Timestamp('1926-01-02 00:00:00'): nan,
Timestamp('1926-01-04 00:00:00'): nan,
Timestamp('1926-01-05 00:00:00'): nan,
Timestamp('1926-01-06 00:00:00'): nan},
'77730': {Timestamp('1925-12-31 00:00:00'): nan,
Timestamp('1926-01-02 00:00:00'): nan,
Timestamp('1926-01-04 00:00:00'): nan,
Timestamp('1926-01-05 00:00:00'): nan,
Timestamp('1926-01-06 00:00:00'): nan},
'28580': {Timestamp('1925-12-31 00:00:00'): nan,
Timestamp('1926-01-02 00:00:00'): nan,
Timestamp('1926-01-04 00:00:00'): nan,
Timestamp('1926-01-05 00:00:00'): nan,
Timestamp('1926-01-06 00:00:00'): nan},
'85926': {Timestamp('1925-12-31 00:00:00'): nan,
Timestamp('1926-01-02 00:00:00'): nan,
Timestamp('1926-01-04 00:00:00'): nan,
Timestamp('1926-01-05 00:00:00'): nan,
Timestamp('1926-01-06 00:00:00'): nan},
'20394': {Timestamp('1925-12-31 00:00:00'): nan,
Timestamp('1926-01-02 00:00:00'): nan,
Timestamp('1926-01-04 00:00:00'): nan,
Timestamp('1926-01-05 00:00:00'): nan,
Timestamp('1926-01-06 00:00:00'): nan},
'69550': {Timestamp('1925-12-31 00:00:00'): nan,
Timestamp('1926-01-02 00:00:00'): nan,
Timestamp('1926-01-04 00:00:00'): nan,
Timestamp('1926-01-05 00:00:00'): nan,
Timestamp('1926-01-06 00:00:00'): nan},
'12212': {Timestamp('1925-12-31 00:00:00'): nan,
Timestamp('1926-01-02 00:00:00'): nan,
Timestamp('1926-01-04 00:00:00'): nan,
Timestamp('1926-01-05 00:00:00'): nan,
Timestamp('1926-01-06 00:00:00'): nan},
'20407': {Timestamp('1925-12-31 00:00:00'): nan,
Timestamp('1926-01-02 00:00:00'): nan,
Timestamp('1926-01-04 00:00:00'): nan,
Timestamp('1926-01-05 00:00:00'): nan,
Timestamp('1926-01-06 00:00:00'): nan},
'12220': {Timestamp('1925-12-31 00:00:00'): nan,
Timestamp('1926-01-02 00:00:00'): nan,
Timestamp('1926-01-04 00:00:00'): nan,
Timestamp('1926-01-05 00:00:00'): nan,
Timestamp('1926-01-06 00:00:00'): nan},
'20415': {Timestamp('1925-12-31 00:00:00'): nan,
Timestamp('1926-01-02 00:00:00'): nan,
Timestamp('1926-01-04 00:00:00'): nan,
Timestamp('1926-01-05 00:00:00'): nan,
Timestamp('1926-01-06 00:00:00'): nan},
'77768': {Timestamp('1925-12-31 00:00:00'): nan,
Timestamp('1926-01-02 00:00:00'): nan,
Timestamp('1926-01-04 00:00:00'): nan,
Timestamp('1926-01-05 00:00:00'): nan,
Timestamp('1926-01-06 00:00:00'): nan},
'85963': {Timestamp('1925-12-31 00:00:00'): nan,
Timestamp('1926-01-02 00:00:00'): nan,
Timestamp('1926-01-04 00:00:00'): nan,
Timestamp('1926-01-05 00:00:00'): nan,
Timestamp('1926-01-06 00:00:00'): nan},
'20431': {Timestamp('1925-12-31 00:00:00'): nan,
Timestamp('1926-01-02 00:00:00'): nan,
Timestamp('1926-01-04 00:00:00'): nan,
Timestamp('1926-01-05 00:00:00'): nan,
Timestamp('1926-01-06 00:00:00'): nan},
'45014': {Timestamp('1925-12-31 00:00:00'): nan,
Timestamp('1926-01-02 00:00:00'): nan,
Timestamp('1926-01-04 00:00:00'): nan,
Timestamp('1926-01-05 00:00:00'): nan,
Timestamp('1926-01-06 00:00:00'): nan},
'61399': {Timestamp('1925-12-31 00:00:00'): nan,
Timestamp('1926-01-02 00:00:00'): nan,
Timestamp('1926-01-04 00:00:00'): nan,
Timestamp('1926-01-05 00:00:00'): nan,
Timestamp('1926-01-06 00:00:00'): nan},
'69607': {Timestamp('1925-12-31 00:00:00'): nan,
Timestamp('1926-01-02 00:00:00'): nan,
Timestamp('1926-01-04 00:00:00'): nan,
Timestamp('1926-01-05 00:00:00'): nan,
Timestamp('1926-01-06 00:00:00'): nan},
'85991': {Timestamp('1925-12-31 00:00:00'): nan,
Timestamp('1926-01-02 00:00:00'): nan,
Timestamp('1926-01-04 00:00:00'): nan,
Timestamp('1926-01-05 00:00:00'): nan,
Timestamp('1926-01-06 00:00:00'): nan},
'53225': {Timestamp('1925-12-31 00:00:00'): nan,
Timestamp('1926-01-02 00:00:00'): nan,
Timestamp('1926-01-04 00:00:00'): nan,
Timestamp('1926-01-05 00:00:00'): nan,
Timestamp('1926-01-06 00:00:00'): nan},
'20474': {Timestamp('1925-12-31 00:00:00'): nan,
Timestamp('1926-01-02 00:00:00'): nan,
Timestamp('1926-01-04 00:00:00'): nan,
Timestamp('1926-01-05 00:00:00'): nan,
Timestamp('1926-01-06 00:00:00'): nan},
'20482': {Timestamp('1925-12-31 00:00:00'): nan,
Timestamp('1926-01-02 00:00:00'): nan,
Timestamp('1926-01-04 00:00:00'): nan,
Timestamp('1926-01-05 00:00:00'): nan,
Timestamp('1926-01-06 00:00:00'): nan},
'86021': {Timestamp('1925-12-31 00:00:00'): nan,
Timestamp('1926-01-02 00:00:00'): nan,
Timestamp('1926-01-04 00:00:00'): nan,
Timestamp('1926-01-05 00:00:00'): nan,
Timestamp('1926-01-06 00:00:00'): nan},
'45065': {Timestamp('1925-12-31 00:00:00'): nan,
Timestamp('1926-01-02 00:00:00'): nan,
Timestamp('1926-01-04 00:00:00'): nan,
Timestamp('1926-01-05 00:00:00'): nan,
Timestamp('1926-01-06 00:00:00'): nan},
'12298': {Timestamp('1925-12-31 00:00:00'): nan,
Timestamp('1926-01-02 00:00:00'): nan,
Timestamp('1926-01-04 00:00:00'): nan,
Timestamp('1926-01-05 00:00:00'): nan,
Timestamp('1926-01-06 00:00:00'): nan},
'69649': {Timestamp('1925-12-31 00:00:00'): nan,
Timestamp('1926-01-02 00:00:00'): nan,
Timestamp('1926-01-04 00:00:00'): nan,
Timestamp('1926-01-05 00:00:00'): nan,
Timestamp('1926-01-06 00:00:00'): nan},
'12308': {Timestamp('1925-12-31 00:00:00'): nan,
Timestamp('1926-01-02 00:00:00'): nan,
Timestamp('1926-01-04 00:00:00'): nan,
Timestamp('1926-01-05 00:00:00'): nan,
Timestamp('1926-01-06 00:00:00'): nan},
'20503': {Timestamp('1925-12-31 00:00:00'): nan,
Timestamp('1926-01-02 00:00:00'): nan,
Timestamp('1926-01-04 00:00:00'): nan,
Timestamp('1926-01-05 00:00:00'): nan,
Timestamp('1926-01-06 00:00:00'): nan},
'45081': {Timestamp('1925-12-31 00:00:00'): nan,
Timestamp('1926-01-02 00:00:00'): nan,
Timestamp('1926-01-04 00:00:00'): nan,
Timestamp('1926-01-05 00:00:00'): nan,
Timestamp('1926-01-06 00:00:00'): nan},
'86041': {Timestamp('1925-12-31 00:00:00'): nan,
Timestamp('1926-01-02 00:00:00'): nan,
Timestamp('1926-01-04 00:00:00'): nan,
Timestamp('1926-01-05 00:00:00'): nan,
Timestamp('1926-01-06 00:00:00'): nan},
'12319': {Timestamp('1925-12-31 00:00:00'): nan,
Timestamp('1926-01-02 00:00:00'): nan,
Timestamp('1926-01-04 00:00:00'): nan,
Timestamp('1926-01-05 00:00:00'): nan,
Timestamp('1926-01-06 00:00:00'): nan},
'20511': {Timestamp('1925-12-31 00:00:00'): nan,
Timestamp('1926-01-02 00:00:00'): nan,
Timestamp('1926-01-04 00:00:00'): nan,
Timestamp('1926-01-05 00:00:00'): nan,
Timestamp('1926-01-06 00:00:00'): nan},
'12343': {Timestamp('1925-12-31 00:00:00'): nan,
Timestamp('1926-01-02 00:00:00'): nan,
Timestamp('1926-01-04 00:00:00'): nan,
Timestamp('1926-01-05 00:00:00'): nan,
Timestamp('1926-01-06 00:00:00'): nan},
'12345': {Timestamp('1925-12-31 00:00:00'): nan,
Timestamp('1926-01-02 00:00:00'): nan,
Timestamp('1926-01-04 00:00:00'): nan,
Timestamp('1926-01-05 00:00:00'): nan,
Timestamp('1926-01-06 00:00:00'): nan},
'20554': {Timestamp('1925-12-31 00:00:00'): nan,
Timestamp('1926-01-02 00:00:00'): nan,
Timestamp('1926-01-04 00:00:00'): nan,
Timestamp('1926-01-05 00:00:00'): nan,
Timestamp('1926-01-06 00:00:00'): nan},
'12369': {Timestamp('1925-12-31 00:00:00'): nan,
Timestamp('1926-01-02 00:00:00'): nan,
Timestamp('1926-01-04 00:00:00'): nan,
Timestamp('1926-01-05 00:00:00'): nan,
Timestamp('1926-01-06 00:00:00'): nan},
'20562': {Timestamp('1925-12-31 00:00:00'): nan,
Timestamp('1926-01-02 00:00:00'): nan,
Timestamp('1926-01-04 00:00:00'): nan,
Timestamp('1926-01-05 00:00:00'): nan,
Timestamp('1926-01-06 00:00:00'): nan},
'86102': {Timestamp('1925-12-31 00:00:00'): nan,
Timestamp('1926-01-02 00:00:00'): nan,
Timestamp('1926-01-04 00:00:00'): nan,
Timestamp('1926-01-05 00:00:00'): nan,
Timestamp('1926-01-06 00:00:00'): nan},
'20570': {Timestamp('1925-12-31 00:00:00'): nan,
Timestamp('1926-01-02 00:00:00'): nan,
Timestamp('1926-01-04 00:00:00'): nan,
Timestamp('1926-01-05 00:00:00'): nan,
Timestamp('1926-01-06 00:00:00'): nan},
'86111': {Timestamp('1925-12-31 00:00:00'): nan,
Timestamp('1926-01-02 00:00:00'): nan,
Timestamp('1926-01-04 00:00:00'): nan,
Timestamp('1926-01-05 00:00:00'): nan,
Timestamp('1926-01-06 00:00:00'): nan},
'12394': {Timestamp('1925-12-31 00:00:00'): 123.5,
Timestamp('1926-01-02 00:00:00'): 124.0,
Timestamp('1926-01-04 00:00:00'): 123.25,
Timestamp('1926-01-05 00:00:00'): 123.5,
Timestamp('1926-01-06 00:00:00'): 122.75},
'36978': {Timestamp('1925-12-31 00:00:00'): nan,
Timestamp('1926-01-02 00:00:00'): nan,
Timestamp('1926-01-04 00:00:00'): nan,
Timestamp('1926-01-05 00:00:00'): nan,
Timestamp('1926-01-06 00:00:00'): nan},
'86136': {Timestamp('1925-12-31 00:00:00'): nan,
Timestamp('1926-01-02 00:00:00'): nan,
Timestamp('1926-01-04 00:00:00'): nan,
Timestamp('1926-01-05 00:00:00'): nan,
Timestamp('1926-01-06 00:00:00'): nan},
'28804': {Timestamp('1925-12-31 00:00:00'): nan,
Timestamp('1926-01-02 00:00:00'): nan,
Timestamp('1926-01-04 00:00:00'): nan,
Timestamp('1926-01-05 00:00:00'): nan,
Timestamp('1926-01-06 00:00:00'): nan},
'86158': {Timestamp('1925-12-31 00:00:00'): nan,
Timestamp('1926-01-02 00:00:00'): nan,
Timestamp('1926-01-04 00:00:00'): nan,
Timestamp('1926-01-05 00:00:00'): nan,
Timestamp('1926-01-06 00:00:00'): nan},
'12431': {Timestamp('1925-12-31 00:00:00'): nan,
Timestamp('1926-01-02 00:00:00'): nan,
Timestamp('1926-01-04 00:00:00'): nan,
Timestamp('1926-01-05 00:00:00'): nan,
Timestamp('1926-01-06 00:00:00'): nan},
'61583': {Timestamp('1925-12-31 00:00:00'): nan,
Timestamp('1926-01-02 00:00:00'): nan,
Timestamp('1926-01-04 00:00:00'): nan,
Timestamp('1926-01-05 00:00:00'): nan,
Timestamp('1926-01-06 00:00:00'): nan},
'20626': {Timestamp('1925-12-31 00:00:00'): nan,
Timestamp('1926-01-02 00:00:00'): nan,
Timestamp('1926-01-04 00:00:00'): nan,
Timestamp('1926-01-05 00:00:00'): nan,
Timestamp('1926-01-06 00:00:00'): nan},
'77976': {Timestamp('1925-12-31 00:00:00'): nan,
Timestamp('1926-01-02 00:00:00'): nan,
Timestamp('1926-01-04 00:00:00'): nan,
Timestamp('1926-01-05 00:00:00'): nan,
Timestamp('1926-01-06 00:00:00'): nan},
'53401': {Timestamp('1925-12-31 00:00:00'): nan,
Timestamp('1926-01-02 00:00:00'): nan,
Timestamp('1926-01-04 00:00:00'): nan,
Timestamp('1926-01-05 00:00:00'): nan,
Timestamp('1926-01-06 00:00:00'): nan},
'86176': {Timestamp('1925-12-31 00:00:00'): nan,
Timestamp('1926-01-02 00:00:00'): nan,
Timestamp('1926-01-04 00:00:00'): nan,
Timestamp('1926-01-05 00:00:00'): nan,
Timestamp('1926-01-06 00:00:00'): nan},
'12449': {Timestamp('1925-12-31 00:00:00'): nan,
Timestamp('1926-01-02 00:00:00'): nan,
Timestamp('1926-01-04 00:00:00'): nan,
Timestamp('1926-01-05 00:00:00'): nan,
Timestamp('1926-01-06 00:00:00'): nan},
'69796': {Timestamp('1925-12-31 00:00:00'): nan,
Timestamp('1926-01-02 00:00:00'): nan,
Timestamp('1926-01-04 00:00:00'): nan,
Timestamp('1926-01-05 00:00:00'): nan,
Timestamp('1926-01-06 00:00:00'): nan},
'12456': {Timestamp('1925-12-31 00:00:00'): nan,
Timestamp('1926-01-02 00:00:00'): nan,
Timestamp('1926-01-04 00:00:00'): nan,
Timestamp('1926-01-05 00:00:00'): nan,
Timestamp('1926-01-06 00:00:00'): nan},
'45225': {Timestamp('1925-12-31 00:00:00'): nan,
Timestamp('1926-01-02 00:00:00'): nan,
Timestamp('1926-01-04 00:00:00'): nan,
Timestamp('1926-01-05 00:00:00'): nan,
Timestamp('1926-01-06 00:00:00'): nan},
'12458': {Timestamp('1925-12-31 00:00:00'): nan,
Timestamp('1926-01-02 00:00:00'): nan,
Timestamp('1926-01-04 00:00:00'): nan,
Timestamp('1926-01-05 00:00:00'): nan,
Timestamp('1926-01-06 00:00:00'): nan},
'20650': {Timestamp('1925-12-31 00:00:00'): nan,
Timestamp('1926-01-02 00:00:00'): nan,
Timestamp('1926-01-04 00:00:00'): nan,
Timestamp('1926-01-05 00:00:00'): nan,
Timestamp('1926-01-06 00:00:00'): nan},
'28847': {Timestamp('1925-12-31 00:00:00'): nan,
Timestamp('1926-01-02 00:00:00'): nan,
Timestamp('1926-01-04 00:00:00'): nan,
Timestamp('1926-01-05 00:00:00'): nan,
Timestamp('1926-01-06 00:00:00'): nan},
...}
Unfortunately, I have run out of space for this post, but MA_a_csv.head().to_dict() yields the same as the above, except that it is all NaN rather than containing any data points.
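Perhaps use the chunksize parameter when reading the csv. You will need to experiment to find the best size to use. Note that chunksize is a number of rows, not bytes; a rule of thumb I have heard is to size each chunk so that it takes up about half of your available memory.
chunks = pd.read_csv("your.csv", chunksize=rows_per_chunk)  # returns an iterator of DataFrames
When writing the results back to a file, you need to make sure append mode is set:
df.to_csv("yourresults.csv", mode='a')
Either delete the file each time you run your code, or make sure the first call to to_csv() is done in write mode (the default).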
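To make the chunked read and append-mode write concrete, here is a minimal sketch. The function name process_in_chunks, the file names, and the np.log transformation are placeholders of mine, not anything from the question; the point is only the pattern of writing the first chunk in write mode with a header and appending the rest without one.

```python
import os
import tempfile

import numpy as np
import pandas as pd


def process_in_chunks(src_path, dst_path, rows_per_chunk):
    """Read src_path a few rows at a time, transform each chunk, append it to dst_path."""
    # With chunksize set, read_csv returns an iterator of DataFrames.
    reader = pd.read_csv(src_path, index_col=0, chunksize=rows_per_chunk)
    for i, chunk in enumerate(reader):
        result = np.log(chunk)  # placeholder for the real per-chunk work
        first = i == 0
        # First chunk: write mode with header. Later chunks: append, no header.
        result.to_csv(dst_path, mode="w" if first else "a", header=first)


# Tiny demonstration on a throwaway file (hypothetical data).
tmp_dir = tempfile.mkdtemp()
src = os.path.join(tmp_dir, "prices.csv")
dst = os.path.join(tmp_dir, "results.csv")
demo = pd.DataFrame({"A": [1.0, 2.0, 3.0, 4.0, 5.0],
                     "B": [10.0, 20.0, 30.0, 40.0, 50.0]})
demo.to_csv(src)

process_in_chunks(src, dst, rows_per_chunk=2)
out = pd.read_csv(dst, index_col=0)
```

Opening the first chunk in write mode also takes care of stale output from a previous run, so there is no need to delete the results file by hand.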
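Other options I would try:
1) Use a cloud resource such as AWS EC2 and rent a high-spec, high-memory machine, transfer your data and code to it, and let it run your code. It should be a lot faster.
2) I would consider using something like PySpark to distribute the load across multiple machines, but if you are not already familiar with it, it may take some time to get up to speed.
Good luck!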
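Combining two short comments into an answer.
1) The statement
j > math.log(
    MA_a_csv.loc[date, security]/
    data.loc[date, security]) > -j
can be simplified slightly with abs, e.g. j > abs(...), and can be sped up significantly by computing the logs once, separately, and using the fact that log(a/b) == log(a) - log(b).
Even if each cell is only evaluated once per pass, you can compute the logs once and write them back out, which speeds up re-runs.
2) If you have those print statements in your actual code, they will take up a significant part of the total time.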
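As a small illustration of point 1, here is a sketch on made-up numbers (the arrays and the value of j are my own assumptions, not the asker's data). It takes the logs once and checks that the abs form agrees with the chained comparison:

```python
import numpy as np

j = 0.01  # assumed tolerance, standing in for the question's parameter "j"

# Hypothetical prices and comparison values, including missing data.
prices = np.array([100.0, 50.0, np.nan, 20.0])
ma = np.array([100.5, 52.0, 30.0, np.nan])

# Take each log once; these arrays can be cached and reused across passes.
log_prices = np.log(prices)
log_ma = np.log(ma)

# log(a/b) == log(a) - log(b), and -j < x < j is the same test as abs(x) < j.
within_band = np.abs(log_ma - log_prices) < j

# The original chained form, written element-wise for comparison.
ratio_logs = np.log(ma / prices)
chained = (j > ratio_logs) & (ratio_logs > -j)
```

Any comparison against NaN is False, so cells with missing data drop out on their own without an explicit type check.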
I made my own sample data generator based on the examples you provided. I think it fits what you have, but let me know if it does not. If the data matches, do not worry about the details of how I made it.
import numpy as np
import pandas as pd

rows = 6
cols = 5
np.random.seed(0)
data = pd.DataFrame(np.random.rand(rows, cols) * 100,
                    index=pd.date_range(start='1928-12-31', periods=rows, freq='D'))
# blank out the leading rows of a few randomly chosen columns to mimic the NaNs
nan_cols = len(data.columns) // 2
random_indices = zip(pd.Series(data.index.values[:-rows // 2])
                       .sample(nan_cols, random_state=1, replace=True),
                     pd.Series(data.columns).sample(nan_cols, random_state=2))
for row, col in random_indices:
    data.loc[:row, col] = np.nan
MA_a_csv = data * (1 + (np.random.rand(rows, cols) / 50
                        * np.random.choice([-1, 1], size=(rows, cols))))
So data looks like
0 1 2 3 4
1928-12-31 54.881350 71.518937 NaN 54.488318 NaN
1929-01-01 64.589411 43.758721 NaN 96.366276 38.344152
1929-01-02 79.172504 52.889492 56.804456 92.559664 7.103606
1929-01-03 8.712930 2.021840 83.261985 77.815675 87.001215
1929-01-04 97.861834 79.915856 46.147936 78.052918 11.827443
1929-01-05 63.992102 14.335329 94.466892 52.184832 41.466194
and MA_a_csv looks like
0 1 2 3 4
1928-12-31 55.171734 72.626384 NaN 55.107778 NaN
1929-01-01 63.791557 44.294412 NaN 98.185186 38.867028
1929-01-02 78.603241 53.351780 57.597027 92.448175 7.008877
1929-01-03 8.829794 2.013333 83.047291 77.324770 86.368349
1929-01-04 98.977844 80.616881 45.235708 77.893620 11.876852
1929-01-05 63.785651 14.522579 94.945445 52.671519 41.668902
I ran these through something that looks like your gen_a, and then made a vectorized version that gave the same answer:
logs = np.log(MA_a_csv / data)
ans = ((j > logs) & (logs > -j)).replace({True: 1, False: 0})
where ans is
0 1 2 3 4
1928-12-31 1 0 0 0 0
1929-01-01 0 0 0 0 0
1929-01-02 1 1 0 1 0
1929-01-03 0 1 1 1 1
1929-01-04 0 1 0 1 1
1929-01-05 1 0 1 1 1
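np.log can operate on an entire array at once, and pandas is probably doing something fancy to vectorize the greater-than comparisons too. & is an element-wise (bitwise) and, so it just checks whether both conditions are true at each position.
This was 180 times faster than my version of gen_a, which had no try/except or print statements, so it should be an even bigger improvement over your code.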
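If you want to convince yourself that the two approaches agree before switching, here is a self-contained check on a tiny made-up frame. The values, the column names "A" and "B", and j = 0.01 are my own assumptions, and the loop is a simplified stand-in for gen_a rather than your exact code:

```python
import math

import numpy as np
import pandas as pd

j = 0.01  # assumed tolerance
idx = pd.date_range("1928-12-31", periods=3, freq="D")
data = pd.DataFrame({"A": [100.0, np.nan, 50.0],
                     "B": [20.0, 30.0, np.nan]}, index=idx)
ma = pd.DataFrame({"A": [100.5, 10.0, 52.0],
                   "B": [19.9, np.nan, 5.0]}, index=idx)

# Cell-by-cell version, in the spirit of gen_a.
looped = pd.DataFrame(0, index=data.index, columns=data.columns)
for date in data.index:
    for security in data.columns:
        price = data.at[date, security]
        ref = ma.at[date, security]
        if np.isnan(price) or np.isnan(ref):
            continue  # skip missing data
        if j > math.log(ref / price) > -j:
            looped.at[date, security] = 1

# Vectorized version: one log over the whole frame, then two comparisons.
logs = np.log(ma / data)
vectorized = ((logs < j) & (logs > -j)).astype(int)
```

Both frames flag only the first row here. Once the results match on a sample like this, the loop can be dropped.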
You also do not need the .replace({True: 1, False: 0}) part: in Python, 1 == True and 0 == False, so you should be able to use them interchangeably.
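A quick sketch of what that interchangeability looks like in practice, on a small made-up boolean frame (the names here are mine). The frame will still print as True/False, so if you want literal 0/1 in the output, astype(int) is a cheap way to get it:

```python
import pandas as pd

# Hypothetical result of the vectorized comparison.
mask = pd.DataFrame({"A": [True, False, True],
                     "B": [False, False, True]})

# Booleans behave like 0/1 in arithmetic, so counting hits needs no conversion.
hits_per_column = mask.sum()
total_hits = int(mask.values.sum())

# Explicit integers, only needed if the file or display must show 0/1.
as_ints = mask.astype(int)
```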
Let me know if you have any problems with this. For further reading, I recommend Tom Augspurger's Modern Pandas series; the Fast Pandas part is especially applicable.
Timestamp('1926-01-04 00:00:00'): nan,
Timestamp('1926-01-05 00:00:00'): nan,
Timestamp('1926-01-06 00:00:00'): nan},
'77659': {Timestamp('1925-12-31 00:00:00'): nan,
Timestamp('1926-01-02 00:00:00'): nan,
Timestamp('1926-01-04 00:00:00'): nan,
Timestamp('1926-01-05 00:00:00'): nan,
Timestamp('1926-01-06 00:00:00'): nan},
'12124': {Timestamp('1925-12-31 00:00:00'): nan,
Timestamp('1926-01-02 00:00:00'): nan,
Timestamp('1926-01-04 00:00:00'): nan,
Timestamp('1926-01-05 00:00:00'): nan,
Timestamp('1926-01-06 00:00:00'): nan},
'77661': {Timestamp('1925-12-31 00:00:00'): nan,
Timestamp('1926-01-02 00:00:00'): nan,
Timestamp('1926-01-04 00:00:00'): nan,
Timestamp('1926-01-05 00:00:00'): nan,
Timestamp('1926-01-06 00:00:00'): nan},
'28513': {Timestamp('1925-12-31 00:00:00'): nan,
Timestamp('1926-01-02 00:00:00'): nan,
Timestamp('1926-01-04 00:00:00'): nan,
Timestamp('1926-01-05 00:00:00'): nan,
Timestamp('1926-01-06 00:00:00'): nan},
'61284': {Timestamp('1925-12-31 00:00:00'): nan,
Timestamp('1926-01-02 00:00:00'): nan,
Timestamp('1926-01-04 00:00:00'): nan,
Timestamp('1926-01-05 00:00:00'): nan,
Timestamp('1926-01-06 00:00:00'): nan},
'77668': {Timestamp('1925-12-31 00:00:00'): nan,
Timestamp('1926-01-02 00:00:00'): nan,
Timestamp('1926-01-04 00:00:00'): nan,
Timestamp('1926-01-05 00:00:00'): nan,
Timestamp('1926-01-06 00:00:00'): nan},
'12140': {Timestamp('1925-12-31 00:00:00'): nan,
Timestamp('1926-01-02 00:00:00'): nan,
Timestamp('1926-01-04 00:00:00'): nan,
Timestamp('1926-01-05 00:00:00'): nan,
Timestamp('1926-01-06 00:00:00'): nan},
'85869': {Timestamp('1925-12-31 00:00:00'): nan,
Timestamp('1926-01-02 00:00:00'): nan,
Timestamp('1926-01-04 00:00:00'): nan,
Timestamp('1926-01-05 00:00:00'): nan,
Timestamp('1926-01-06 00:00:00'): nan},
'20343': {Timestamp('1925-12-31 00:00:00'): nan,
Timestamp('1926-01-02 00:00:00'): nan,
Timestamp('1926-01-04 00:00:00'): nan,
Timestamp('1926-01-05 00:00:00'): nan,
Timestamp('1926-01-06 00:00:00'): nan},
'28548': {Timestamp('1925-12-31 00:00:00'): nan,
Timestamp('1926-01-02 00:00:00'): nan,
Timestamp('1926-01-04 00:00:00'): nan,
Timestamp('1926-01-05 00:00:00'): nan,
Timestamp('1926-01-06 00:00:00'): nan},
'77702': {Timestamp('1925-12-31 00:00:00'): nan,
Timestamp('1926-01-02 00:00:00'): nan,
Timestamp('1926-01-04 00:00:00'): nan,
Timestamp('1926-01-05 00:00:00'): nan,
Timestamp('1926-01-06 00:00:00'): nan},
'12167': {Timestamp('1925-12-31 00:00:00'): nan,
Timestamp('1926-01-02 00:00:00'): nan,
Timestamp('1926-01-04 00:00:00'): nan,
Timestamp('1926-01-05 00:00:00'): nan,
Timestamp('1926-01-06 00:00:00'): nan},
'85908': {Timestamp('1925-12-31 00:00:00'): nan,
Timestamp('1926-01-02 00:00:00'): nan,
Timestamp('1926-01-04 00:00:00'): nan,
Timestamp('1926-01-05 00:00:00'): nan,
Timestamp('1926-01-06 00:00:00'): nan},
'12183': {Timestamp('1925-12-31 00:00:00'): 78.5,
Timestamp('1926-01-02 00:00:00'): 78.0,
Timestamp('1926-01-04 00:00:00'): 77.5,
Timestamp('1926-01-05 00:00:00'): 76.875,
Timestamp('1926-01-06 00:00:00'): 76.5},
'44951': {Timestamp('1925-12-31 00:00:00'): nan,
Timestamp('1926-01-02 00:00:00'): nan,
Timestamp('1926-01-04 00:00:00'): nan,
Timestamp('1926-01-05 00:00:00'): nan,
Timestamp('1926-01-06 00:00:00'): nan},
'85913': {Timestamp('1925-12-31 00:00:00'): nan,
Timestamp('1926-01-02 00:00:00'): nan,
Timestamp('1926-01-04 00:00:00'): nan,
Timestamp('1926-01-05 00:00:00'): nan,
Timestamp('1926-01-06 00:00:00'): nan},
'85914': {Timestamp('1925-12-31 00:00:00'): nan,
Timestamp('1926-01-02 00:00:00'): nan,
Timestamp('1926-01-04 00:00:00'): nan,
Timestamp('1926-01-05 00:00:00'): nan,
Timestamp('1926-01-06 00:00:00'): nan},
'12191': {Timestamp('1925-12-31 00:00:00'): nan,
Timestamp('1926-01-02 00:00:00'): nan,
Timestamp('1926-01-04 00:00:00'): nan,
Timestamp('1926-01-05 00:00:00'): nan,
Timestamp('1926-01-06 00:00:00'): nan},
'20386': {Timestamp('1925-12-31 00:00:00'): nan,
Timestamp('1926-01-02 00:00:00'): nan,
Timestamp('1926-01-04 00:00:00'): nan,
Timestamp('1926-01-05 00:00:00'): nan,
Timestamp('1926-01-06 00:00:00'): nan},
'77730': {Timestamp('1925-12-31 00:00:00'): nan,
Timestamp('1926-01-02 00:00:00'): nan,
Timestamp('1926-01-04 00:00:00'): nan,
Timestamp('1926-01-05 00:00:00'): nan,
Timestamp('1926-01-06 00:00:00'): nan},
'28580': {Timestamp('1925-12-31 00:00:00'): nan,
Timestamp('1926-01-02 00:00:00'): nan,
Timestamp('1926-01-04 00:00:00'): nan,
Timestamp('1926-01-05 00:00:00'): nan,
Timestamp('1926-01-06 00:00:00'): nan},
'85926': {Timestamp('1925-12-31 00:00:00'): nan,
Timestamp('1926-01-02 00:00:00'): nan,
Timestamp('1926-01-04 00:00:00'): nan,
Timestamp('1926-01-05 00:00:00'): nan,
Timestamp('1926-01-06 00:00:00'): nan},
'20394': {Timestamp('1925-12-31 00:00:00'): nan,
Timestamp('1926-01-02 00:00:00'): nan,
Timestamp('1926-01-04 00:00:00'): nan,
Timestamp('1926-01-05 00:00:00'): nan,
Timestamp('1926-01-06 00:00:00'): nan},
'69550': {Timestamp('1925-12-31 00:00:00'): nan,
Timestamp('1926-01-02 00:00:00'): nan,
Timestamp('1926-01-04 00:00:00'): nan,
Timestamp('1926-01-05 00:00:00'): nan,
Timestamp('1926-01-06 00:00:00'): nan},
'12212': {Timestamp('1925-12-31 00:00:00'): nan,
Timestamp('1926-01-02 00:00:00'): nan,
Timestamp('1926-01-04 00:00:00'): nan,
Timestamp('1926-01-05 00:00:00'): nan,
Timestamp('1926-01-06 00:00:00'): nan},
'20407': {Timestamp('1925-12-31 00:00:00'): nan,
Timestamp('1926-01-02 00:00:00'): nan,
Timestamp('1926-01-04 00:00:00'): nan,
Timestamp('1926-01-05 00:00:00'): nan,
Timestamp('1926-01-06 00:00:00'): nan},
'12220': {Timestamp('1925-12-31 00:00:00'): nan,
Timestamp('1926-01-02 00:00:00'): nan,
Timestamp('1926-01-04 00:00:00'): nan,
Timestamp('1926-01-05 00:00:00'): nan,
Timestamp('1926-01-06 00:00:00'): nan},
'20415': {Timestamp('1925-12-31 00:00:00'): nan,
Timestamp('1926-01-02 00:00:00'): nan,
Timestamp('1926-01-04 00:00:00'): nan,
Timestamp('1926-01-05 00:00:00'): nan,
Timestamp('1926-01-06 00:00:00'): nan},
'77768': {Timestamp('1925-12-31 00:00:00'): nan,
Timestamp('1926-01-02 00:00:00'): nan,
Timestamp('1926-01-04 00:00:00'): nan,
Timestamp('1926-01-05 00:00:00'): nan,
Timestamp('1926-01-06 00:00:00'): nan},
'85963': {Timestamp('1925-12-31 00:00:00'): nan,
Timestamp('1926-01-02 00:00:00'): nan,
Timestamp('1926-01-04 00:00:00'): nan,
Timestamp('1926-01-05 00:00:00'): nan,
Timestamp('1926-01-06 00:00:00'): nan},
'20431': {Timestamp('1925-12-31 00:00:00'): nan,
Timestamp('1926-01-02 00:00:00'): nan,
Timestamp('1926-01-04 00:00:00'): nan,
Timestamp('1926-01-05 00:00:00'): nan,
Timestamp('1926-01-06 00:00:00'): nan},
'45014': {Timestamp('1925-12-31 00:00:00'): nan,
Timestamp('1926-01-02 00:00:00'): nan,
Timestamp('1926-01-04 00:00:00'): nan,
Timestamp('1926-01-05 00:00:00'): nan,
Timestamp('1926-01-06 00:00:00'): nan},
'61399': {Timestamp('1925-12-31 00:00:00'): nan,
Timestamp('1926-01-02 00:00:00'): nan,
Timestamp('1926-01-04 00:00:00'): nan,
Timestamp('1926-01-05 00:00:00'): nan,
Timestamp('1926-01-06 00:00:00'): nan},
'69607': {Timestamp('1925-12-31 00:00:00'): nan,
Timestamp('1926-01-02 00:00:00'): nan,
Timestamp('1926-01-04 00:00:00'): nan,
Timestamp('1926-01-05 00:00:00'): nan,
Timestamp('1926-01-06 00:00:00'): nan},
'85991': {Timestamp('1925-12-31 00:00:00'): nan,
Timestamp('1926-01-02 00:00:00'): nan,
Timestamp('1926-01-04 00:00:00'): nan,
Timestamp('1926-01-05 00:00:00'): nan,
Timestamp('1926-01-06 00:00:00'): nan},
'53225': {Timestamp('1925-12-31 00:00:00'): nan,
Timestamp('1926-01-02 00:00:00'): nan,
Timestamp('1926-01-04 00:00:00'): nan,
Timestamp('1926-01-05 00:00:00'): nan,
Timestamp('1926-01-06 00:00:00'): nan},
'20474': {Timestamp('1925-12-31 00:00:00'): nan,
Timestamp('1926-01-02 00:00:00'): nan,
Timestamp('1926-01-04 00:00:00'): nan,
Timestamp('1926-01-05 00:00:00'): nan,
Timestamp('1926-01-06 00:00:00'): nan},
'20482': {Timestamp('1925-12-31 00:00:00'): nan,
Timestamp('1926-01-02 00:00:00'): nan,
Timestamp('1926-01-04 00:00:00'): nan,
Timestamp('1926-01-05 00:00:00'): nan,
Timestamp('1926-01-06 00:00:00'): nan},
'86021': {Timestamp('1925-12-31 00:00:00'): nan,
Timestamp('1926-01-02 00:00:00'): nan,
Timestamp('1926-01-04 00:00:00'): nan,
Timestamp('1926-01-05 00:00:00'): nan,
Timestamp('1926-01-06 00:00:00'): nan},
'45065': {Timestamp('1925-12-31 00:00:00'): nan,
Timestamp('1926-01-02 00:00:00'): nan,
Timestamp('1926-01-04 00:00:00'): nan,
Timestamp('1926-01-05 00:00:00'): nan,
Timestamp('1926-01-06 00:00:00'): nan},
'12298': {Timestamp('1925-12-31 00:00:00'): nan,
Timestamp('1926-01-02 00:00:00'): nan,
Timestamp('1926-01-04 00:00:00'): nan,
Timestamp('1926-01-05 00:00:00'): nan,
Timestamp('1926-01-06 00:00:00'): nan},
'69649': {Timestamp('1925-12-31 00:00:00'): nan,
Timestamp('1926-01-02 00:00:00'): nan,
Timestamp('1926-01-04 00:00:00'): nan,
Timestamp('1926-01-05 00:00:00'): nan,
Timestamp('1926-01-06 00:00:00'): nan},
'12308': {Timestamp('1925-12-31 00:00:00'): nan,
Timestamp('1926-01-02 00:00:00'): nan,
Timestamp('1926-01-04 00:00:00'): nan,
Timestamp('1926-01-05 00:00:00'): nan,
Timestamp('1926-01-06 00:00:00'): nan},
'20503': {Timestamp('1925-12-31 00:00:00'): nan,
Timestamp('1926-01-02 00:00:00'): nan,
Timestamp('1926-01-04 00:00:00'): nan,
Timestamp('1926-01-05 00:00:00'): nan,
Timestamp('1926-01-06 00:00:00'): nan},
'45081': {Timestamp('1925-12-31 00:00:00'): nan,
Timestamp('1926-01-02 00:00:00'): nan,
Timestamp('1926-01-04 00:00:00'): nan,
Timestamp('1926-01-05 00:00:00'): nan,
Timestamp('1926-01-06 00:00:00'): nan},
'86041': {Timestamp('1925-12-31 00:00:00'): nan,
Timestamp('1926-01-02 00:00:00'): nan,
Timestamp('1926-01-04 00:00:00'): nan,
Timestamp('1926-01-05 00:00:00'): nan,
Timestamp('1926-01-06 00:00:00'): nan},
'12319': {Timestamp('1925-12-31 00:00:00'): nan,
Timestamp('1926-01-02 00:00:00'): nan,
Timestamp('1926-01-04 00:00:00'): nan,
Timestamp('1926-01-05 00:00:00'): nan,
Timestamp('1926-01-06 00:00:00'): nan},
'20511': {Timestamp('1925-12-31 00:00:00'): nan,
Timestamp('1926-01-02 00:00:00'): nan,
Timestamp('1926-01-04 00:00:00'): nan,
Timestamp('1926-01-05 00:00:00'): nan,
Timestamp('1926-01-06 00:00:00'): nan},
'12343': {Timestamp('1925-12-31 00:00:00'): nan,
Timestamp('1926-01-02 00:00:00'): nan,
Timestamp('1926-01-04 00:00:00'): nan,
Timestamp('1926-01-05 00:00:00'): nan,
Timestamp('1926-01-06 00:00:00'): nan},
'12345': {Timestamp('1925-12-31 00:00:00'): nan,
Timestamp('1926-01-02 00:00:00'): nan,
Timestamp('1926-01-04 00:00:00'): nan,
Timestamp('1926-01-05 00:00:00'): nan,
Timestamp('1926-01-06 00:00:00'): nan},
'20554': {Timestamp('1925-12-31 00:00:00'): nan,
Timestamp('1926-01-02 00:00:00'): nan,
Timestamp('1926-01-04 00:00:00'): nan,
Timestamp('1926-01-05 00:00:00'): nan,
Timestamp('1926-01-06 00:00:00'): nan},
'12369': {Timestamp('1925-12-31 00:00:00'): nan,
Timestamp('1926-01-02 00:00:00'): nan,
Timestamp('1926-01-04 00:00:00'): nan,
Timestamp('1926-01-05 00:00:00'): nan,
Timestamp('1926-01-06 00:00:00'): nan},
'20562': {Timestamp('1925-12-31 00:00:00'): nan,
Timestamp('1926-01-02 00:00:00'): nan,
Timestamp('1926-01-04 00:00:00'): nan,
Timestamp('1926-01-05 00:00:00'): nan,
Timestamp('1926-01-06 00:00:00'): nan},
'86102': {Timestamp('1925-12-31 00:00:00'): nan,
Timestamp('1926-01-02 00:00:00'): nan,
Timestamp('1926-01-04 00:00:00'): nan,
Timestamp('1926-01-05 00:00:00'): nan,
Timestamp('1926-01-06 00:00:00'): nan},
'20570': {Timestamp('1925-12-31 00:00:00'): nan,
Timestamp('1926-01-02 00:00:00'): nan,
Timestamp('1926-01-04 00:00:00'): nan,
Timestamp('1926-01-05 00:00:00'): nan,
Timestamp('1926-01-06 00:00:00'): nan},
'86111': {Timestamp('1925-12-31 00:00:00'): nan,
Timestamp('1926-01-02 00:00:00'): nan,
Timestamp('1926-01-04 00:00:00'): nan,
Timestamp('1926-01-05 00:00:00'): nan,
Timestamp('1926-01-06 00:00:00'): nan},
'12394': {Timestamp('1925-12-31 00:00:00'): 123.5,
Timestamp('1926-01-02 00:00:00'): 124.0,
Timestamp('1926-01-04 00:00:00'): 123.25,
Timestamp('1926-01-05 00:00:00'): 123.5,
Timestamp('1926-01-06 00:00:00'): 122.75},
'36978': {Timestamp('1925-12-31 00:00:00'): nan,
Timestamp('1926-01-02 00:00:00'): nan,
Timestamp('1926-01-04 00:00:00'): nan,
Timestamp('1926-01-05 00:00:00'): nan,
Timestamp('1926-01-06 00:00:00'): nan},
'86136': {Timestamp('1925-12-31 00:00:00'): nan,
Timestamp('1926-01-02 00:00:00'): nan,
Timestamp('1926-01-04 00:00:00'): nan,
Timestamp('1926-01-05 00:00:00'): nan,
Timestamp('1926-01-06 00:00:00'): nan},
'28804': {Timestamp('1925-12-31 00:00:00'): nan,
Timestamp('1926-01-02 00:00:00'): nan,
Timestamp('1926-01-04 00:00:00'): nan,
Timestamp('1926-01-05 00:00:00'): nan,
Timestamp('1926-01-06 00:00:00'): nan},
'86158': {Timestamp('1925-12-31 00:00:00'): nan,
Timestamp('1926-01-02 00:00:00'): nan,
Timestamp('1926-01-04 00:00:00'): nan,
Timestamp('1926-01-05 00:00:00'): nan,
Timestamp('1926-01-06 00:00:00'): nan},
'12431': {Timestamp('1925-12-31 00:00:00'): nan,
Timestamp('1926-01-02 00:00:00'): nan,
Timestamp('1926-01-04 00:00:00'): nan,
Timestamp('1926-01-05 00:00:00'): nan,
Timestamp('1926-01-06 00:00:00'): nan},
'61583': {Timestamp('1925-12-31 00:00:00'): nan,
Timestamp('1926-01-02 00:00:00'): nan,
Timestamp('1926-01-04 00:00:00'): nan,
Timestamp('1926-01-05 00:00:00'): nan,
Timestamp('1926-01-06 00:00:00'): nan},
'20626': {Timestamp('1925-12-31 00:00:00'): nan,
Timestamp('1926-01-02 00:00:00'): nan,
Timestamp('1926-01-04 00:00:00'): nan,
Timestamp('1926-01-05 00:00:00'): nan,
Timestamp('1926-01-06 00:00:00'): nan},
'77976': {Timestamp('1925-12-31 00:00:00'): nan,
Timestamp('1926-01-02 00:00:00'): nan,
Timestamp('1926-01-04 00:00:00'): nan,
Timestamp('1926-01-05 00:00:00'): nan,
Timestamp('1926-01-06 00:00:00'): nan},
'53401': {Timestamp('1925-12-31 00:00:00'): nan,
Timestamp('1926-01-02 00:00:00'): nan,
Timestamp('1926-01-04 00:00:00'): nan,
Timestamp('1926-01-05 00:00:00'): nan,
Timestamp('1926-01-06 00:00:00'): nan},
'86176': {Timestamp('1925-12-31 00:00:00'): nan,
Timestamp('1926-01-02 00:00:00'): nan,
Timestamp('1926-01-04 00:00:00'): nan,
Timestamp('1926-01-05 00:00:00'): nan,
Timestamp('1926-01-06 00:00:00'): nan},
'12449': {Timestamp('1925-12-31 00:00:00'): nan,
Timestamp('1926-01-02 00:00:00'): nan,
Timestamp('1926-01-04 00:00:00'): nan,
Timestamp('1926-01-05 00:00:00'): nan,
Timestamp('1926-01-06 00:00:00'): nan},
'69796': {Timestamp('1925-12-31 00:00:00'): nan,
Timestamp('1926-01-02 00:00:00'): nan,
Timestamp('1926-01-04 00:00:00'): nan,
Timestamp('1926-01-05 00:00:00'): nan,
Timestamp('1926-01-06 00:00:00'): nan},
'12456': {Timestamp('1925-12-31 00:00:00'): nan,
Timestamp('1926-01-02 00:00:00'): nan,
Timestamp('1926-01-04 00:00:00'): nan,
Timestamp('1926-01-05 00:00:00'): nan,
Timestamp('1926-01-06 00:00:00'): nan},
'45225': {Timestamp('1925-12-31 00:00:00'): nan,
Timestamp('1926-01-02 00:00:00'): nan,
Timestamp('1926-01-04 00:00:00'): nan,
Timestamp('1926-01-05 00:00:00'): nan,
Timestamp('1926-01-06 00:00:00'): nan},
'12458': {Timestamp('1925-12-31 00:00:00'): nan,
Timestamp('1926-01-02 00:00:00'): nan,
Timestamp('1926-01-04 00:00:00'): nan,
Timestamp('1926-01-05 00:00:00'): nan,
Timestamp('1926-01-06 00:00:00'): nan},
'20650': {Timestamp('1925-12-31 00:00:00'): nan,
Timestamp('1926-01-02 00:00:00'): nan,
Timestamp('1926-01-04 00:00:00'): nan,
Timestamp('1926-01-05 00:00:00'): nan,
Timestamp('1926-01-06 00:00:00'): nan},
'28847': {Timestamp('1925-12-31 00:00:00'): nan,
Timestamp('1926-01-02 00:00:00'): nan,
Timestamp('1926-01-04 00:00:00'): nan,
Timestamp('1926-01-05 00:00:00'): nan,
Timestamp('1926-01-06 00:00:00'): nan},
...}
Unfortunately I have run out of space in this post, but MA_a_csv.head().to_dict() produces the same result as above, except that it is all NaN with no data points.
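Perhaps use the chunksize parameter when reading the CSV. You will need to experiment to find the best size; a rule of thumb I have heard is to make each chunk about half the size of your available memory. Note that chunksize is a number of rows, and that read_csv then returns an iterator of DataFrames rather than a single DataFrame (rows_per_chunk below is a placeholder for whatever row count you settle on):
reader = pd.read_csv("your.csv", chunksize=rows_per_chunk)
When writing each chunk's results back to a file, make sure the append mode is set:
df.to_csv("yourresults.csv", mode='a')
Either delete the file each time you run the code, or make sure the first call to to_csv() is made in write mode (the default).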
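The chunked read and append-mode write described above might look like the following minimal sketch. The file names, the tiny demo frame, the chunk size of 4 rows, and the "multiply by 2" processing step are all made up for illustration; they are not from the question.

```python
import os
import tempfile

import numpy as np
import pandas as pd

# Hypothetical demo input: 10 rows of made-up numbers written to a temp file.
tmp_dir = tempfile.mkdtemp()
src = os.path.join(tmp_dir, "your.csv")
dst = os.path.join(tmp_dir, "yourresults.csv")
pd.DataFrame({"a": np.arange(10.0), "b": np.arange(10.0) * 2}).to_csv(src, index=False)

# chunksize is a number of rows; read_csv then yields one DataFrame per chunk.
for i, chunk in enumerate(pd.read_csv(src, chunksize=4)):
    processed = chunk * 2  # placeholder for the real per-chunk work
    # First chunk: write mode with a header. Later chunks: append, no header.
    processed.to_csv(dst, mode="w" if i == 0 else "a", header=(i == 0), index=False)

result = pd.read_csv(dst)
```

Writing the first chunk in write mode also takes care of stale output from an earlier run, and suppressing the header on later chunks keeps it from being repeated in the middle of the file.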
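Other options I would try:
1) Use a cloud resource such as AWS EC2: rent a high-spec, high-memory machine, transfer your data and code to it, and run your code there. It should be much faster.
2) I would consider something like PySpark to distribute the load across multiple machines, but if you are not already familiar with it, it may take some time to get up to speed.
Good luck!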
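Combining two short comments into one answer.
1) The statement
j > math.log(
MA_a_csv.loc[date, security]/
data.loc[date, security]) > -j
can be simplified slightly with abs, e.g. j > abs(...), and can be sped up by computing the logs once, separately, and using log(a/b) == log(a) - log(b).
Even if each cell's calculation is only needed once, you can compute it and write it back to speed up reruns.
2) If you have those print statements in your actual code, they will take up a significant portion of the total time.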
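Point 1 above can be sketched on whole frames at once. This is only an illustration: the two small frames are made-up stand-ins for data and MA_a_csv, and j = 0.01 is an assumed value for the tolerance (the question only says "within 1%").

```python
import numpy as np
import pandas as pd

j = 0.01  # assumed tolerance

# Made-up stand-ins for `data` and `MA_a_csv`.
data = pd.DataFrame({"x": [100.0, 50.0, np.nan], "y": [10.0, 20.0, 30.0]})
MA_a_csv = pd.DataFrame({"x": [100.5, 52.0, 1.0], "y": [9.95, np.nan, 30.2]})

# Take each log once; log_data could be kept and reused for every
# comparison frame instead of being recomputed on each pass.
log_data = np.log(data)
log_ma = np.log(MA_a_csv)

# j > log(a/b) > -j  is the same test as  |log(a) - log(b)| < j
within = (log_ma - log_data).abs() < j
```

Cells where either input is NaN come out as False, so no separate NaN check is needed.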
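I made my own sample data generator based on the example you provided. I think it matches what you have, but let me know if it does not. If the data matches, do not worry about the details of how I built it.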
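import numpy as np
import pandas as pd

rows = 6
cols = 5
np.random.seed(0)
# pd.date_range replaces the DatetimeIndex(start=..., periods=...) constructor,
# which newer pandas versions no longer accept
data = pd.DataFrame(np.random.rand(rows, cols) * 100,
                    index=pd.date_range(start='1928-12-31', periods=rows, freq='D'))
# blank out the leading rows of a few randomly chosen columns
nan_cols = len(data.columns) // 2
random_indices = zip(pd.Series(data.index.values[:-rows // 2])
                     .sample(nan_cols, random_state=1, replace=True),
                     pd.Series(data.columns).sample(nan_cols, random_state=2))
for row, col in random_indices:
    data.loc[:row, col] = np.nan
# comparison frame: each price perturbed by up to +/-2%
MA_a_csv = data * (1 + (np.random.rand(rows, cols) / 50
                        * np.random.choice([-1, 1], size=(rows, cols))))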
So data looks like
0 1 2 3 4
1928-12-31 54.881350 71.518937 NaN 54.488318 NaN
1929-01-01 64.589411 43.758721 NaN 96.366276 38.344152
1929-01-02 79.172504 52.889492 56.804456 92.559664 7.103606
1929-01-03 8.712930 2.021840 83.261985 77.815675 87.001215
1929-01-04 97.861834 79.915856 46.147936 78.052918 11.827443
1929-01-05 63.992102 14.335329 94.466892 52.184832 41.466194
and MA_a_csv looks like
0 1 2 3 4
1928-12-31 55.171734 72.626384 NaN 55.107778 NaN
1929-01-01 63.791557 44.294412 NaN 98.185186 38.867028
1929-01-02 78.603241 53.351780 57.597027 92.448175 7.008877
1929-01-03 8.829794 2.013333 83.047291 77.324770 86.368349
1929-01-04 98.977844 80.616881 45.235708 77.893620 11.876852
1929-01-05 63.785651 14.522579 94.945445 52.671519 41.668902
I ran these through something resembling your gen_a, then wrote a vectorized version that gives the same answer:
logs = np.log(MA_a_csv / data)
ans = ((j > logs) & (logs > -j)).replace({True: 1, False: 0})
where ans is
0 1 2 3 4
1928-12-31 1 0 0 0 0
1929-01-01 0 0 0 0 0
1929-01-02 1 1 0 1 0
1929-01-03 0 1 1 1 1
1929-01-04 0 1 0 1 1
1929-01-05 1 0 1 1 1
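np.log can operate on an entire array at once, and pandas is probably also doing something clever to vectorize the greater-than comparisons. & is the bitwise AND, applied element-wise here, so it just checks whether both conditions are true at each position.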
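One reason the vectorized version needs no explicit NaN skip is that element-wise comparisons against NaN evaluate to False, so those cells simply end up as 0. A small sketch with made-up log values and an assumed j = 0.01:

```python
import numpy as np
import pandas as pd

j = 0.01  # assumed tolerance
logs = pd.DataFrame({"a": [0.005, np.nan, -0.02], "b": [-0.003, 0.5, np.nan]})

upper = j > logs      # element-wise; NaN compares as False
lower = logs > -j     # element-wise; NaN compares as False
both = upper & lower  # element-wise AND of the two boolean frames
```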
This is about 180 times faster than my version of gen_a, which has no try/except or print statements, so it should be an even bigger improvement over your code.
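To check the equivalence and measure the speedup on your own machine, a comparison along these lines could be used. It is a sketch under assumptions: gen_a_loop is my simplified reconstruction of the question's loop (without try/except or prints), the random frames are made up, and j = 0.01 is an assumed tolerance. The measured ratio will depend on frame size and hardware.

```python
import math
import time

import numpy as np
import pandas as pd

j = 0.01  # assumed tolerance
rng = np.random.default_rng(0)
data = pd.DataFrame(rng.random((100, 20)) * 100 + 1)
data[data > 90] = np.nan  # sprinkle in some missing prices
MA_a_csv = data * (1 + rng.uniform(-0.02, 0.02, size=data.shape))

def gen_a_loop(data, ma, j):
    """Cell-by-cell version, modelled on the question's gen_a."""
    out = pd.DataFrame(0, index=data.index, columns=data.columns)
    for date in data.index:
        for security in data.columns:
            price = data.loc[date, security]
            if math.isnan(price):
                continue  # skip missing prices
            if j > math.log(ma.loc[date, security] / price) > -j:
                out.loc[date, security] = 1
    return out

def gen_a_vectorized(data, ma, j):
    """Whole-frame version of the same test."""
    logs = np.log(ma / data)
    return ((j > logs) & (logs > -j)).astype(int)

t0 = time.perf_counter()
slow = gen_a_loop(data, MA_a_csv, j)
t1 = time.perf_counter()
fast = gen_a_vectorized(data, MA_a_csv, j)
t2 = time.perf_counter()
print(f"loop: {t1 - t0:.4f}s, vectorized: {t2 - t1:.4f}s")
```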
You also do not need the .replace({True: 1, False: 0}) part: in Python, 1 == True and 0 == False, so you should be able to use them interchangeably.
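As a quick illustration of that interchangeability (the mask below is made up): a boolean frame can be summed directly, and astype(int) gives an explicit 0/1 frame if integers are required downstream.

```python
import pandas as pd

mask = pd.DataFrame({"a": [True, False, True], "b": [False, False, True]})

hits_per_column = mask.sum()  # True counts as 1 when summed
as_ints = mask.astype(int)    # explicit 0/1 frame, if integers are required
```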
Let me know if you have any problems with this. For further reading, I recommend Tom Augspurger's Modern Pandas series; the Fast Pandas section is particularly applicable.