How do I join two series into a single dataframe when one is indexed with numbers and the other with dates?
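I'm having trouble merging my Y_test data with my predicted_tuned data. I have tried every example I could find, but when I set the index to the datetime I still get NaN wherever the indexes do not match, which is a big problem. You can see this in the attempt below, which is one of many. df and df2 contain as many dates as numbers; I simply assigned df = Y_test.
I also tried setting the index to the datetime, but the numbers still do not line up with the dates the way I want.
To reiterate, I am essentially trying to align the two series side by side with a datetime index, but when I do, I get a bunch of NaN values. Thanks in advance for considering helping me with this!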
pd.concat([df, df2])
179 0.002
180 0.003
181 0.005
182 0.006
183 0.01
...
2021-03-18 00:00:00 0.007
2021-03-25 00:00:00 0.042
2021-04-01 00:00:00 0.054
2021-04-12 00:00:00 0.011
date 179 2.037e-03
180 3.190e-03
181 4.505...
Length: 91, dtype: object
You can rename the columns by setting the .columns attribute. Then stack the two frames side by side by passing axis=1 to concat(), and finally set the index to date:
df.columns = ['date', 'predicted']
df2.columns = ['date', 'actual']
pd.concat([df, df2], axis=1).set_index('date')
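The NaN values in the question most likely appear because concat() aligns rows by index label, and a date label never equals a number label. A minimal sketch of aligning by position instead, assuming both inputs are Series of equal length in the same row order (the names actual and predicted and the three sample rows are hypothetical stand-ins for Y_test and predicted_tuned):

```python
import pandas as pd

# Hypothetical stand-ins: one Series indexed by dates, one by numbers.
actual = pd.Series(
    [0.0242, -0.0169, 0.0187],
    index=pd.to_datetime(["2020-04-30", "2020-05-07", "2020-05-14"]),
)
predicted = pd.Series([2.037e-03, 3.190e-03, 4.505e-03], index=[179, 180, 181])

# Label-based alignment: no label is shared, so every row has one NaN.
misaligned = pd.concat([actual, predicted], axis=1)

# Position-based alignment: drop both indexes by taking the raw values,
# then reuse the date index for the combined frame.
combined = pd.DataFrame(
    {"actual": actual.to_numpy(), "predicted": predicted.to_numpy()},
    index=actual.index,
)
combined.index.name = "date"
print(combined)
```

This only gives a meaningful result if row i of one series really corresponds to row i of the other, so it is worth checking that both have the same length before combining them.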
What finally worked for me was to first rename the columns (including 'date') by exporting to CSV and then re-importing with the following code:
df2.to_csv('yo.csv')
colnames=['date', 'actual']
user1 = pd.read_csv('yo.csv', names=colnames, header=None)
df.to_csv('yo1.csv')
colnames=['date', 'predicted']
user2 = pd.read_csv('yo1.csv', names=colnames, header=None)
pd.concat([user1, user2], axis=1).set_index('date')
actual predicted
date
(nan, nan) MSFT_pred 0.000e+00
(2020-04-30 00:00:00, 179.0) 0.024201106326536603 2.037e-03
(2020-05-07 00:00:00, 180.0) -0.01686254903583162 3.190e-03
(2020-05-14 00:00:00, 181.0) 0.018717373876389054 4.505e-03
(2020-05-21 00:00:00, 182.0) -0.000981754619259867 5.655e-03
(2020-05-29 00:00:00, 183.0) 0.02132616076987759 1.038e-02
(2020-06-08 00:00:00, 184.0) 0.0030745362797475195 1.840e-02
(2020-06-15 00:00:00, 185.0) 0.059733833525184465 -8.471e-03
(2020-06-22 00:00:00, 186.0) -0.010676658312346099 1.963e-03
(2020-06-29 00:00:00, 187.0) 0.04825255850145016 1.271e-02
(2020-07-07 00:00:00, 188.0) 0.00048009595166487173 -3.963e-03
(2020-07-15 00:00:00, 189.0) 0.017675967314019658 1.315e-02
(2020-07-22 00:00:00, 190.0) -0.03699223319804901 7.459e-03
(2020-07-29 00:00:00, 191.0) 0.0425963854255107 6.393e-04
(2020-08-05 00:00:00, 192.0) -0.017767412527132542 8.299e-03
(2020-08-12 00:00:00, 193.0) 0.004849289374926791 1.229e-02
(2020-08-19 00:00:00, 194.0) 0.053163269514577394 -7.205e-04
(2020-08-26 00:00:00, 195.0) 0.04638640608165456 -2.941e-03
(2020-09-02 00:00:00, 196.0) -0.12041441937020192 1.215e-03
(2020-09-10 00:00:00, 197.0) -0.012050617841010691 1.572e-02
(2020-09-18 00:00:00, 198.0) 0.03640692855683092 1.282e-02
(2020-09-25 00:00:00, 199.0) -0.007874252996166398 1.493e-03
(2020-10-02 00:00:00, 200.0) 0.04560030760287681 6.036e-03
(2020-10-09 00:00:00, 201.0) 0.017682541657954687 6.680e-03
(2020-10-20 00:00:00, 202.0) -0.006543498136577064 3.152e-03
(2020-10-27 00:00:00, 203.0) -0.03250388362265788 -7.606e-03
(2020-11-03 00:00:00, 204.0) 0.021944140659009292 1.106e-02
(2020-11-10 00:00:00, 205.0) 0.016217814956357657 1.540e-02
(2020-11-19 00:00:00, 206.0) 0.013141777478138827 8.397e-03
(2020-11-30 00:00:00, 207.0) 0.0010271171517945987 9.058e-03
(2020-12-09 00:00:00, 208.0) 0.0347070301815533 1.084e-02
(2020-12-16 00:00:00, 209.0) 0.00790377030467937 3.130e-03
(2020-12-23 00:00:00, 210.0) 0.006314251899552481 6.853e-03
(2021-01-05 00:00:00, 211.0) -0.013723872690842853 7.528e-03
(2021-01-13 00:00:00, 212.0) 0.039115811401939204 1.702e-03
(2021-01-22 00:00:00, 213.0) 0.02625125481157209 -1.252e-02
(2021-02-02 00:00:00, 214.0) 0.01763006225325281 3.198e-03
(2021-02-09 00:00:00, 215.0) 0.0040628812983873885 6.399e-03
(2021-02-17 00:00:00, 216.0) -0.04031875139405816 4.501e-03
(2021-02-25 00:00:00, 217.0) -0.009918495072427369 1.617e-02
(2021-03-04 00:00:00, 218.0) 0.044848671154583464 7.920e-03
(2021-03-11 00:00:00, 219.0) -0.027403675161880692 1.280e-02
(2021-03-18 00:00:00, 220.0) 0.006996940936046414 1.904e-02
(2021-03-25 00:00:00, 221.0) 0.04218118715262609 9.114e-03
(2021-04-01 00:00:00, 222.0) 0.05420837163083725 4.867e-04
(2021-04-12 00:00:00, 223.0) 0.010997824269626477 2.224e-03
(date, nan) 179 2.037e-03\n180 3.190e-03\n181 4.5... NaN
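The tuple-valued index in the output above seems to come from both frames carrying a column named 'date', and the stray first and last rows look like header lines read back in as data. A rough in-memory sketch of the same idea without the CSV round trip, assuming the inputs are Series of equal length in the same order (actual and predicted are hypothetical stand-ins with made-up sample rows):

```python
import pandas as pd

# Hypothetical stand-ins for the date-indexed and number-indexed Series.
actual = pd.Series(
    [0.0242, -0.0169, 0.0187],
    index=pd.to_datetime(["2020-04-30", "2020-05-07", "2020-05-14"]),
)
predicted = pd.Series([2.037e-03, 3.190e-03, 4.505e-03], index=[179, 180, 181])

# Turn the date index into a 'date' column next to the values.
user1 = actual.rename_axis("date").reset_index(name="actual")

# Discard the numeric index entirely so only one 'date' column exists.
user2 = predicted.reset_index(drop=True).rename("predicted")

# Both pieces now share a 0..n-1 index, so concat lines them up row by row.
result = pd.concat([user1, user2], axis=1).set_index("date")
print(result)
```

Keeping a single 'date' column is the design choice here: set_index('date') then yields a plain datetime index rather than pairs of values.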