python pandas 空数据框
python pandas empty Dataframe
短脚本需要这 2 个文件:
问题是获取 Date 的索引。我的索引总是空的
import pandas as pd
import numpy as np
import datetime as dt
import matplotlib.pyplot as plt
import pandas_datareader.data as web
import datetime
paper = pd.read_csv('PreisMoatStanley2013.dat',
delimiter = ' ',
parse_dates=[0,1,100,101])
data = pd.DataFrame({'GoogleWE': paper['Google End Date'],
'debt': paper['debt'].astype(np.float64),
'DJIADate': paper['DJIA Date'],
'DJIAClose': paper['DJIA Closing Price']
.astype(np.float64)})
data.set_index('DJIADate')
print paper [:5]
import quandl
#api_key = open('quandl_key.txt','r').read()
djia = pd.read_csv("djia.csv", index_col=0)
print djia[:3]
djia_closes = djia['Close'].reset_index()
djia_closes.set_index('Date')
print djia_closes[:3]
#s152
data = pd.merge(data, djia_closes,
left_on='DJIADate', right_on='Date')
data.drop(['DJIADate'], inplace=True, axis=1)
data = data.set_index('Date') # Problem seems to be here
print data[:3]
Shell 留言:
Empty DataFrame
Columns: [DJIAClose, GoogleWE, debt, Close]
Index: []
它应该是什么样子:
Date /DJIAClose /GoogleWE /debt /Close
2004-01-12 10485.18 2004-01-10 0.21 10485.2
2004-01-20 10528.66 2004-01-17 0.21 10528.7
2004-01-26 10702.51 2004-01-24 0.21 10702.5
来自data
的数据:
DJIAClose DJIADate GoogleWE debt
0 10485.18 2004-01-12 2004-01-10 0.210000
1 10528.66 2004-01-20 2004-01-17 0.210000
2 10702.51 2004-01-26 2004-01-24 0.210000
3 10499.18 2004-02-02 2004-01-31 0.213333
4 10579.03 2004-02-09 2004-02-07 0.200000
5 10714.88 2004-02-17 2004-02-14 0.203333
6 10609.62 2004-02-23 2004-02-21 0.200000
7 10678.14 2004-03-01 2004-02-28 0.200000
8 10529.48 2004-03-08 2004-03-06 0.196667
数据来自djia_closes
Date Close
2004-01-02 10409.9
2004-01-08 10592.4
2004-01-09 10458.9
2004-01-12 10485.2
2004-01-13 10427.2
2004-01-14 10538.4
2004-01-20 10528.7
2004-01-26 10702.5
data.head()
DJIAClose DJIADate GoogleWE debt
0 10485.18 2004-01-12 2004-01-10 0.210000
1 10528.66 2004-01-20 2004-01-17 0.210000
2 10702.51 2004-01-26 2004-01-24 0.210000
3 10499.18 2004-02-02 2004-01-31 0.213333
4 10579.03 2004-02-09 2004-02-07 0.200000
djia_closes.head()
Date Close
0 2004-01-02 10409.9
1 2004-01-05 10544.1
2 2004-01-06 10538.7
3 2004-01-07 10529.0
4 2004-01-08 10592.4
您在读取 paper
csv 时使用 parse_dates
选项,但在读取 djia
csv 时不使用。这可能会导致您的问题 - "date" 列 data
是一个日期时间对象,但在 djia_close
中,日期只是一个字符串。所以没有任何匹配项,这就是为什么在合并它们时得到空数据框的原因。
尝试从 read_csv
中删除 parse_dates
看看会发生什么,或者尝试在合并之前将两个日期列都转换为日期时间对象。
短脚本需要这 2 个文件:
问题是获取 Date 的索引。我的索引总是空的
import pandas as pd
import numpy as np
import datetime as dt
import matplotlib.pyplot as plt
import pandas_datareader.data as web
import datetime
paper = pd.read_csv('PreisMoatStanley2013.dat',
delimiter = ' ',
parse_dates=[0,1,100,101])
data = pd.DataFrame({'GoogleWE': paper['Google End Date'],
'debt': paper['debt'].astype(np.float64),
'DJIADate': paper['DJIA Date'],
'DJIAClose': paper['DJIA Closing Price']
.astype(np.float64)})
data.set_index('DJIADate')
print paper [:5]
import quandl
#api_key = open('quandl_key.txt','r').read()
djia = pd.read_csv("djia.csv", index_col=0)
print djia[:3]
djia_closes = djia['Close'].reset_index()
djia_closes.set_index('Date')
print djia_closes[:3]
#s152
data = pd.merge(data, djia_closes,
left_on='DJIADate', right_on='Date')
data.drop(['DJIADate'], inplace=True, axis=1)
data = data.set_index('Date') # Problem seems to be here
print data[:3]
Shell 留言:
Empty DataFrame
Columns: [DJIAClose, GoogleWE, debt, Close]
Index: []
它应该是什么样子:
Date /DJIAClose /GoogleWE /debt /Close
2004-01-12 10485.18 2004-01-10 0.21 10485.2
2004-01-20 10528.66 2004-01-17 0.21 10528.7
2004-01-26 10702.51 2004-01-24 0.21 10702.5
来自data
的数据:
DJIAClose DJIADate GoogleWE debt
0 10485.18 2004-01-12 2004-01-10 0.210000
1 10528.66 2004-01-20 2004-01-17 0.210000
2 10702.51 2004-01-26 2004-01-24 0.210000
3 10499.18 2004-02-02 2004-01-31 0.213333
4 10579.03 2004-02-09 2004-02-07 0.200000
5 10714.88 2004-02-17 2004-02-14 0.203333
6 10609.62 2004-02-23 2004-02-21 0.200000
7 10678.14 2004-03-01 2004-02-28 0.200000
8 10529.48 2004-03-08 2004-03-06 0.196667
数据来自djia_closes
Date Close
2004-01-02 10409.9
2004-01-08 10592.4
2004-01-09 10458.9
2004-01-12 10485.2
2004-01-13 10427.2
2004-01-14 10538.4
2004-01-20 10528.7
2004-01-26 10702.5
data.head()
DJIAClose DJIADate GoogleWE debt
0 10485.18 2004-01-12 2004-01-10 0.210000
1 10528.66 2004-01-20 2004-01-17 0.210000
2 10702.51 2004-01-26 2004-01-24 0.210000
3 10499.18 2004-02-02 2004-01-31 0.213333
4 10579.03 2004-02-09 2004-02-07 0.200000
djia_closes.head()
Date Close
0 2004-01-02 10409.9
1 2004-01-05 10544.1
2 2004-01-06 10538.7
3 2004-01-07 10529.0
4 2004-01-08 10592.4
您在读取 paper
csv 时使用 parse_dates
选项,但在读取 djia
csv 时不使用。这可能会导致您的问题 - "date" 列 data
是一个日期时间对象,但在 djia_close
中,日期只是一个字符串。所以没有任何匹配项,这就是为什么在合并它们时得到空数据框的原因。
尝试从 read_csv
中删除 parse_dates
看看会发生什么,或者尝试在合并之前将两个日期列都转换为日期时间对象。