Pandas correlation error - decimal and float type mismatch
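This question has already been raised here, but it has not been answered. I am providing more detail in this thread in the hope of getting the juices flowing.
I have a pandas DataFrame, master_frame, containing time series data: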
SUBMIT_DATE CRUX_VOL CRUX_RATE
0 2016-02-01 76.38733173161 0.02832710529
1 2016-01-31 76.68984699154 0.02720243998
2 2016-01-30 75.59094829615 0.02720243998
3 2016-01-29 75.91758975956 0.02720243998
4 2016-01-28 76.31809997200 0.02671927211
... ... ... ...
I want the correlation between the CRUX_VOL and CRUX_RATE columns. Both hold values of type Decimal:
In [3]: print type(master_frame["CRUX_VOL"][0]), type(master_frame["CRUX_RATE"][0])
Out[3]: <class 'decimal.Decimal'> <class 'decimal.Decimal'>
When I use the corr function, I get a nasty error related to the input types.
print master_frame['CRUX_VOL'].corr(master_frame['CRUX_RATE'])
Traceback (most recent call last):
File "U:/Programming/VolPathReport/VolPath.py", line 52, in <module>
print master_frame['CRUX_VOL'].corr(master_frame['CRUX_RATE'])
File "C:\Anaconda2\lib\site-packages\pandas\core\series.py", line 1312, in corr
min_periods=min_periods)
File "C:\Anaconda2\lib\site-packages\pandas\core\nanops.py", line 47, in _f
return f(*args, **kwargs)
File "C:\Anaconda2\lib\site-packages\pandas\core\nanops.py", line 644, in nancorr
return f(a, b)
File "C:\Anaconda2\lib\site-packages\pandas\core\nanops.py", line 652, in _pearson
return np.corrcoef(a, b)[0, 1]
File "C:\Anaconda2\lib\site-packages\numpy\lib\function_base.py", line 2145, in corrcoef
c = cov(x, y, rowvar)
File "C:\Anaconda2\lib\site-packages\numpy\lib\function_base.py", line 2065, in cov
avg, w_sum = average(X, axis=1, weights=w, returned=True)
File "C:\Anaconda2\lib\site-packages\numpy\lib\function_base.py", line 599, in average
scl = np.multiply(avg, 0) + scl
TypeError: unsupported operand type(s) for +: 'Decimal' and 'float'
I have been messing around with the types and cannot get it to work. Help me, wizards of the internet!
The last line of the traceback points to
np.multiply(avg, 0) + scl
as the cause of
TypeError: unsupported operand type(s) for +: 'Decimal' and 'float'
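The same failure can be reproduced without pandas or numpy. The following is a minimal sketch, written for Python 3 rather than the Python 2 used in the question, and the variable names are illustrative only. It shows that adding a float to a Decimal raises a TypeError, while converting the Decimal to float first works:

```python
from decimal import Decimal

# One value taken from the CRUX_VOL column in the question.
vol = Decimal("76.38733173161")

try:
    vol + 1.0          # Decimal + float is not supported by the decimal module
    mixed_ok = True
except TypeError as exc:
    mixed_ok = False
    message = str(exc)  # "unsupported operand type(s) for +: ..."

# Converting to float first makes the addition well defined.
as_float = float(vol) + 1.0
```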
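I don't think numpy has a Decimal type, so the columns are stored as object arrays of Decimal values. As the error message shows, the arithmetic on that line ends up adding a Decimal to a float, and Decimal does not cooperate with float when using the + operator. Since pandas relies on numpy, it is best to convert the columns to float dtype before calling corr, using
master_frame.loc[:, ['CRUX_VOL', 'CRUX_RATE']].astype(float)
or
master_frame.convert_objects(convert_numeric=True)
Both calls return a new DataFrame rather than modifying master_frame in place, so assign the result before computing the correlation. Note that convert_objects was deprecated in later pandas releases and removed in pandas 1.0, so astype(float) is the more durable choice.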
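Putting the conversion and the correlation together might look like the sketch below. It assumes Python 3 and a reasonably recent pandas, and it rebuilds only the five rows shown in the question, so the resulting number is not the correlation of the full dataset. The names numeric and correlation are illustrative.

```python
from decimal import Decimal

import pandas as pd

# Rebuild the five sample rows from the question with Decimal values,
# which pandas stores in object-dtype columns.
master_frame = pd.DataFrame({
    "SUBMIT_DATE": pd.to_datetime([
        "2016-02-01", "2016-01-31", "2016-01-30", "2016-01-29", "2016-01-28",
    ]),
    "CRUX_VOL": [
        Decimal("76.38733173161"), Decimal("76.68984699154"),
        Decimal("75.59094829615"), Decimal("75.91758975956"),
        Decimal("76.31809997200"),
    ],
    "CRUX_RATE": [
        Decimal("0.02832710529"), Decimal("0.02720243998"),
        Decimal("0.02720243998"), Decimal("0.02720243998"),
        Decimal("0.02671927211"),
    ],
})

# astype returns a new DataFrame, so keep the result.
numeric = master_frame[["CRUX_VOL", "CRUX_RATE"]].astype(float)

# With float64 columns, corr no longer mixes Decimal and float.
correlation = numeric["CRUX_VOL"].corr(numeric["CRUX_RATE"])
print(correlation)
```

Converting to float gives up the exact decimal representation, which is usually acceptable for a statistic such as a correlation coefficient.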