Adding processed pandas DataFrames together
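I am trying to add two DataFrames in Python, after first setting their index to one of the existing columns.
However, using the top-rated approach from the following thread raises an error:
(see: Adding two pandas dataframes)
Here is a minimal example of the problem: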
import pandas as pd
import numpy as np
a = np.array([['A',1.,2.,3.],['B',1.,2.,3.],['C',1.,2.,3.]])
a = pd.DataFrame(a)
a = a.set_index(0)
a
     1    2    3
0
A  1.0  2.0  3.0
B  1.0  2.0  3.0
C  1.0  2.0  3.0
b = np.array([['A',1.,2.,3.],['B',1.,2.,3.]])
b = pd.DataFrame(b)
b = b.set_index(0)  # set_index returns a new DataFrame, so assign the result
b
     1    2    3
0
A  1.0  2.0  3.0
B  1.0  2.0  3.0
df_add = a.add(b,fill_value=1)
Error:
Traceback (most recent call last):
  File "<ipython-input-150-885d92411f6c>", line 1, in <module>
    df_add = a.add(b,fill_value=1)
  File "/home/anaconda3/lib/python3.6/site-packages/pandas/core/ops.py", line 1234, in f
    return self._combine_frame(other, na_op, fill_value, level)
  File "/home/anaconda3/lib/python3.6/site-packages/pandas/core/frame.py", line 3490, in _combine_frame
    result = _arith_op(this.values, other.values)
  File "/home/anaconda3/lib/python3.6/site-packages/pandas/core/frame.py", line 3459, in _arith_op
    return func(left, right)
  File "/home/anaconda3/lib/python3.6/site-packages/pandas/core/ops.py", line 1195, in na_op
    result[mask] = op(xrav, yrav)
TypeError: must be str, not int
Any help on how to prevent this would be much appreciated.
The problem is in how the DataFrames are defined: a 2D numpy array holds a single dtype, so all of the data is converted to strings:
a = np.array([['A',1.,2.,3.],['B',1.,2.,3.],['C',1.,2.,3.]])
print(a)
[['A' '1.0' '2.0' '3.0']
 ['B' '1.0' '2.0' '3.0']
 ['C' '1.0' '2.0' '3.0']]
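To confirm this on your own data, it may help to inspect the dtypes directly. The following is a minimal sketch rather than part of the original code; the names `mixed`, `df` and `all_text` are made up for illustration:

```python
import numpy as np
import pandas as pd

# A NumPy array holds a single dtype, so mixing 'A' with floats
# promotes every element to a string.
mixed = np.array([['A', 1., 2., 3.], ['B', 1., 2., 3.]])
print(mixed.dtype.kind)  # 'U' indicates a Unicode string dtype

# The string values carry over into the DataFrame columns.
df = pd.DataFrame(mixed).set_index(0)
print(df.dtypes)

# True when none of the remaining columns is numeric, which is why
# adding an integer fill_value fails.
all_text = not any(pd.api.types.is_numeric_dtype(t) for t in df.dtypes)
print(all_text)
```

If `all_text` is True, the columns need to be converted before any arithmetic.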
One solution is to remove the string values from the array and specify the index with a list:
a = np.array([[1.,2.,3.],[1.,2.,3.],[1.,2.,3.]])
a = pd.DataFrame(a, index=list('ABC'))
b = np.array([[1.,2.,3.],[1.,2.,3.]])
b = pd.DataFrame(b, index=list('AB'))
df_add = a.add(b,fill_value=1)
print(df_add)
     0    1    2
A  2.0  4.0  6.0
B  2.0  4.0  6.0
C  2.0  3.0  4.0
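Row C in that result comes from `fill_value`: `b` has no row C, so it is treated as 1 before the addition. As a rough illustration (the names `plain` and `filled` are hypothetical), compare it with the plain `+` operator, which leaves unmatched rows as NaN:

```python
import numpy as np
import pandas as pd

a = pd.DataFrame(np.array([[1., 2., 3.]] * 3), index=list('ABC'))
b = pd.DataFrame(np.array([[1., 2., 3.]] * 2), index=list('AB'))

# Plain + aligns on the index; row 'C' has no partner in b, so it becomes NaN.
plain = a + b

# With fill_value=1, the missing row 'C' of b is treated as all ones.
filled = a.add(b, fill_value=1)

print(plain)
print(filled)
```

Note that `fill_value` only applies where one side is missing; if a value is missing in both frames, the result stays missing.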
Alternatively, convert the DataFrames to floats after setting the index:
a = np.array([['A',1.,2.,3.],['B',1.,2.,3.],['C',1.,2.,3.]])
a = pd.DataFrame(a)
a = a.set_index(0).astype(float)
b = np.array([['A',1.,2.,3.],['B',1.,2.,3.]])
b = pd.DataFrame(b)
b = b.set_index(0).astype(float)
df_add = a.add(b,fill_value=1)
print(df_add)
     1    2    3
0
A  2.0  4.0  6.0
B  2.0  4.0  6.0
C  2.0  3.0  4.0
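A third option, not part of the code above, is to skip the numpy array and pass the nested list straight to `pd.DataFrame`. pandas infers a dtype per column, so the label column and the float columns stay separate. A minimal sketch, with `rows_a` and `rows_b` as made-up names:

```python
import pandas as pd

rows_a = [['A', 1., 2., 3.], ['B', 1., 2., 3.], ['C', 1., 2., 3.]]
rows_b = [['A', 1., 2., 3.], ['B', 1., 2., 3.]]

# Built from a list of rows, columns 1-3 keep a float dtype,
# so no astype(float) is needed after set_index.
a = pd.DataFrame(rows_a).set_index(0)
b = pd.DataFrame(rows_b).set_index(0)

df_add = a.add(b, fill_value=1)
print(a.dtypes)
print(df_add)
```

This only helps when you control how the data is built; if it already arrives as a string array, the `astype(float)` approach is the one to use.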