Pandas apply function to groups of columns and indexing

Question

Given a DataFrame df made up of several columns:

Col1   Col2   Col3   Col4   Col5   Col6
   4      2      5      3      4      1
   8      3      9      7      4      5
   1      3      6      7      4      7

I want to apply a function func to a group of columns:

df.apply(lambda x: func(x[['Col1', 'Col2', 'Col3']]), axis=1)

This works as expected. However, using

df.apply(lambda x: func(x.iloc[:,0:3]), axis=1)

I get the following error:

IndexingError: ('Too many indexers', u'occurred at index 0')

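Because I want to loop over groups of three columns and apply the function to each group automatically, I would prefer to index with pandas iloc or ix.

Can someone explain this error?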

Answer 1

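You need to remove the leading `:,` from iloc, because with axis=1 apply passes each row to the function as a Series, not a DataFrame. A Series is one-dimensional, so it accepts only a single indexer: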

print (df.apply(lambda x: func(x.iloc[0:3]), axis=1))
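
To make the Series point concrete, here is a minimal sketch that rebuilds the df from the question and inspects what apply hands to the function. The names row_types, first_row, error_message and one_dim are introduced only for this illustration.

```python
import pandas as pd

# Rebuild the example frame from the question.
df = pd.DataFrame({
    "Col1": [4, 8, 1], "Col2": [2, 3, 3], "Col3": [5, 9, 6],
    "Col4": [3, 7, 7], "Col5": [4, 4, 4], "Col6": [1, 5, 7],
})

# With axis=1, every row reaches the function as a one-dimensional Series.
row_types = df.apply(lambda x: type(x).__name__, axis=1)

# The same kind of object can be obtained directly for the first row.
first_row = df.iloc[0]

# Two indexers on a one-dimensional object fail.
try:
    first_row.iloc[:, 0:3]
    error_message = None
except Exception as exc:
    error_message = str(exc)

# One indexer selects the first three values of the row.
one_dim = first_row.iloc[0:3]

print(row_types.tolist())
print(error_message)
print(one_dim.tolist())
```

The two-indexer form is only valid on the DataFrame itself, for example df.iloc[:, 0:3], where the first slice selects rows and the second selects columns.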

Test:

def func(x):
    return x.sum()

print (df.apply(lambda x: func(x[['Col1', 'Col2', 'Col3']]), axis=1))
0    11
1    20
2    10
dtype: int64

print (df.apply(lambda x: func(x.iloc[0:3]), axis=1))
0    11
1    20
2    10
dtype: int64

You can also check this with print (print returns nothing, so every value in the result is None):

print (df.apply(lambda x: print(x.iloc[0:3]), axis=1))
Col1    4
Col2    2
Col3    5
Name: 0, dtype: int64
Col1    8
Col2    3
Col3    9
Name: 1, dtype: int64
Col1    1
Col2    3
Col3    6
Name: 2, dtype: int64
0    None
1    None
2    None
dtype: object
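
The question mentions looping over groups of three columns. One possible way to do that with the single-indexer form is sketched below. It assumes the func from the test above; group_size, results, group_sums and vectorized are hypothetical names used only here.

```python
import pandas as pd

df = pd.DataFrame({
    "Col1": [4, 8, 1], "Col2": [2, 3, 3], "Col3": [5, 9, 6],
    "Col4": [3, 7, 7], "Col5": [4, 4, 4], "Col6": [1, 5, 7],
})

def func(x):
    return x.sum()

group_size = 3  # assumed group width, taken from the question's wording
results = {}
for start in range(0, df.shape[1], group_size):
    stop = start + group_size
    # Label each result with the column names it covers.
    label = "-".join(df.columns[start:stop])
    # Bind start/stop as defaults so each lambda keeps its own slice bounds.
    results[label] = df.apply(
        lambda x, a=start, b=stop: func(x.iloc[a:b]), axis=1
    )

group_sums = pd.DataFrame(results)
print(group_sums)

# When func is a simple reduction, slicing the DataFrame itself avoids apply.
vectorized = df.iloc[:, 0:3].sum(axis=1)
print(vectorized.tolist())
```

For a reduction such as sum, the last form is usually the simpler choice, since the column slice is taken once on the DataFrame rather than once per row.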
