Use pandas DataFrame / Panel .apply() together with index/column information

Question

I have a pandas DataFrame that looks like this:

       2016      2017      2018      2019      2020
1  0.014199  0.020776  0.016393  0.010112  0.013346
2  0.025220  0.024088  0.035357  0.026878  0.031841
3  0.016345  0.014117  0.017157  0.019280  0.017307
4  0.021467  0.020389  0.027269  0.027727  0.025750
5  0.012459  0.004377  0.015435  0.023725  0.031228

And a function that looks like this:

def f(a, b):
    return a + b

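I am looking for a fast way (i.e. one that avoids loops) to compute f for every element of the DataFrame, where a is the entry and b is its column name (or its index label, if that works too).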

The output would look like this:

    2016             2017
1   2016.014199      2017.020776 ...
2   2016.025220      2017.024088 ...

I have been experimenting with .apply(), but have not figured out how to make it work. Do you have any suggestions?

KR, Richard

Answer 1

Try this:

In [138]: df.apply(lambda x: int(x.name) + x)
Out[138]:
          2016         2017         2018         2019         2020
1  2016.014199  2017.020776  2018.016393  2019.010112  2020.013346
2  2016.025220  2017.024088  2018.035357  2019.026878  2020.031841
3  2016.016345  2017.014117  2018.017157  2019.019280  2020.017307
4  2016.021467  2017.020389  2018.027269  2019.027727  2020.025750
5  2016.012459  2017.004377  2018.015435  2019.023725  2020.031228
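
This works because apply() passes each column to the lambda as a Series whose .name attribute is the column label. The same idea carries over to a general f and to the row index. The following is only a minimal sketch: it assumes the column labels are strings such as "2016", and it rebuilds a small frame from the first two rows of the question.

```python
import pandas as pd


def f(a, b):
    return a + b


# First two rows and two columns of the question's data.
# String column labels are an assumption made for this sketch.
df = pd.DataFrame(
    {"2016": [0.014199, 0.025220], "2017": [0.020776, 0.024088]},
    index=[1, 2],
)

# Default axis=0: the lambda receives one column at a time,
# and col.name is that column's label.
result = df.apply(lambda col: f(col, int(col.name)))

# axis=1: the lambda receives one row at a time,
# and row.name is that row's index label.
result_by_index = df.apply(lambda row: f(row, row.name), axis=1)

print(result)
print(result_by_index)
```

Note that f is still called once per column (or once per row), not once per element, so it has to accept a whole Series as its first argument.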

Note: @root's solution, which uses df.add, is much faster:

In [150]: df = pd.concat([df] * 10**5, ignore_index=False)

In [151]: df.shape
Out[151]: (500000, 5)

In [152]: %timeit df.apply(lambda x: int(x.name) + x)
10 loops, best of 3: 40.7 ms per loop

In [153]: %timeit df.add(df.columns.map(int))
100 loops, best of 3: 7.95 ms per loop

Answer 2

Assuming your column names are integers, you can use add with the column values:

df = df.add(df.columns.values)

If the column names are strings, use map to convert them to integers when calling add:

df = df.add(df.columns.map(int))
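
For reference, here is a small self-contained sketch of the add approach, including the index-based variant the question asks about. It assumes string column labels and uses a two-row frame rebuilt from the question's data; the variable names are only illustrative.

```python
import pandas as pd

# First two rows and two columns of the question's data.
# String column labels are an assumption made for this sketch.
df = pd.DataFrame(
    {"2016": [0.014199, 0.025220], "2017": [0.020776, 0.024088]},
    index=[1, 2],
)

# Add each column's label (converted to int) to every value in that column.
by_column = df.add(df.columns.map(int))

# Add each row's index label to every value in that row instead.
by_index = df.add(df.index.to_numpy(), axis=0)

# Cross-check against the apply-based version of the same computation;
# assert_frame_equal raises an AssertionError if the frames differ.
via_apply = df.apply(lambda col: col + int(col.name))
pd.testing.assert_frame_equal(by_column, via_apply)

print(by_column)
print(by_index)
```

This only covers the case where f is plain addition. For other element-wise operations, the sibling methods sub, mul and div take the same arguments.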
