Factoring out the DataFrame name in pandas to make mathematical expressions easier to read

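When you do math on the columns of a pandas DataFrame (called data here), you have to write data over and over to access each column, which is annoying if you want the formulas to read cleanly. So I am looking for a way to "factor out" the data prefix. Consider this simple example: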

import pandas as pd
from numpy import sqrt  # import only what is used rather than a wildcard

k = 3
data = pd.read_csv('data.dat',sep=',')

data['a4'] = data.a1 + data.a2
data['a5'] = sqrt(data.a3)*k

## Imagine much more complex mathematical operations


## instead of this I want something like this pseudocode:

## cd data
## a4 = a1 + a2
## a5 = sqrt(a3)*k
## end cd data

where data.dat contains:

a1,a2,a3
1,2,3
4,5,6
7,8,9

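You can use pandas.DataFrame.eval. Inside the expression string, column names are written bare, and local Python variables are referenced with an @ prefix: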

>>> df
   a1  a2  a3
0   1   2   3
1   4   5   6
2   7   8   9

>>> k = 3

>>> df = df.eval('a4 = a1 + a2')

>>> df = df.eval('a5 = a3**2 * @k')

>>> df
   a1  a2  a3  a4   a5
0   1   2   3   3   27
1   4   5   6   9  108
2   7   8   9  15  243
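
If you would rather not reassign df after every step, eval also accepts inplace=True. A minimal sketch, assuming the same three columns as above; the DataFrame is built inline here instead of being read from data.dat, and ret is only an illustrative name:

```python
import pandas as pd

# Same values as data.dat, built inline so the example is self-contained.
df = pd.DataFrame({"a1": [1, 4, 7], "a2": [2, 5, 8], "a3": [3, 6, 9]})
k = 3

# With inplace=True the new column is added to df itself and eval returns None.
ret = df.eval("a4 = a1 + a2", inplace=True)
df.eval("a5 = a3**2 * @k", inplace=True)

print(df)
```

Without inplace=True, eval leaves df untouched and returns a new DataFrame, which is why the example above assigns the result back to df.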

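If you want to do it all in a single call, you can pass a multi-line expression. This returns a new DataFrame, so assign the result if you want to keep it: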

>>> df
   a1  a2  a3
0   1   2   3
1   4   5   6
2   7   8   9

>>> k = 3

>>> df.eval('''
     a4 = a1 + a2
     a5 = a3**2 * @k
   ''')
   a1  a2  a3  a4   a5
0   1   2   3   3   27
1   4   5   6   9  108
2   7   8   9  15  243

# Alternatively, store the expression in a string and pass that string to eval:
>>> expr = '''
     a4 = a1 + a2
     a5 = a3**2 * @k
   '''
>>> df.eval(expr)
   a1  a2  a3  a4   a5
0   1   2   3   3   27
1   4   5   6   9  108
2   7   8   9  15  243
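
The examples above use a3**2, while the question asked for sqrt(a3)*k. As far as I know, eval expressions also support a set of math functions, sqrt among them, so the question's formulas can be written almost exactly like the pseudocode. A sketch under that assumption, again building the data inline rather than reading data.dat (result and expected_a5 are illustrative names):

```python
import numpy as np
import pandas as pd

# Same values as data.dat, built inline so the example is self-contained.
data = pd.DataFrame({"a1": [1, 4, 7], "a2": [2, 5, 8], "a3": [3, 6, 9]})
k = 3

# Column names are written bare; @k refers to the local Python variable k.
result = data.eval(
    """
    a4 = a1 + a2
    a5 = sqrt(a3) * @k
    """
)

# Cross-check against the explicit data.<column> form from the question.
expected_a5 = np.sqrt(data["a3"]) * k

print(result)
print(np.allclose(result["a5"], expected_a5))
```

The string plays the role of the "cd data ... end cd data" block: everything inside it is resolved against the DataFrame's columns first.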