How to use the apply function to normalize columns in pandas
I need to normalize my columns using the following formula:
x/total * 100
I created this custom function:
def normalize(df):
    result = []
    total = df[col].sum()
    for value in df[col]:
        result.append(value / total * 100)
    return result
I apply the function here and get TypeError: normalize() got an unexpected keyword argument 'axis'
df['columnname'].apply(normalize, axis=1)
What should I do? And/or is there a more efficient way? Thanks.
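Series.apply calls the function once per element and forwards extra keyword arguments such as axis to it, which is where the TypeError comes from. You need to change normalize so that it takes the whole column as input: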
def normalize(inp):
    # inp is a whole column (a pandas Series); work on its underlying array
    inp = inp.values
    result = []
    total = sum(inp)
    for value in inp:
        result.append(value / total * 100)
    return result
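The TypeError from the question can be reproduced with a small sketch. The Series `s` and the function `scale` below are hypothetical names made up for illustration; the point is only that `Series.apply` hands each element to the function and passes any extra keyword arguments along with it.

```python
import pandas as pd

s = pd.Series([1, 2, 3])  # hypothetical example data


def scale(value, factor=1):
    # Receives one element at a time, not the whole column
    return value * factor


# Extra keyword arguments are forwarded to the function itself
scaled = s.apply(scale, factor=10)

# axis is forwarded the same way, and scale() does not accept it
try:
    s.apply(scale, axis=1)
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(scaled.tolist())
print(error_message)
```

`DataFrame.apply`, by contrast, does accept `axis` and passes a whole column (or row) to the function, which is why the examples below select columns with double brackets.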
#### Using NumPy directly, avoiding the for loop
def normalize_numpy(inp):
    # Vectorized: divide every element by the column total in one step
    total = inp.sum()
    return inp / total * 100
>>> import pandas as pd
>>> l = [100,34,56,71,2,4,5,2,10]
>>> v = [1,2,3,4,5,68,1,2,3]
>>> df = pd.DataFrame(data=list(zip(l,v)),columns=['Col1','Col2'])
Multiple Column Usage
>>> df[['Col1','Col2']].apply(lambda x:normalize_numpy(x),axis=0)
        Col1       Col2
0  35.211268   1.123596
1  11.971831   2.247191
2  19.718310   3.370787
3  25.000000   4.494382
4   0.704225   5.617978
5   1.408451  76.404494
6   1.760563   1.123596
7   0.704225   2.247191
8   3.521127   3.370787
Single Column Usage
>>> df[['Col2']].apply(normalize_numpy,axis=0)
        Col2
0   1.123596
1   2.247191
2   3.370787
3   4.494382
4   5.617978
5  76.404494
6   1.123596
7   2.247191
8   3.370787
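On the "more efficient way" part of the question: apply is not strictly needed here, since pandas arithmetic is already vectorized. A minimal sketch, reusing the same sample data as above (the variable names `col2_pct` and `pct` are made up for illustration):

```python
import pandas as pd

df = pd.DataFrame({
    "Col1": [100, 34, 56, 71, 2, 4, 5, 2, 10],
    "Col2": [1, 2, 3, 4, 5, 68, 1, 2, 3],
})

# Single column: divide the Series by its own total
col2_pct = df["Col2"] / df["Col2"].sum() * 100

# Several columns: sum() returns one total per column, and the
# division lines those totals up with the columns by label
cols = ["Col1", "Col2"]
pct = df[cols] / df[cols].sum() * 100

print(pct)
```

This should give the same values as the apply-based calls above, with each normalized column adding up to 100.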