Python Pandas: Mean every n rows in a new column repeated n times

Question

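I want to take the mean of column c2 over every three rows and save the result in a new column c3, so that each mean is repeated three times. This code does the job: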

import pandas as pd
df = pd.DataFrame({'c1': ['A', 'B','C','D','E','F'], 'c2': [1, 2, 3,3,4,5]})
nrow=3
temp=df['c2'].rolling(nrow).mean()      #Take rolling mean
temp= temp[nrow-1::nrow]                #Select mean value every 3 rows
temp=temp.loc[temp.index.repeat(nrow)]  #Repeat each mean value 3 times
temp.index = range(0,len(df))           #Fix index 
df['c3']=temp
print(df)

The result should be a column c3 containing [2,2,2,4,4,4]. Is there a simpler way to do this than these 5 lines of code?

Answer 1

Use GroupBy.transform, grouping by integer division of the index, or of a helper array with the same length as the DataFrame:

nrow = 3

#if default RangeIndex
df['c3'] = df.groupby(df.index // nrow)['c2'].transform('mean')

#alternative if not default RangeIndex (requires: import numpy as np)
#df['c3'] = df.groupby(np.arange(len(df)) // nrow)['c2'].transform('mean')
print(df)

  c1  c2  c3
0  A   1   2
1  B   2   2
2  C   3   2
3  D   3   4
4  E   4   4
5  F   5   4
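
For the case where the index is not a default RangeIndex, here is a minimal self-contained sketch of the helper-array variant. The string index 'u'..'z' and the variable name groups are made up for illustration and are not part of the question's data:

```python
import numpy as np
import pandas as pd

# Hypothetical frame with a non-default (string) index, for illustration only
df = pd.DataFrame(
    {'c1': ['A', 'B', 'C', 'D', 'E', 'F'], 'c2': [1, 2, 3, 3, 4, 5]},
    index=list('uvwxyz'),
)
nrow = 3

# Positional group labels 0,0,0,1,1,1 that do not depend on the index values
groups = np.arange(len(df)) // nrow

# transform('mean') broadcasts each group's mean back to every row of that group
df['c3'] = df.groupby(groups)['c2'].transform('mean')
print(df)
```

Because the labels come from row positions, this works for any index. If len(df) is not a multiple of nrow, the last group is simply shorter and its mean is taken over the remaining rows.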
