Transposing a column in a pandas DataFrame while keeping the other column intact

Question

My dataframe looks like this:

selection_id  last_traded_price
430494        1.46
430494        1.48
430494        1.56
430494        1.57
430495        2.45
430495        2.67
430495        2.72
430495        2.87

I have many rows per selection ID. I need to keep the selection_id column as it is, but transpose the data in last_traded_price so that it looks like this:

selection_id  last_traded_price
430494        1.46              1.48          1.56      1.57    etc.
430495        2.45              2.67          2.72      2.87    etc.

I tried using pivot:

   df.pivot(index='selection_id', columns='last_traded_price', values='last_traded_price')

Because selection_id contains duplicate rows, the pivot does not work. Is it possible to transpose the data first and then drop the duplicates?
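For readers who want to reproduce the failure, here is a minimal sketch. It assumes the real data contains at least one price that repeats within a selection_id (the 1.46 below is duplicated on purpose for illustration; the sample in the question has no such repeat). In that case the (selection_id, last_traded_price) pairs are no longer unique and pivot refuses to reshape.

```python
import pandas as pd

# Illustrative data: the price 1.46 appears twice for the same selection_id.
df = pd.DataFrame({
    "selection_id": [430494, 430494, 430494],
    "last_traded_price": [1.46, 1.46, 1.48],
})

try:
    df.pivot(index="selection_id",
             columns="last_traded_price",
             values="last_traded_price")
    error_message = None
except ValueError as exc:
    # pivot cannot place two rows into the same (index, column) cell
    error_message = str(exc)

print(error_message)
```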

Answer 1

Option 1
groupby + apply

v = df.groupby('selection_id').last_traded_price.apply(list)
pd.DataFrame(v.tolist(), index=v.index)

                 0     1     2     3
selection_id                        
430494        1.46  1.48  1.56  1.57
430495        2.45  2.67  2.72  2.87
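A self-contained sketch of the same idea, in case it helps to run it end to end. The sample frame is rebuilt from the question, except that the last row is dropped here (an assumption made only for illustration) to show what happens when groups have different lengths: the shorter group is padded with NaN.

```python
import pandas as pd

# Data from the question, minus the last row so the two groups are uneven.
df = pd.DataFrame({
    "selection_id": [430494] * 4 + [430495] * 3,
    "last_traded_price": [1.46, 1.48, 1.56, 1.57, 2.45, 2.67, 2.72],
})

# Collect each group's prices into a list, one list per selection_id.
v = df.groupby("selection_id")["last_traded_price"].apply(list)

# Expand the lists into columns; missing positions become NaN.
wide = pd.DataFrame(v.tolist(), index=v.index)
print(wide)
```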

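Option 2
You can also do this with pivot, as long as you have another column of counts to pass to it (pivot needs something to pivot along, which is why the extra column is required).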

df['Count'] = df.groupby('selection_id').cumcount()
df.pivot(index='selection_id', columns='Count', values='last_traded_price')

Count            0     1     2     3
selection_id                        
430494        1.46  1.48  1.56  1.57
430495        2.45  2.67  2.72  2.87
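To make the role of the counter column more concrete, the sketch below prints what cumcount produces before pivoting. It uses `assign` instead of mutating `df`, which is a stylistic choice of this example rather than part of the answer.

```python
import pandas as pd

df = pd.DataFrame({
    "selection_id": [430494] * 4 + [430495] * 4,
    "last_traded_price": [1.46, 1.48, 1.56, 1.57, 2.45, 2.67, 2.72, 2.87],
})

# cumcount numbers the rows within each group, starting from 0.
counter = df.groupby("selection_id").cumcount()
print(counter.tolist())

# Each (selection_id, Count) pair is now unique, so pivot can reshape.
wide = df.assign(Count=counter).pivot(
    index="selection_id", columns="Count", values="last_traded_price"
)
print(wide)
```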

Answer 2

You can use cumcount as a counter for the new column names, which are then created by set_index + unstack or by pandas.pivot:

g = df.groupby('selection_id').cumcount()
df = df.set_index(['selection_id',g])['last_traded_price'].unstack()
print (df)
                 0     1     2     3
selection_id                        
430494        1.46  1.48  1.56  1.57
430495        2.45  2.67  2.72  2.87
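If selection_id should end up as an ordinary column again and the numeric column labels are inconvenient, one possible follow-up is sketched below. The `price_` prefix is a hypothetical name chosen for this example; it does not come from the question.

```python
import pandas as pd

df = pd.DataFrame({
    "selection_id": [430494] * 4 + [430495] * 4,
    "last_traded_price": [1.46, 1.48, 1.56, 1.57, 2.45, 2.67, 2.72, 2.87],
})

# Same reshaping as above: counter per group, then unstack it into columns.
g = df.groupby("selection_id").cumcount()
wide = df.set_index(["selection_id", g])["last_traded_price"].unstack()

# "price_" is an illustrative prefix; reset_index turns selection_id
# back into a regular column.
wide = wide.add_prefix("price_").reset_index()
print(wide)
```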

A similar solution with pivot:

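# Current pandas expects a DataFrame as the first argument of pd.pivot,
# so the counter is added as a column (named 'g' here) before pivoting.
df = pd.pivot(df.assign(g=df.groupby('selection_id').cumcount()),
              index='selection_id',
              columns='g',
              values='last_traded_price')
print (df)
g                0     1     2     3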
selection_id                        
430494        1.46  1.48  1.56  1.57
430495        2.45  2.67  2.72  2.87

