Turn a multi-indexed Series with different lengths and non-unique indexes into a DataFrame

Question

0  6    1689.306931
   6     345.198020
   6     226.217822
   6      34.574257
   6      14.000000
           ...     
3  6       1.077353
   6       1.116176
   6       1.078431
   6       1.049020
   6       0.980294

This is my multi-indexed Series, my_df. I can access all rows that share the same first-level index value with my_df.loc[0]:

6    1689.306931
6     345.198020
6     226.217822
6      34.574257
6      14.000000
6      63.683168
6      60.158416
6      60.198020
6      18.811881
6      22.316832

dtype: float64

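Given that the groups are not all the same length, how can I convert this multi-indexed Series into a DataFrame without raising this error:

ValueError: cannot reindex from a duplicate axis

my_df.unstack() raises:

ValueError: Index contains duplicate entries, cannot reshape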
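
For reference, here is a minimal sketch that should reproduce the second error. The data is made up (a hypothetical five-row stand-in for my_df, not the real values); the only property it shares with the real Series is that the same (first level, second level) pair occurs more than once.

```python
import pandas as pd

# Hypothetical miniature of my_df: the pair (0, 6) appears three times
# and the pair (3, 6) appears twice, so the MultiIndex is not unique.
idx = pd.MultiIndex.from_arrays([[0, 0, 0, 3, 3], [6, 6, 6, 6, 6]])
my_df = pd.Series([10.0, 20.0, 30.0, 1.0, 2.0], index=idx)

try:
    my_df.unstack()
    error_message = None
except ValueError as exc:
    # unstack cannot decide which of the duplicate rows goes into a cell
    error_message = str(exc)

print(error_message)
```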

Answer 1

Try enumerating the rows within each first-level group with groupby.cumcount, then unstack on that counter:

# The second index level is identical everywhere, so drop it.
# to_frame() is needed because set_index is a DataFrame method.
(my_df.droplevel(1)
      .to_frame("value")
      .set_index(my_df.groupby(level=0).cumcount(), append=True)["value"]
      .unstack()
)
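
To see the idea end to end, here is a self-contained sketch on made-up data (a hypothetical five-row stand-in for my_df with groups of length 3 and 2). It is one way to write the cumcount approach, not necessarily the only one: the constant second level is dropped with droplevel, and the Series is turned into a one-column frame (the column name "value" is arbitrary) so that set_index can be used.

```python
import pandas as pd

# Hypothetical stand-in for my_df: two groups of different lengths,
# with a second index level that is always 6, as in the question.
idx = pd.MultiIndex.from_arrays([[0, 0, 0, 3, 3], [6, 6, 6, 6, 6]])
my_df = pd.Series([10.0, 20.0, 30.0, 1.0, 2.0], index=idx)

# Number the rows within each first-level group: 0, 1, 2, 0, 1
pos = my_df.groupby(level=0).cumcount()

# Swap the non-unique second level for that counter, then unstack it.
# Every (group, counter) pair is now unique, so reshaping succeeds.
wide = (my_df.droplevel(1)
             .to_frame("value")
             .set_index(pos, append=True)["value"]
             .unstack())

print(wide)
```

Each first-level value becomes one row, and shorter groups are padded with NaN on the right, so groups of different lengths are not a problem.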
