Reshaping non-numeric values in a pandas DataFrame

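I searched Google for an answer but had no luck. I need to reshape a pandas DataFrame so that a non-numeric value (comp_url) appears as a "value" in a multi-indexed DataFrame. Here is a sample of the data: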

    store_name sku  comp    price   ship    comp_url
     CSE      A1025 compA   30.99   9.99    some url
     CSE      A1025 compB   30.99   9.99    some url
     CSE      A1025 compC   30.99   9.99    some url

I have several store_name values, so I need the result to look like this:

    SKU    CSE                        store_name2
           comp_url  price  ship      comp_url  price  ship
    A1025  some url  30.99  9.99      some url  30.99  9.99

Any ideas or guidance would be greatly appreciated!

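Maybe pandas.Panel would be a better fit. Panels are meant for 3-dimensional data, whereas DataFrames are 2D. (Panel has since been removed from pandas, so on current versions a DataFrame with MultiIndex columns is the way to hold this shape.)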

Assuming each sku/store_name combination is unique, here is a working example:

# imports
import pandas as pd

# Create a sample DataFrame.
cols = ['store_name', 'sku', 'comp', 'price', 'ship', 'comp_url']
records = [['CSA', 'A1025', 'compA', 30.99, 9.99, 'some url'],
           ['CSB', 'A1025', 'compB', 32.99, 9.99, 'some url2'],
           ['CSA', 'A1026', 'compC', 30.99, 19.99, 'some url'],
           ['CSB', 'A1026', 'compD', 30.99, 9.99, 'some url3']]
df = pd.DataFrame.from_records(records, columns=cols)

# Move both 'sku' and 'store_name' to the row index; the combination
# of these two columns provides a unique identifier for each row.
df.set_index(['sku', 'store_name'], inplace=True)
# Move 'store_name' from the row index to the column index. Each
# unique value in the 'store_name' index gets its own set of columns.
# In the multiindex, 'store_name' will be below the existing column
# labels.
df = df.unstack(1)
# To get the 'store_name' above the other column labels, we simply
# reorder the levels in the MultiIndex and sort it.
df.columns = df.columns.reorder_levels([1, 0])
df.sort_index(axis=1, inplace=True)

# Show the result.
print(df)
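
For what it's worth, the same steps can also be written as one method chain that leaves the original frame untouched. This is only a sketch of an equivalent formulation, using the same sample data as above; the name `wide` is just an illustrative choice, and `swaplevel` is swapped in for `reorder_levels` since there are only two column levels.

```python
import pandas as pd

# Same sample data as in the example above.
cols = ['store_name', 'sku', 'comp', 'price', 'ship', 'comp_url']
records = [['CSA', 'A1025', 'compA', 30.99, 9.99, 'some url'],
           ['CSB', 'A1025', 'compB', 32.99, 9.99, 'some url2'],
           ['CSA', 'A1026', 'compC', 30.99, 19.99, 'some url'],
           ['CSB', 'A1026', 'compD', 30.99, 9.99, 'some url3']]
df = pd.DataFrame.from_records(records, columns=cols)

# 'wide' is an illustrative name; df itself is not modified.
wide = (
    df.set_index(['sku', 'store_name'])  # unique (sku, store_name) row labels
      .unstack('store_name')             # store_name moves to the column index
      .swaplevel(0, 1, axis=1)           # put store_name above the value labels
      .sort_index(axis=1)                # group each store's columns together
)

# All columns for a single store.
print(wide['CSA'])

# A single non-numeric cell, addressed by (store_name, column).
print(wide.loc['A1025', ('CSB', 'comp_url')])
```

Selecting by the top column level (`wide['CSA']`) gives back an ordinary sku-indexed frame for that store, which is usually what you want when comparing stores side by side.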

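This works because the sku/store_name label combinations are unique. When we use unstack(), we are only moving labels and cells around; no aggregation takes place. If the labels were not unique and some aggregation were needed, pivot_table() would probably be the better choice.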
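
Here is a rough sketch of that non-unique case. The data is made up for illustration (two rows share the same sku/store_name pair), and the aggregation rules (lowest price, lowest shipping cost, first URL seen) are arbitrary assumptions, not something from the question. Note that a string column like comp_url needs an aggregation such as 'first' that makes sense for non-numeric data, and that the first URL will not necessarily belong to the row with the lowest price.

```python
import pandas as pd

# Hypothetical data: ('A1025', 'CSA') appears twice, so labels are not unique.
cols = ['store_name', 'sku', 'comp', 'price', 'ship', 'comp_url']
records = [['CSA', 'A1025', 'compA', 30.99, 9.99, 'url a'],
           ['CSA', 'A1025', 'compB', 28.99, 9.99, 'url b'],
           ['CSB', 'A1025', 'compC', 32.99, 9.99, 'url c']]
dup = pd.DataFrame.from_records(records, columns=cols)

# unstack() refuses to reshape when the index contains duplicate entries.
try:
    dup.set_index(['sku', 'store_name']).unstack('store_name')
    unstack_failed = False
except ValueError:
    unstack_failed = True
print('unstack failed:', unstack_failed)

# pivot_table() aggregates the duplicates instead. The per-column rules
# below are assumptions chosen only for this example.
wide = dup.pivot_table(
    index='sku',
    columns='store_name',
    values=['price', 'ship', 'comp_url'],
    aggfunc={'price': 'min', 'ship': 'min', 'comp_url': 'first'},
)

# Put store_name on top of the column index, as in the unstack() example.
wide = wide.swaplevel(0, 1, axis=1).sort_index(axis=1)
print(wide)
```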