For each row return the column names of the smallest value - pandas
Suppose I have a DataFrame with the following values:
id product1sold product2sold product3sold
1 2 3 3
2 0 0 5
3 3 2 1
How can I add 'most_sold' and 'least_sold' columns that contain, for each id, a list of all the best-selling and worst-selling products?
It should look like this:
id product1 product2 product3 most_sold least_sold
1 2 3 3 [product2, product3] [product1]
2 0 0 5 [product3] [product1, product2]
3 3 2 1 [product1] [product3]
Use a list comprehension that tests each product column against the row-wise minimum and maximum:
# select all columns except the first ('id')
df1 = df.iloc[:, 1:]
cols = df1.columns.to_numpy()
df['most_sold'] = [cols[x].tolist() for x in df1.eq(df1.max(axis=1), axis=0).to_numpy()]
df['least_sold'] = [cols[x].tolist() for x in df1.eq(df1.min(axis=1), axis=0).to_numpy()]
print(df)
id product1sold product2sold product3sold most_sold \
0 1 2 3 3 [product2sold, product3sold]
1 2 0 0 5 [product3sold]
2 3 3 2 1 [product1sold]
least_sold
0 [product1sold]
1 [product1sold, product2sold]
2 [product3sold]
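To make the intermediate step visible, here is a minimal self-contained sketch of the same approach. It rebuilds the sample data from the question; the names `is_max` and `most_sold` are introduced only for illustration.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "id": [1, 2, 3],
    "product1sold": [2, 0, 3],
    "product2sold": [3, 0, 2],
    "product3sold": [3, 5, 1],
})

df1 = df.iloc[:, 1:]               # product columns only (drops 'id')
cols = df1.columns.to_numpy()

# True wherever a cell equals its row's maximum; axis=0 aligns the
# Series of row maxima with the DataFrame's index.
is_max = df1.eq(df1.max(axis=1), axis=0)
print(is_max)

# Each boolean row selects the matching column names, so ties are kept.
most_sold = [cols[row].tolist() for row in is_max.to_numpy()]
print(most_sold)
```

Because the comparison yields one boolean per cell, every column tied for the maximum ends up in the list, which is what the question asks for.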
If performance is not important, you can use DataFrame.apply:
df1 = df.iloc[:, 1:]
f = lambda x: x.index[x].tolist()
df['most_sold'] = df1.eq(df1.max(axis=1), axis=0).apply(f, axis=1)
df['least_sold'] = df1.eq(df1.min(axis=1), axis=0).apply(f, axis=1)
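As a rough check that the `apply` variant gives the same lists as the list comprehension, the sketch below runs both on the question's sample data. The helper name `names_where_true` is hypothetical, and it converts the row to a NumPy array before indexing, which is a small deviation from the lambda above.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "id": [1, 2, 3],
    "product1sold": [2, 0, 3],
    "product2sold": [3, 0, 2],
    "product3sold": [3, 5, 1],
})
df1 = df.iloc[:, 1:]
is_min = df1.eq(df1.min(axis=1), axis=0)

def names_where_true(row):
    # row is a boolean Series indexed by column name
    return row.index[row.to_numpy()].tolist()

# Row-wise apply: one Python call per row, hence slower on large frames.
least_apply = is_min.apply(names_where_true, axis=1)

# List-comprehension version for comparison.
cols = df1.columns.to_numpy()
least_listcomp = [cols[row].tolist() for row in is_min.to_numpy()]

print(least_apply.tolist() == least_listcomp)
```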
You can do it like this:
minValueCol = yourDataFrame.idxmin(axis=1)
maxValueCol = yourDataFrame.idxmax(axis=1)
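One caveat with this approach: `idxmin` and `idxmax` return a single label per row (the first occurrence), so tied columns are not listed. The sketch below, which assumes the `id` column should be excluded before the comparison, shows this on the question's data.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "id": [1, 2, 3],
    "product1sold": [2, 0, 3],
    "product2sold": [3, 0, 2],
    "product3sold": [3, 5, 1],
})

# Drop 'id' so it does not take part in the min/max comparison.
products = df.drop(columns="id")

min_col = products.idxmin(axis=1)   # first column holding each row's minimum
max_col = products.idxmax(axis=1)   # first column holding each row's maximum

print(min_col.tolist())
print(max_col.tolist())
```

For id 1, only `product2sold` is reported as the maximum even though `product3sold` has the same value, so this is suitable only when a single column name per row is enough.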