Pandas: Get top n columns based on a row's values

Question

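I have a dataframe with a single row, and I need to reduce it to a smaller dataframe by selecting columns based on the values in that row.
What is the most efficient way to do this?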

df = pd.DataFrame({'a':[1], 'b':[10], 'c':[3], 'd':[5]})

a	b	c	d
1	10	3	5

For example, the top 3 features:

b	c	d
10	3	5

Answer 1

Sort the columns by the values in the row, in descending order, and select the first 3:

df1 = df.sort_values(0, axis=1, ascending=False).iloc[:, :3]
print (df1)
    b  d  c
0  10  5  3

A solution with Series.nlargest:

df1 = df.iloc[0].nlargest(3).to_frame().T
print (df1)
    b  d  c
0  10  5  3
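
The nlargest approach above can also be wrapped in a small helper that selects the columns directly from the original dataframe instead of rebuilding it from a Series. This is only a sketch: the name top_n_columns and its row parameter are hypothetical, and it assumes the column labels are unique.

```python
import pandas as pd

def top_n_columns(df, n, row=0):
    """Hypothetical helper: return the n columns holding the largest
    values in the row at position `row`, largest first."""
    # Labels of the n largest values in the chosen row
    top = df.iloc[row].nlargest(n).index
    # Select those columns from the original dataframe
    return df[top]

df = pd.DataFrame({'a': [1], 'b': [10], 'c': [3], 'd': [5]})
df1 = top_n_columns(df, 3)
print(df1)
```

Selecting with df[top] leaves each column's dtype as it was in the original dataframe, which may matter when the columns have mixed types.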

Answer 2

You can transpose with T and use nlargest():

new = df.T.nlargest(columns = 0, n = 3).T

print(new)

    b  d  c
0  10  5  3

Answer 3

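You can use np.argsort for this. In the code below, this NumPy function is applied to the negated values, so it returns the column positions in descending order of value. Slicing then keeps the positions of the n largest values.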

import pandas as pd
import numpy as np

# Your dataframe
df = pd.DataFrame({'a':[1], 'b':[10], 'c':[3], 'd':[5]})

# Pick the number n to find n largest values
nlargest = 3

# Get the positions of the n largest values in the first row, largest first
# (a 1-D indexer; indexing df.columns with a 2-D array fails in recent pandas)
order = np.argsort(-df.values[0])[:nlargest]

# Find the columns with the largest values
top_features = df.columns[order].tolist()

# Filter the dataframe by those columns
top_features_df = df[top_features]

top_features_df

Output:

    b  d  c
0  10  5  3
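
All of the answers return the columns ordered by value (b, d, c), while the expected output in the question keeps the original column order (b, c, d). If that order matters, one option is to sort the selected positions before indexing. A minimal sketch building on the argsort idea, where the variable names are illustrative only:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'a': [1], 'b': [10], 'c': [3], 'd': [5]})
n = 3

# Positions of the n largest values in the first row, largest first
order = np.argsort(-df.to_numpy()[0])[:n]

# Sort the positions so the columns keep their original left-to-right order
ordered_positions = np.sort(order)

top_in_original_order = df.iloc[:, ordered_positions]
print(top_in_original_order)
```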

