Delete rows of pandas df under variance threshold

Question

My df looks like this (I got it with pivot_table):

ID_column Test1 Test2 Test3 Test4
ID1       0     1     3     0
ID2       4     2     0     0
ID3       3     1     3     5

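I want to compute the variance of each row and delete every row whose variance is below a threshold x. I can't find this anywhere; the only solutions I've found do it for columns.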

Answer 1

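You can do this with the following code:

threshold = 1  # define variance threshold
row_vars = df.var(axis=1)  # calculate the variance of each row

# index labels of the rows whose variance is below the threshold
rows_to_drop = df[row_vars < threshold].index

# drop the rows in place
df.drop(rows_to_drop, axis=0, inplace=True)

To summarize:

Calculate the variance of each row, get the index of the rows whose variance is below the threshold, and then drop those rows in place.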
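
As a variation on the steps above, here is a minimal sketch that keeps the rows to retain with a boolean mask instead of dropping in place. It assumes ID_column is the index (as it typically is after pivot_table) and that all remaining columns are numeric; the threshold value 2.5 is hypothetical, chosen only so the sample data shows a row being removed.

```python
import pandas as pd

# Sample data mirroring the question; ID_column is assumed to be the index.
df = pd.DataFrame(
    {
        "Test1": [0, 4, 3],
        "Test2": [1, 2, 1],
        "Test3": [3, 0, 3],
        "Test4": [0, 0, 5],
    },
    index=pd.Index(["ID1", "ID2", "ID3"], name="ID_column"),
)

threshold = 2.5  # hypothetical value, for illustration only

# Sample variance of each row (pandas uses ddof=1 by default).
row_vars = df.var(axis=1)

# Keep only the rows whose variance is at least the threshold.
# This returns a new DataFrame and leaves df untouched.
filtered = df[row_vars >= threshold]
print(filtered)
```

With this data the row variances are 2.0 for ID1, about 3.67 for ID2, and about 2.67 for ID3, so only ID1 is removed. If population variance is wanted instead, pass ddof=0 to var.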


python

variance

pandas