Drop rows from a dataframe where every value from the third column onwards is 0

I am trying to drop the rows in which every value from the third column onwards is 0.

I used the code below on my dataframe. It works, but I feel there must be a more efficient way to do this:

NRC_lexicon_wide = NRC_lexicon_wide[~((NRC_lexicon_wide['anger'] == 0) & (NRC_lexicon_wide['anticipation'] == 0) 
                                      & (NRC_lexicon_wide['disgust'] == 0) & (NRC_lexicon_wide['fear'] == 0) 
                                      & (NRC_lexicon_wide['negative'] == 0) & (NRC_lexicon_wide['positive'] == 0) 
                                      & (NRC_lexicon_wide['sadness'] == 0) & (NRC_lexicon_wide['surprise'] == 0)
                                      & (NRC_lexicon_wide['trust'] == 0))]

OK, how about this:

import pandas
import numpy


# Build a sample dataframe from a list of dicts; the dict keys become the column names
df = pandas.DataFrame([{key: numpy.random.choice([0, 1, 2], p=[0.8, 0.15, 0.05]) for key in ["ColA", "ColB", "ColC", "ColD", "ColE", "ColF"]} for _ in range(50)])

# Start from this column onwards (0-based position, so 2 is the third column)
start_column = 2

# Get a boolean value for each cell, indicating if the value is larger than 0
# (this assumes there are no negative values; use != 0 if there can be)
larger_than_zero = df.loc[:, df.columns[start_column:]] > 0

# Get the rows for which at least one of those cells is larger than 0
any_cell_larger_than_zero = larger_than_zero.any(axis=1)

# Keep only the rows that have at least one cell larger than 0
df = df.loc[any_cell_larger_than_zero]

# Or in a single line:
df = df.loc[(df.loc[:, df.columns[start_column:]] > 0).any(axis=1)]
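
If the values can also be negative, a comparison against "not equal to 0" may be closer to what the question asks for. Here is a minimal sketch of that variant, selecting the columns by position with iloc. The toy dataframe and its column names ("word", "id", and so on) are made up for illustration and are not the asker's real data.

```python
import pandas as pd

# Hypothetical toy data; the column names are invented for this example.
df = pd.DataFrame(
    {
        "word": ["a", "b", "c", "d"],
        "id": [1, 2, 3, 4],
        "anger": [0, 1, 0, 0],
        "fear": [0, 0, 0, -1],
        "trust": [0, 0, 0, 0],
    }
)

# All columns from the third one onwards (0-based position 2).
tail = df.iloc[:, 2:]

# Keep a row if at least one of those values is non-zero;
# rows that are 0 in every one of those columns are dropped.
kept = df[tail.ne(0).any(axis=1)]

print(kept["word"].tolist())
```

Row "d" is kept here because of its -1, whereas a "> 0" test would drop it. Which behaviour you want depends on your data.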