Pandas Boolean Indexing Issue

Question

Can anyone explain the following behavior? I expected all three rows to be returned.

import pandas as pd

test_dict = {
    'col1':[None, None, None],
    'col2':[True, False, True],
    'col3':[True, True, False]
}

df = pd.DataFrame(test_dict)

df[ df.col1 | df.col2 | df.col3 ]
# Returns only the first two rows (index 0 and 1)

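Replacing the None values with empty strings using df.fillna('') seems to fix the problem, but I don't understand why the first two rows work correctly if None is the problem.

Changing the order of the operands also affects the result. If I swap col2 and col3 in the mask, the row at index 1 is no longer returned and the row at index 2 is returned instead. If col1 comes last, all rows are returned.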

Answer 1

The issue is that the expression is evaluated from left to right. That is,

df.col1 | df.col2 | df.col3 == (df.col1 | df.col2) | df.col3

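Now, I believe this is an implementation choice in pandas: None | True evaluates to False. So in this case (df.col1 | df.col2) is all False, and the final mask is simply df.col3, which is True only at index 0 and 1. That is why you only see the first two rows.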
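The grouping can be checked directly. The sketch below rebuilds the frame from the question and compares the chained expression with the explicitly parenthesized one. The names chained, grouped, and intermediate are introduced here for illustration only, and the values in the intermediate result may depend on the pandas version in use.

```python
import pandas as pd

# Same data as in the question.
df = pd.DataFrame({
    "col1": [None, None, None],
    "col2": [True, False, True],
    "col3": [True, True, False],
})

# Python parses a | b | c as (a | b) | c, so both masks are built the same way.
chained = df.col1 | df.col2 | df.col3
grouped = (df.col1 | df.col2) | df.col3
print(chained.equals(grouped))

# The first step on its own; inspect it to see what None | bool produced.
intermediate = df.col1 | df.col2
print(intermediate.tolist())
```

Printing the intermediate result shows what the object-dtype column contributes before col3 is combined in.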

To work around this, use

df[df.any(axis=1)]
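A minimal sketch of this fix on the frame from the question. DataFrame.any(axis=1) reduces across the columns of each row and, with the default skipna=True, skips missing values such as None. The names row_mask, cols, subset_mask, and result are introduced here for illustration.

```python
import pandas as pd

# Same data as in the question.
df = pd.DataFrame({
    "col1": [None, None, None],
    "col2": [True, False, True],
    "col3": [True, True, False],
})

# True for every row that has at least one truthy value in any column.
row_mask = df.any(axis=1)
result = df[row_mask]
print(result)

# If the frame has other columns, restrict the check to the relevant ones.
cols = ["col1", "col2", "col3"]
subset_mask = df[cols].any(axis=1)
print(df[subset_mask].index.tolist())
```

Because the row-wise reduction does not depend on column order, it avoids the order sensitivity described in the question.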
