Pandas values not being updated

Question

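I'm new to Pandas and I'm trying to do something very simple with it. Using a flights.csv file, I define a new column called underperforming, which should be set to 1 when the number of passengers is below the average. My problem is that something may be wrong with the logic, because the values are not being updated. Here is an example: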

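import pandas as pd

df = pd.read_csv('flights.csv')

# Average passenger count across all rows
passengers_mean = df['passengers'].mean()
# Start with every row flagged as 0
df['underperforming'] = 0

# Intended: set the flag to 1 for rows below the average
for idx, row in df.iterrows():
    if (row['passengers'] < passengers_mean):
        row['underperforming'] = 1

print(df)
print(passengers_mean)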

Any clues?

Answer 1

Quoting the documentation:

You should never modify something you are iterating over. This is not guaranteed to work in all cases. Depending on the data types, the iterator returns a copy and not a view, and writing to it will have no effect.

Use a vectorized operation instead, or apply() when the logic cannot be expressed as a vectorized operation.
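To make the quoted warning concrete, here is a minimal sketch of why the loop has no effect and how a boolean mask with df.loc avoids the problem. The small DataFrame is a hypothetical stand-in for flights.csv (its column names other than passengers and its values are assumptions); the mixed column types mean iterrows() hands back a copy of each row.

```python
import pandas as pd

# Hypothetical stand-in for flights.csv; values are made up for illustration.
df = pd.DataFrame({
    "month": ["Jan", "Feb", "Mar", "Apr"],
    "passengers": [100, 200, 300, 400],
})
passengers_mean = df["passengers"].mean()
df["underperforming"] = 0

# The loop from the question: each `row` is a separate Series built from the
# mixed-dtype frame, so assigning to it never reaches `df`.
for idx, row in df.iterrows():
    if row["passengers"] < passengers_mean:
        row["underperforming"] = 1

flags_after_loop = int(df["underperforming"].sum())

# Vectorized alternative: select the rows with a boolean mask and assign
# to the DataFrame itself.
df.loc[df["passengers"] < passengers_mean, "underperforming"] = 1

print(flags_after_loop)
print(df)
```

The mask-based assignment writes to df directly, so nothing depends on whether an intermediate object is a copy or a view.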

Answer 2

According to the documentation:

You should never modify something you are iterating over. This is not guaranteed to work in all cases.

iterrows docs

What you can do instead is:

df["underperforming"] = (df.passengers < df.passengers.mean()).astype('int')
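As a quick check of this approach, the sketch below runs the comparison on a small hypothetical DataFrame (the data is made up, not taken from flights.csv) and compares it with an apply()-based version, which should give the same flags while calling a Python function once per value.

```python
import pandas as pd

# Hypothetical sample data standing in for flights.csv.
df = pd.DataFrame({
    "month": ["Jan", "Feb", "Mar", "Apr"],
    "passengers": [112, 118, 132, 129],
})

passengers_mean = df["passengers"].mean()

# The comparison yields a boolean Series; astype('int') maps True/False to 1/0.
df["underperforming"] = (df["passengers"] < passengers_mean).astype("int")

# Equivalent result with apply(), evaluated element by element.
via_apply = df["passengers"].apply(lambda p: int(p < passengers_mean))

print(df)
print(via_apply.tolist())
```

For a rule this simple, the direct comparison is usually the shorter and faster option; apply() is mainly useful when the per-row logic is hard to express with column operations.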


python

dataframe

pandas