How to save the previous result that matched a certain condition in pandas
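I would like to know how to carry the most recent value that matched a certain condition (df['condition']) forward into each of the following rows. I know how to do this with a for loop, but I know I should avoid loops when using pandas.
Below is an example. The df['desired_result'] column shows what I want to achieve.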
import pandas as pd
import numpy as np
dates = pd.date_range('1/1/2000', periods=10)
values = np.arange(10.0, 20.0, 1.0)
data = {'date': dates, 'value': values}
df = pd.DataFrame.from_dict(data)
df['condition'] = [False, False, True, True, False, True, False, False, True, False]
df_valid = df[df['condition']]
df['desired_result'] = [np.nan, np.nan, 12, 13, 13, 15, 15, 15, 18, 18]
# Use Series.where with the condition and assign the result to a new column:
# where the condition is True the value is kept, otherwise NaN is returned.
# Then chain ffill() to forward fill the NaN values.
df['r'] = df['value'].where(df['condition']).ffill()
        date  value  condition  desired_result     r
0 2000-01-01   10.0      False             NaN   NaN
1 2000-01-02   11.0      False             NaN   NaN
2 2000-01-03   12.0       True            12.0  12.0
3 2000-01-04   13.0       True            13.0  13.0
4 2000-01-05   14.0      False            13.0  13.0
5 2000-01-06   15.0       True            15.0  15.0
6 2000-01-07   16.0      False            15.0  15.0
7 2000-01-08   17.0      False            15.0  15.0
8 2000-01-09   18.0       True            18.0  18.0
9 2000-01-10   19.0      False            18.0  18.0
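To make the two steps easier to follow, here is a minimal sketch that splits the chained call into its intermediate results and checks it against the for-loop version the question mentions. The names masked, vectorized, and last_match_loop are made up for this illustration and do not appear in the original post; the data is the same as in the question, minus the date column.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "value": np.arange(10.0, 20.0, 1.0),
    "condition": [False, False, True, True, False,
                  True, False, False, True, False],
})

# Step 1: keep the value where the condition is True, NaN elsewhere.
masked = df["value"].where(df["condition"])

# Step 2: carry the last non-NaN value forward into the following rows.
vectorized = masked.ffill()


def last_match_loop(values, conditions):
    """Hypothetical loop baseline, included only for comparison."""
    result = []
    last = np.nan  # nothing has matched yet
    for value, matched in zip(values, conditions):
        if matched:
            last = value
        result.append(last)
    return pd.Series(result, index=values.index)


looped = last_match_loop(df["value"], df["condition"])

# Both approaches should agree on this data (NaN positions included).
pd.testing.assert_series_equal(vectorized, looped, check_names=False)
print(pd.DataFrame({"masked": masked, "vectorized": vectorized, "looped": looped}))
```

One caveat, as far as I can tell: if df['value'] itself contains NaN on a row where the condition is True, ffill() will fill that row with the earlier match, whereas the loop above would record NaN, so the two only agree when the matched values are non-missing.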