Column values that depend on another column with conditions in pandas
I have some sample data:
datetime             temperature  season
2021-04-10 01:00:00  10           Heating season
2021-04-10 01:00:00  26           Heating season
2021-07-10 01:00:00  16           Cooling season
2021-07-10 01:00:00  30           Cooling season
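I want to create a new column named new_temperature:
a) If the temperature column is less than 18 and the season is the heating season, new_temperature should be 25; otherwise, if it is the cooling season, it should be 18.
b) If the temperature column is greater than 25 and the season is the cooling season, new_temperature should be 18; otherwise, if it is the heating season, it should be 22.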
The expected output looks like this:
datetime             temperature  season          new_temperature
2021-04-10 01:00:00  10           Heating season  25
2021-04-10 01:00:00  26           Heating season  22
2021-07-10 01:00:00  16           Cooling season  18
2021-07-10 01:00:00  30           Cooling season  18
You can use np.select with 4 conditions:
import numpy as np

# Each condition is a boolean Series, one entry per row of df
cond_1 = (df.temperature < 18) & (df.season == "Heating season")
cond_2 = (df.temperature < 18) & (df.season != "Heating season")
cond_3 = (df.temperature > 25) & (df.season == "Cooling season")
cond_4 = (df.temperature > 25) & (df.season != "Cooling season")

# choices[i] is the value assigned where conditions[i] is True
conditions = [cond_1, cond_2, cond_3, cond_4]
choices = [25, 18, 18, 22]
df["new_temperature"] = np.select(conditions, choices)
which gives
              datetime  temperature          season  new_temperature
0  2021-04-10 01:00:00         10.0  Heating season               25
1  2021-04-10 01:00:00         26.0  Heating season               22
2  2021-07-10 01:00:00         16.0  Cooling season               18
3  2021-07-10 01:00:00         30.0  Cooling season               18
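Note: your conditions are not exhaustive (a temperature from 18 to 25 inclusive matches none of them), so you may want to pass a default value to np.select as the last argument. It is placed in the result wherever no condition matches; if you omit it, those rows get 0.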
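To illustrate the note about default, here is a minimal, self-contained sketch. It rebuilds the sample data and adds one hypothetical fifth row (20 degrees, cooling season) that is not in the question, purely to show a row that matches none of the four conditions. Using np.nan as the default is an assumption made for illustration; any fallback value would work.

```python
import numpy as np
import pandas as pd

# Sample data from the question, plus one hypothetical row (20.0, Cooling season)
# added only to show what happens when no condition matches.
df = pd.DataFrame({
    "datetime": pd.to_datetime([
        "2021-04-10 01:00:00",
        "2021-04-10 01:00:00",
        "2021-07-10 01:00:00",
        "2021-07-10 01:00:00",
        "2021-07-10 02:00:00",  # hypothetical extra row
    ]),
    "temperature": [10.0, 26.0, 16.0, 30.0, 20.0],
    "season": [
        "Heating season",
        "Heating season",
        "Cooling season",
        "Cooling season",
        "Cooling season",
    ],
})

conditions = [
    (df.temperature < 18) & (df.season == "Heating season"),
    (df.temperature < 18) & (df.season != "Heating season"),
    (df.temperature > 25) & (df.season == "Cooling season"),
    (df.temperature > 25) & (df.season != "Cooling season"),
]
choices = [25, 18, 18, 22]

# default fills every row where none of the conditions is True.
# np.nan is an illustrative choice; it makes unmatched rows easy to spot.
df["new_temperature"] = np.select(conditions, choices, default=np.nan)

print(df)
```

With np.nan as the default the new column becomes a float column, and the unmatched row can be found afterwards with df["new_temperature"].isna().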