Evaluating multiple conditions in if-then-else block in a Pandas DataFrame

I want to create a new column in a Pandas DataFrame by evaluating multiple conditions in an if-then-else block.

if events.hour <= 6:
    events['time_slice'] = 'night'
elif events.hour <= 12:
    events['time_slice'] = 'morning'
elif events.hour <= 18:
    events['time_slice'] = 'afternoon'
elif events.hour <= 23:
    events['time_slice'] = 'evening'

When I run this, I get the following error:

ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

So I tried to work around it by adding .any() to each condition, as shown below:

if (events.hour <= 6).any():
    events['time_slice'] = 'night'
elif (events.hour <= 12).any():
    events['time_slice'] = 'morning'
elif (events.hour <= 18).any():
    events['time_slice'] = 'afternoon'
elif (events.hour <= 23).any():
    events['time_slice'] = 'evening'

Now I don't get any error, but when I check the unique values of time_slice, it shows only 'night':

np.unique(events.time_slice)

array(['night'], dtype=object)

How can I fix this? My data contains samples that should get 'morning', 'afternoon' or 'evening'. Thanks!

You can use the pd.cut() method to categorize your data:

Demo:

In [66]: events = pd.DataFrame(np.random.randint(0, 24, 10), columns=['hour'])

In [67]: events
Out[67]:
   hour
0     5
1    17
2    12
3     2
4    20
5    22
6    20
7    11
8    14
9     8

In [71]: events['time_slice'] = pd.cut(events.hour, bins=[-1, 6, 12, 18, 23], labels=['night','morning','afternoon','evening'])

In [72]: events
Out[72]:
   hour time_slice
0     5      night
1    17  afternoon
2    12    morning
3     2      night
4    20    evening
5    22    evening
6    20    evening
7    11    morning
8    14  afternoon
9     8    morning
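
To make the bin edges easier to check, here is a minimal self-contained sketch of the same pd.cut() call on hand-picked hours that sit on each boundary (the sample values are my own assumption, not from the question). It also hints at why the .any() attempt labelled everything 'night': the test reduces the whole column to a single boolean, so the first branch runs once and assigns one string to every row.

```python
import pandas as pd

# Hand-picked hours (an assumption for illustration) that hit every bin edge.
events = pd.DataFrame({'hour': [0, 6, 7, 12, 13, 18, 19, 23]})

# .any() collapses the column to ONE boolean, so an if/elif chain built on it
# takes a single branch for the whole DataFrame.
collapsed = (events.hour <= 6).any()

# pd.cut uses right-closed intervals by default:
# (-1, 6], (6, 12], (12, 18], (18, 23]
events['time_slice'] = pd.cut(
    events.hour,
    bins=[-1, 6, 12, 18, 23],
    labels=['night', 'morning', 'afternoon', 'evening'],
)
print(events)
```

The result is a categorical column; .astype(str) converts it to plain strings if needed. Hours that fall outside the bins come back as NaN rather than raising an error.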

You can create a function:

def time_slice(hour):
    if hour <= 6:
        return 'night'
    elif hour <= 12:
        return 'morning'
    elif hour <= 18:
        return 'afternoon'
    elif hour <= 23:
        return 'evening'

Then events['time_slice'] = events.hour.apply(time_slice) will do the job.
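
A self-contained sketch of this approach, using made-up sample hours. Because apply() calls the function once per row, each comparison sees a plain scalar and the original ValueError does not arise. It is usually slower than the vectorized options on large frames, and an hour above 23 falls through every branch and yields None.

```python
import pandas as pd

def time_slice(hour):
    # Works on a single scalar hour, so ordinary if/elif is fine here.
    if hour <= 6:
        return 'night'
    elif hour <= 12:
        return 'morning'
    elif hour <= 18:
        return 'afternoon'
    elif hour <= 23:
        return 'evening'
    # No branch matched (hour > 23): the function implicitly returns None.

# Sample hours are an assumption for illustration.
events = pd.DataFrame({'hour': [5, 17, 12, 2, 20]})
events['time_slice'] = events.hour.apply(time_slice)
print(events)
```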

Here's a NumPy approach -

tags = ['night','morning','afternoon','evening']
events['time_slice'] = np.take(tags,((events.hour.values-1)//6).clip(min=0))

Sample run -

In [130]: events
Out[130]: 
   hour time_slice
0     0      night
1     8    morning
2    16  afternoon
3    20    evening
4     2      night
5    14  afternoon
6     7    morning
7    18  afternoon
8     8    morning
9    22    evening
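
To see why the index arithmetic works, the following sketch (the boundary hours are my own choice) spells out the intermediate step: (hour - 1) // 6 maps 1-6 to 0, 7-12 to 1, 13-18 to 2 and 19-23 to 3, while hour 0 gives -1 and is clipped up to 0.

```python
import numpy as np
import pandas as pd

tags = ['night', 'morning', 'afternoon', 'evening']

# Boundary hours chosen for illustration (an assumption, not from the question).
events = pd.DataFrame({'hour': [0, 1, 6, 7, 12, 13, 18, 19, 23]})

# Integer position into `tags` for every row; hour 0 would be -1 without the clip.
idx = ((events.hour.values - 1) // 6).clip(min=0)

# Look up the label for each position in one vectorized call.
events['time_slice'] = np.take(tags, idx)
print(idx.tolist())
print(events)
```

Unlike pd.cut(), this relies on the slices being equally wide (6 hours each), and an hour of 25 or more would index past the end of tags and raise an IndexError.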