Fill column with ones based on the count of another column python pandas

Below is a sample DataFrame:

import pandas as pd
import numpy as np
from datetime import datetime
start = datetime(2011, 1, 1)
end = datetime(2012, 1, 1)
index = pd.date_range(start, end)
df = pd.DataFrame({"Trade Days": 0}, index=index)

df.iloc[0,:]=2
df.iloc[5,:]=3

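As you can see, the 'Trade Days' column holds the value 2 on 2011-01-01 and 3 on 2011-01-06. I want to create another column that is filled with 1s according to the count given in 'Trade Days': starting at each non-zero row, that many consecutive rows should be 1. The expected output column looks like this: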

df['open position']=0
df.iloc[0:2,1]=1
df.iloc[5:8,1]=1

The only approach I can think of is a for loop. Is there a more efficient way to do this? Thanks in advance.

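First create groups by testing for non-zero values with Series.ne and taking the cumulative sum with Series.cumsum, so that each non-zero value starts a new group. Then broadcast the first value of each group with GroupBy.transform and compare it with a per-group counter from GroupBy.cumcount, using Series.gt to test for "greater than". Finally, convert the resulting boolean mask to integers, which maps True, False to 1, 0: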

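# Start a new group at every non-zero 'Trade Days' value
g = df['Trade Days'].ne(0).cumsum()

grouped = df.groupby(g)['Trade Days']
# A row gets 1 while its position within the group is below the group's first value
df['new'] = grouped.transform('first').gt(grouped.cumcount()).astype(int)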
print(df.head(10))

            Trade Days  open position  new
2011-01-01           2              1    1
2011-01-02           0              1    1
2011-01-03           0              0    0
2011-01-04           0              0    0
2011-01-05           0              0    0
2011-01-06           3              1    1
2011-01-07           0              1    1
2011-01-08           0              1    1
2011-01-09           0              0    0
2011-01-10           0              0    0
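
To see why the comparison works, it can help to lay the intermediate Series side by side. The following is only a minimal sketch on a shortened 10-day version of the sample frame; the `steps` frame and its column names `group`, `first` and `counter` are introduced here purely for illustration and are not part of the solution above.

```python
import pandas as pd

# Shortened, hypothetical version of the sample data: 10 days instead of a year
index = pd.date_range("2011-01-01", periods=10)
df = pd.DataFrame({"Trade Days": 0}, index=index)
df.iloc[0, 0] = 2
df.iloc[5, 0] = 3

# Group id increases by 1 at every non-zero value
g = df["Trade Days"].ne(0).cumsum()
grouped = df.groupby(g)["Trade Days"]

# Collect the intermediate results in one frame (illustrative names)
steps = pd.DataFrame({
    "Trade Days": df["Trade Days"],
    "group": g,                           # 1,1,1,1,1,2,2,2,2,2
    "first": grouped.transform("first"),  # 2,2,2,2,2,3,3,3,3,3
    "counter": grouped.cumcount(),        # 0,1,2,3,4,0,1,2,3,4
})

# first > counter holds only for the first `first` rows of each group
steps["new"] = steps["first"].gt(steps["counter"]).astype(int)
print(steps)
# new column: 1, 1, 0, 0, 0, 1, 1, 1, 0, 0
```

Rows that come before the first non-zero value would land in group 0, whose first value is 0, so the comparison is False there and they stay 0.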