Adding columns with boolean values based on the month in a pandas DataFrame

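I am trying to get a 1 for the row 2020-01 only where that row intersects the column "Jan", and likewise for every other month.

So overall, each row should contain a single 1 and otherwise 0s, depending on its month. The result of what I tried is in the screenshot, without the blue edits.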

columns = ["Jan", "Feb", "Mar", "Apr", "May", "Jun", "Jul", "Aug", "Sep", "Oct", "Nov"]

for i in range(len(columns)):
    df[columns[i]] = df.TIME.astype(str).str[5] + df.TIME.astype(str).str[6]
df

The blue edits are the goal.

I tried a ternary operator like this:

for i in range(len(columns)):
    df[columns[i]] = 1 if (df.TIME.astype(str).str[5] + df.TIME.astype(str).str[6] == "01") else 0

The error is:

ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

You can use Series.dt.strftime with the %b formatter, then str.get_dummies, reindex, and join back to the original DataFrame:

# Example setup
import pandas as pd

columns = ["Jan", "Feb", "Mar", "Apr", "May", "Jun", "Jul", "Aug", "Sep", "Oct", "Nov"]

df = pd.DataFrame({'TIME': ['2020-01', '2019-12', '2019-11', '2019-10', '2019-09']})

df.join(pd.to_datetime(df['TIME']).dt.strftime('%b')
        .str.get_dummies()
        .reindex(columns=columns, fill_value=0))

[out]

      TIME  Jan  Feb  Mar  Apr  May  Jun  Jul  Aug  Sep  Oct  Nov
0  2020-01    1    0    0    0    0    0    0    0    0    0    0
1  2019-12    0    0    0    0    0    0    0    0    0    0    0
2  2019-11    0    0    0    0    0    0    0    0    0    0    1
3  2019-10    0    0    0    0    0    0    0    0    0    1    0
4  2019-09    0    0    0    0    0    0    0    0    1    0    0
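
One caveat: %b yields the month abbreviation for the active locale, so the dummy column names only match `columns` under an English locale. As a locale-independent variant, here is a minimal sketch that compares the numeric month against 1 to 11 using NumPy broadcasting. It assumes `columns` is in calendar order starting at January; `month`, `flags` and `result` are names I made up for illustration.

```python
import numpy as np
import pandas as pd

columns = ["Jan", "Feb", "Mar", "Apr", "May", "Jun", "Jul", "Aug", "Sep", "Oct", "Nov"]
df = pd.DataFrame({'TIME': ['2020-01', '2019-12', '2019-11', '2019-10', '2019-09']})

# Numeric month (1-12) of each row as a NumPy array
month = pd.to_datetime(df['TIME'], format='%Y-%m').dt.month.to_numpy()

# Compare a column vector of months against the row vector 1..len(columns);
# the result is a (rows x columns) boolean grid, cast to 0/1
flags = (month[:, None] == np.arange(1, len(columns) + 1)).astype(int)

# Attach the indicator columns to the original frame
result = df.join(pd.DataFrame(flags, columns=columns, index=df.index))
```

As with the get_dummies version, a December row gets all zeros because `columns` has no "Dec" entry.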

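Edit

I'm only adding this because you specifically asked for it. Here is an example of how to loop through the DataFrame and the columns to update the values. To reiterate, this is not something I'd personally recommend, and it is very inefficient compared to the approach above: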

import datetime as dt

import pandas as pd

columns = ["Jan", "Feb", "Mar", "Apr", "May", "Jun", "Jul", "Aug", "Sep", "Oct", "Nov"]

df = pd.DataFrame({'TIME': ['2020-01', '2019-12', '2019-11', '2019-10', '2019-09']})

for c in columns:
    for i, t in df['TIME'].items():  # iteritems() was removed in pandas 2.0
        if dt.datetime.strptime(t, '%Y-%m').strftime('%b') == c:
            df.loc[i, c] = 1
        else:
            df.loc[i, c] = 0
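
If you do want a loop, a middle ground is to loop over the columns only and keep each assignment vectorized. The ValueError in the question comes from `1 if <Series> else 0`, which asks Python for a single truth value from a whole Series; comparing elementwise and casting to int avoids that. A rough sketch, assuming TIME holds 'YYYY-MM' strings and `columns` is in calendar order (the `month` variable is just illustrative):

```python
import pandas as pd

columns = ["Jan", "Feb", "Mar", "Apr", "May", "Jun", "Jul", "Aug", "Sep", "Oct", "Nov"]
df = pd.DataFrame({'TIME': ['2020-01', '2019-12', '2019-11', '2019-10', '2019-09']})

# Two-character month part of each 'YYYY-MM' string, e.g. '01'
month = df['TIME'].astype(str).str[5:7]

for number, name in enumerate(columns, start=1):
    # Elementwise comparison gives a boolean Series; astype(int) maps it to 1/0
    df[name] = (month == f"{number:02d}").astype(int)
```

This does one vectorized comparison per column instead of visiting every cell individually.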