How do I merge and repeat population cells in a data frame with an existing data frame?

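I have a data frame of drunk-driving incidents in the United States. It lists every incident by state and year, with multiple entries per year. I also have a separate data frame of each state's population, with one entry per year. How can I add the population column from the population data frame to the drunk-driving data frame, so that each year's value is repeated on every matching row? Right now the population data frame just gets appended below the drunk-driving data frame without being merged. I would appreciate any help; I have been stuck on this for days.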

I have tried several different approaches, using concat, merge, append, and so on:

df = pd.concat([df, df_pops], sort=False)
df = pd.merge(df, df_pops)

I need the final data frame to look like this:

STATE      MONTH YEAR FATALS DRUNK_DR POPULATION
Oregon     1     2017   1       1      4,146,600
Oregon     2     2017   0       1      4,146,600
Oregon     3     2017   1       2      4,146,600
...

This is what I get:

         STATE         MONTH    YEAR    FATALS  DRUNK_DR  POPULATION
5619    Oregon          1.0     2017    1.0      0.0        NaN
5620    Oregon          1.0     2017    1.0      0.0        NaN
5621    Oregon          1.0     2017    1.0      0.0        NaN
... ... ... ... ... ... ...
30      Oregon          NaN     2017    NaN      NaN       4,146,600
31      Oregon          NaN     2016    NaN      NaN       4,091,400
32      Oregon          NaN     2015    NaN      NaN       4,016,900

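merge is the right tool; you are just missing the right arguments. Join on the key columns STATE and YEAR with a left join, so that every incident row is kept and picks up the population for its state and year. I recommend reading the very helpful pandas documentation on merge.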

import pandas as pd

df1 = pd.DataFrame({'STATE': {0: 'Oregon', 1: 'Oregon', 2: 'Oregon'},
                    'MONTH': {0: 1.0, 1: 1.0, 2: 1.0},
                    'YEAR': {0: 2017, 1: 2017, 2: 2017},
                    'FATALS': {0: 1.0, 1: 1.0, 2: 1.0},
                    'DRUNK_DR': {0: 0.0, 1: 0.0, 2: 0.0}})

df2 = pd.DataFrame({'STATE': {0: 'Oregon', 1: 'Oregon', 2: 'Oregon'},
                    'YEAR': {0: 2017, 1: 2016, 2: 2015},
                    'POPULATION': {0: '4,146,600', 1: '4,091,400', 2: '4,016,900'}})

# Both frames use the same key names, so on= is enough
merged = df1.merge(df2, how='left', on=['STATE', 'YEAR'])

This gives:

    STATE  MONTH  YEAR  FATALS  DRUNK_DR POPULATION
0  Oregon    1.0  2017     1.0       0.0  4,146,600
1  Oregon    1.0  2017     1.0       0.0  4,146,600
2  Oregon    1.0  2017     1.0       0.0  4,146,600
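If it helps, here is a slightly extended sketch of the same left merge. It is not part of the original answer: the sample rows and the names incidents and pops are made up for illustration. It assumes the population frame really has exactly one row per state and year, and asks pandas to check that with validate="many_to_one". It also assumes you want POPULATION as a number rather than a string with commas.

```python
import pandas as pd

# Hypothetical sample data in the shape described in the question
incidents = pd.DataFrame({
    "STATE": ["Oregon", "Oregon", "Oregon"],
    "MONTH": [1, 2, 3],
    "YEAR": [2017, 2017, 2016],
    "FATALS": [1, 0, 1],
    "DRUNK_DR": [1, 1, 2],
})
pops = pd.DataFrame({
    "STATE": ["Oregon", "Oregon", "Oregon"],
    "YEAR": [2017, 2016, 2015],
    "POPULATION": ["4,146,600", "4,091,400", "4,016,900"],
})

# Strip the thousands separators so the column can be used in arithmetic
pops["POPULATION"] = (
    pops["POPULATION"].str.replace(",", "", regex=False).astype(int)
)

# Left join keeps every incident row; validate raises MergeError if
# pops has more than one row for the same STATE/YEAR pair
merged = incidents.merge(
    pops, on=["STATE", "YEAR"], how="left", validate="many_to_one"
)
print(merged)
```

If validate raises an error, the population frame has duplicate state/year rows, which would otherwise silently multiply the incident rows in the result.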