How to add a column into a CSV file

Question

I need to add a column to df_canada that holds the total number of immigrants to Canada for each AreaName:

df_canada = pd.read_csv('https://raw.githubusercontent.com/iikotelnikov/datasets/main/canada_immigration.csv', sep=';')
df_canada

First, I added an extra row that computes the total number of immigrants to Canada for each year.

# Here we add cell for sum of immigrants in Canada by year
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.animation as ani
import datetime as dt
%matplotlib inline 
df_canada.loc[197] = {'Type': 'Sum of immigrants in Canada by year'}
df_canada.loc[197, 10:] = df_canada[df_canada['Type'] != 'Sum of immigrants in Canada by year'].iloc[:, 10:].sum()
df_canada

Second, I need to compute the total number of immigrants to Canada by AreaName.

# Here we add cell for sum of immigrants in Canada by Area
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.animation as ani
import datetime as dt
%matplotlib inline 
df_canada.loc[198] = {'Type': 'Sum of immigrants in Canada by Area'}
df_canada.loc[198, 10:] = df_canada[df_canada['Type'] != 'Sum of immigrants in Canada by year'].iloc[:, 10:].sum()

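But this does not work for me.

I don't know what the next step is.

Could you tell me how to compute the total number of immigrants to Canada by area and create a column with that number?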

Answer 1

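Here is what you need to do:

Load the data into a DataFrame.
Because the yearly values are spread across separate columns instead of sitting in a single column, use melt to reshape those columns into rows.
Group by AreaName and year, and sum the values.
Write the output to a file.

Code: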
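
The reshaping step can be easier to follow on a tiny example. The sketch below uses a made-up three-row frame (the numbers and the reduced column set are assumptions for illustration, not the real dataset) to show what melt and the groupby do:

```python
import pandas as pd

# Toy wide-format frame with invented numbers: one column per year
wide = pd.DataFrame({
    "AreaName": ["Asia", "Asia", "Europe"],
    "1980": [10, 20, 5],
    "1981": [1, 2, 3],
})

# melt turns each year column into rows: one row per (AreaName, year) value
long = wide.melt(id_vars=["AreaName"], var_name="year", value_name="value")

# groupby + sum gives one total per area and year
totals = long.groupby(["AreaName", "year"])["value"].sum()
print(totals)
```

After melt, the frame has one row for every original cell in a year column, which is what makes grouping by year possible.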

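import pandas as pd
df_canada = pd.read_csv('https://raw.githubusercontent.com/iikotelnikov/datasets/main/canada_immigration.csv', sep=';')
# Reshape from wide to long: every year column becomes rows of (year, value)
agg_df_canada = df_canada.melt(id_vars=['Type', 'Coverage', 'OdName', 'AREA', 'AreaName', 'REG', 'RegName',
       'DEV', 'DevName'], var_name='year', value_name="value")
# Sum the values for each area and year
result_df = agg_df_canada.groupby(['AreaName', 'year'])['value'].sum()
# Write the result to a file (replace the path with your own location)
result_df.to_csv('Your_location/result_df.csv')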
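
If the goal is a new column in the original wide table rather than a separate summary, one possible approach is groupby with transform, which repeats the group total on every row of that group. This is only a sketch on a made-up frame: the numbers and the column names Total and AreaTotal are assumptions, and in the real data year_cols would be every column that is not an identifier column.

```python
import pandas as pd

# Toy frame with invented numbers; the real dataset has more id and year columns
wide = pd.DataFrame({
    "OdName": ["India", "China", "France"],
    "AreaName": ["Asia", "Asia", "Europe"],
    "1980": [10, 20, 5],
    "1981": [1, 2, 3],
})

year_cols = ["1980", "1981"]  # assumption: the columns holding yearly counts

# Total per row (per country) across all year columns
wide["Total"] = wide[year_cols].sum(axis=1)

# Total per area, repeated on every row that belongs to that area
wide["AreaTotal"] = wide.groupby("AreaName")["Total"].transform("sum")

# With no path, to_csv returns the CSV text; pass a file path to write a file
csv_text = wide.to_csv(index=False)
print(csv_text)
```

Because transform returns a result aligned with the original rows, the new column can be assigned directly without a merge.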

python

csv

dataframe

python-3.x