Override/move values from bottom rows to upper rows in specific columns (pandas)
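I have a dataframe like the one below, and I want to move the 'phone', 'spotify' and 'rent' values from the bottom half so that they overwrite the top half (essentially splitting the dataframe in two and placing the 'expense' values into the 'income' half).
Currently, January to December appears twice. I want only 12 rows, with a value in every cell (i.e. no cell has 0.0 as its value).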
                     loan      csn  salary  phone  spotify    rent
january   income   1200.0  13000.0  2000.0    0.0      0.0     0.0
february  income   1200.0  13000.0  2000.0    0.0      0.0     0.0
march     income   1200.0  13000.0  2000.0    0.0      0.0     0.0
april     income   1200.0  13000.0  2000.0    0.0      0.0     0.0
may       income   1200.0  13000.0  2000.0    0.0      0.0     0.0
june      income   1200.0  13000.0  2000.0    0.0      0.0     0.0
july      income   1200.0  13000.0  2000.0    0.0      0.0     0.0
august    income   1200.0  13000.0  2000.0    0.0      0.0     0.0
september income   1200.0  13000.0  2000.0    0.0      0.0     0.0
october   income   1200.0  13000.0  2000.0    0.0      0.0     0.0
november  income   1200.0  13000.0  2000.0    0.0      0.0     0.0
december  income   1200.0  13000.0  2000.0    0.0      0.0     0.0
january   expense     0.0      0.0     0.0  300.0     49.0  3500.0
february  expense     0.0      0.0     0.0  300.0    149.0  3500.0
march     expense     0.0      0.0     0.0  300.0     49.0  3500.0
april     expense     0.0      0.0     0.0  300.0     49.0  3500.0
may       expense     0.0      0.0     0.0  300.0     49.0  3500.0
june      expense     0.0      0.0     0.0  300.0     49.0  3500.0
july      expense     0.0      0.0     0.0  300.0     49.0  3500.0
august    expense     0.0      0.0     0.0  300.0     49.0  3500.0
september expense     0.0      0.0     0.0  300.0     49.0  3500.0
october   expense     0.0      0.0     0.0  300.0     49.0  3500.0
november  expense     0.0      0.0     0.0  300.0     49.0  3500.0
december  expense     0.0      0.0     0.0  300.0     49.0  3500.0
I'm getting the data from a .JSON file:
import pandas as pd

# data is the dict parsed from the .JSON file
df_all = pd.DataFrame.from_dict({(i, j): data[i][j]
                                 for i in data.keys()
                                 for j in data[i].keys()},
                                orient='index')
The .JSON file structure:
{
    "january": {
        "income": {
            "loan": 1200,
            "csn": 13000,
            "salary": 2000
        },
        "expense": {
            "phone": 300,
            "spotify": 49,
            "rent": 3500
        }
    ...
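For reference, here is a minimal sketch of how a structure like this ends up as two rows per month. The one-month `raw` string is a hypothetical stand-in for the real file, and `rows` is just an illustrative name. Keys that are missing from a row come out as NaN, so the 0.0 values in the frame above were presumably filled in afterwards.

```python
import json

import pandas as pd

# Hypothetical one-month stand-in for the real .JSON file.
raw = """
{
    "january": {
        "income": {"loan": 1200, "csn": 13000, "salary": 2000},
        "expense": {"phone": 300, "spotify": 49, "rent": 3500}
    }
}
"""
data = json.loads(raw)

# One entry per (month, kind) pair, so every month yields two rows.
rows = {(month, kind): values
        for month, kinds in data.items()
        for kind, values in kinds.items()}
df_all = pd.DataFrame.from_dict(rows, orient="index")

# The income row has no phone/spotify/rent values and the expense row
# has no loan/csn/salary values, so those cells are NaN.
print(df_all)
print(int(df_all.isna().sum().sum()), "missing cells")
```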
Desired output:
                     loan      csn  salary  phone  spotify    rent
january   income   1200.0  13000.0  2000.0  300.0     49.0  3500.0
february  income   1200.0  13000.0  2000.0  300.0     49.0  3500.0
march     income   1200.0  13000.0  2000.0  300.0     49.0  3500.0
april     income   1200.0  13000.0  2000.0  300.0     49.0  3500.0
may       income   1200.0  13000.0  2000.0  300.0     49.0  3500.0
june      income   1200.0  13000.0  2000.0  300.0     49.0  3500.0
july      income   1200.0  13000.0  2000.0  300.0     49.0  3500.0
august    income   1200.0  13000.0  2000.0  300.0     49.0  3500.0
september income   1200.0  13000.0  2000.0  300.0     49.0  3500.0
october   income   1200.0  13000.0  2000.0  300.0     49.0  3500.0
november  income   1200.0  13000.0  2000.0  300.0     49.0  3500.0
december  income   1200.0  13000.0  2000.0  300.0     49.0  3500.0
Here is one way to do it:
df = df.rename(index={'expense':'income'}, level=1).fillna(0).groupby(level=[0,1]).sum()
df
Output:
              loan    csn  Salary  phone  spotify  rent
Apr  income   1200  13000  2000.0    300       49  3500
Aug  income   1200  13000  2000.0    300       49  3500
Dec  income   1200  13000  2000.0    300       49  3500
Feb  income   1200  13000  2000.0    300       49  3500
Jan  income   1200  13000  2000.0    300       49  3500
Jul  income   1200  13000  2000.0    300       49  3500
Jun  income   1200  13000  2000.0    300       49  3500
Mar  income   1200  13000  2000.0    300       49  3500
May  income   1200  13000  2000.0    300       49  3500
Nov  income   1200  13000  2000.0    300       49  3500
Oct  income   1200  13000  2000.0    300       49  3500
Sep  income   1200  13000  2000.0    300       49  3500
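Details:
Rename index level 1 so that 'expense' becomes 'income', then groupby on both levels of the index. We could use first, but I don't consider that future-proof or safe, so I chose to fillna with zero and sum.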
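To make the steps concrete, here is a minimal sketch on a hypothetical two-month sample. The sample values and the names `rows` and `merged` are assumptions for illustration, not part of the question. Passing `sort=False` to `groupby` is an optional addition: it keeps the months in their original order, whereas the default sorts the group keys (which is why the output above is alphabetical).

```python
import pandas as pd

# Hypothetical two-month sample mirroring the question's layout:
# one 'income' row and one 'expense' row per month.
rows = {
    ("january", "income"): {"loan": 1200, "csn": 13000, "salary": 2000},
    ("february", "income"): {"loan": 1200, "csn": 13000, "salary": 2000},
    ("january", "expense"): {"phone": 300, "spotify": 49, "rent": 3500},
    ("february", "expense"): {"phone": 300, "spotify": 149, "rent": 3500},
}
df = pd.DataFrame(
    list(rows.values()),
    index=pd.MultiIndex.from_tuples(list(rows.keys())),
)

merged = (
    df.rename(index={"expense": "income"}, level=1)  # both rows of a month now share a key
      .fillna(0)                                     # missing cells contribute nothing to the sum
      .groupby(level=[0, 1], sort=False)             # sort=False keeps january before february
      .sum()
)
print(merged)
```

Because each column has a real value in only one of the two rows being combined, the sum simply carries that value over. With `first`, a frame that already holds 0.0 instead of NaN (as in the question) could return the 0.0 from whichever row comes first, since `first` only skips missing values.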