Is this the right way to move rows from one dataframe to another with a condition?

I want to move some rows from df1 to df2 when the calories in df1 and df2 are the same. Both dataframes have the same columns.

import numpy as np
import pandas as pd

np.random.seed(0)
df1 = pd.DataFrame(data = {
  "calories": [420, 80, 90, 10],
  "duration": [50, 4, 5, 3]
})
df2 = pd.DataFrame(data = {
  "calories": [420, 380, 390],
  "duration": [60, 40, 45]
})

print(df1)
print(df2)



   calories  duration
0       420        50
1        80         4
2        90         5
3        10         3
   calories  duration
0       420        60
1       380        40
2       390        45

rows = df1.loc[df1.calories == df2.calories, :]
df2 = df2.append(rows, ignore_index=True)
df1.drop(rows.index, inplace=True)

print('df1:')
print(df1)
print('df2:')
print(df2)

Then I get this error:

raise ValueError("Can only compare identically-labeled Series objects")
ValueError: Can only compare identically-labeled Series objects
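
For context, the error appears to come from the `==` comparison itself: the `==` operator between two Series requires identical index labels, and here df1 has four rows while df2 has three. Below is a minimal sketch of that behaviour, using two hypothetical Series `a` and `b` that hold the same values as the calories columns. `Series.eq` is shown only to illustrate an index-aligning comparison, not as the final fix.

```python
import pandas as pd

# Hypothetical stand-ins for df1.calories and df2.calories
a = pd.Series([420, 80, 90, 10])
b = pd.Series([420, 380, 390])

try:
    a == b  # index 0..3 vs 0..2: the labels differ, so pandas refuses to compare
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(error_message)

# Series.eq aligns on the index first; a label missing on one side compares as False
aligned = a.eq(b)
print(aligned.tolist())
```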

Edit: solution

import numpy as np
import pandas as pd

np.random.seed(0)
df1 = pd.DataFrame(data = {
  "mid": [420, 380, 90, 420],
  "A": [50, 4, 5, 3],
   "B": [420, 4, 5, 3]
})
df2 = pd.DataFrame(data = {
  "mid": [420, 380, 390],
  "A": [60, 40, 80],
   "B": [150, 24, 25]
})

print('df1:')
print(df1)
print('df2:')
print(df2)
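# Rows of df1 whose mid does not appear anywhere in df2 stay in df1
new_df1 = df1[~df1.mid.isin(df2.mid)]

# Rows of df1 whose mid appears in df2 are the ones to move
dup_df1 = df1[df1.mid.isin(df2.mid)]
# DataFrame.append was removed in pandas 2.0; pd.concat does the same job
new_df2 = pd.concat([df2, dup_df1], ignore_index=True)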

print('dup:')
print(dup_df1)
print('df1:')
print(new_df1)
print('df2:')
print(new_df2)
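
The same isin logic could be wrapped in a small helper so that the split and the append happen in one place. This is only a sketch: `move_matching_rows` and its parameter names are made up for illustration, and it uses `pd.concat` because `DataFrame.append` is no longer available in pandas 2.x. Keep in mind that isin matches a value anywhere in the other column, regardless of row position.

```python
import pandas as pd

def move_matching_rows(src, dst, key):
    """Move rows of src whose key value occurs anywhere in dst[key] over to dst.

    Hypothetical helper: returns new (src, dst) frames and leaves the inputs untouched.
    """
    mask = src[key].isin(dst[key])
    new_dst = pd.concat([dst, src[mask]], ignore_index=True)
    new_src = src[~mask].reset_index(drop=True)
    return new_src, new_dst

df1 = pd.DataFrame({"mid": [420, 380, 90, 420], "A": [50, 4, 5, 3], "B": [420, 4, 5, 3]})
df2 = pd.DataFrame({"mid": [420, 380, 390], "A": [60, 40, 80], "B": [150, 24, 25]})

new_df1, new_df2 = move_matching_rows(df1, df2, "mid")
print(new_df1)
print(new_df2)
```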

Since your dataframes are not the same length, you need to use merge to find rows with common calories values. You need to merge on the index and calories values; that can most easily be achieved by using reset_index to temporarily add an index column to merge on:

dftemp = df1.reset_index().merge(df2.reset_index(), on=['index', 'calories'], suffixes=['', '_y'])

Output:

   index  calories  duration  duration_y
0      0       420        50          60

You can now concat the calories and duration values from dftemp to df2 (again using reset_index to reset the index):

df2 = pd.concat([df2, dftemp[['calories', 'duration']]]).reset_index(drop=True)

Output (for your sample data):

   calories  duration
0       420        60
1       380        40
2       390        45
3       420        50

To remove the rows from df1 that were copied to df2, we merge on the index only and then keep just the rows where the two calories values differ:

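# how='left' keeps df1 rows whose index has no counterpart in df2 (calories_y is NaN there, so they pass the filter)
dftemp = df1.merge(df2, left_index=True, right_index=True, how='left', suffixes=['', '_y']).query('calories != calories_y')
df1 = dftemp[['calories', 'duration']].reset_index(drop=True)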

Output (for your sample data):

   calories  duration
0        80         4
1        90         5
2        10         3
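
Putting the steps together, here is a compact sketch of the same idea that uses a boolean mask instead of two merges. It assumes, as in the question's sample data, that both frames have a default integer index and that "same calories" means the same value at the same index position. The names `other` and `mask` are illustrative only.

```python
import pandas as pd

df1 = pd.DataFrame({"calories": [420, 80, 90, 10], "duration": [50, 4, 5, 3]})
df2 = pd.DataFrame({"calories": [420, 380, 390], "duration": [60, 40, 45]})

# Line df2.calories up with df1's index; positions missing from df2 become NaN
other = df2["calories"].reindex(df1.index)

# True where both frames hold the same calories value at the same index
mask = df1["calories"].eq(other)

# Append the matching rows to df2, then keep only the non-matching rows in df1
df2 = pd.concat([df2, df1[mask]], ignore_index=True)
df1 = df1[~mask].reset_index(drop=True)

print(df1)
print(df2)
```

For the sample data this should give the same df1 and df2 as the merge-based steps above.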