Self Join in Pandas: Merge / Join within the same table
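I have attached the data here:
Excel Data
I need to return a DataFrame containing a list of all employees (EmployeeID, first name, middle name, last name) together with their manager's first and last name. The columns in the output DataFrame should be: EmployeeID, FirstName, MiddleName, LastName, ManagerFirstName, ManagerLastName.
Hint: consider joining the table to itself, since managers are also employees.
Here is my current code, which gives me duplicate records: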
import numpy as np
import pandas as pd

# Creating data frame from Excel File. Enter the appropriate file path
df = pd.read_excel(Employees)
df_new = df[['EmployeeID', 'ManagerID', 'FirstName', 'MiddleName', 'LastName']].copy()
# Non-numeric or missing manager IDs become 0
df_new['ManagerID'] = pd.to_numeric(df_new['ManagerID'], errors='coerce').fillna(0)
# convert object to int64
df_new['ManagerID'] = df_new['ManagerID'].astype(np.int64)
result = df_new.merge(df_new, left_on='EmployeeID', right_on='ManagerID')
print(result.head())
Any help would be appreciated.
I think this works:
import pandas as pd

df = pd.DataFrame({"EmployeeID": [259, 278, 204, 78, 255],
                   "ManagerID": [278, 204, 78, 255, 259],
                   "FirstName": ["ben", "garret", "gabe", "reuben", "gordon"],
                   "MiddleName": ["T", "R", "B", "H", "L"],
                   "LastName": ["miller", "vargas", "mares", "dsa", "hee"]})
df['ManagerID'] = pd.to_numeric(df['ManagerID'], errors='coerce').fillna(0)
# Lookup table of managers: the same employees, with columns renamed for the join
df_ = df[["EmployeeID", "FirstName", "LastName"]]
df_ = df_.rename(columns={"EmployeeID": "ManagerID",
                          "FirstName": "ManagerFirstName",
                          "LastName": "ManagerLastName"})
# Left join keeps every employee, including those without a matching manager
out = pd.merge(df, df_, on=["ManagerID"], how="left")
out = out.drop(["ManagerID"], axis=1)
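As far as I can tell, the merge in the question goes wrong because the keys are the wrong way round: with left_on='EmployeeID' and right_on='ManagerID', each left-hand row is treated as a manager and gets repeated once per direct report, and because the default join is an inner join, anyone who manages nobody disappears. Below is a minimal sketch of the same self-join done in a single merge call with the keys reversed. The three-row sample data and the "_mgr" suffix are made up for illustration, and a ManagerID of 0 is assumed to mean "no manager", mirroring the fillna(0) in the question.

```python
import pandas as pd

# Hypothetical sample data: employee 1 manages employees 2 and 3.
# ManagerID 0 is assumed to mean "no manager" (as produced by fillna(0)).
employees = pd.DataFrame({
    "EmployeeID": [1, 2, 3],
    "ManagerID": [0, 1, 1],
    "FirstName": ["ann", "bob", "cy"],
    "MiddleName": ["A", "B", "C"],
    "LastName": ["lee", "ray", "fox"],
})

# Keys as in the question: the left row acts as the manager, so it is
# repeated for every report and non-managers are dropped (inner join).
as_asked = employees.merge(employees, left_on="EmployeeID", right_on="ManagerID")

# Keys reversed: look up each employee's ManagerID among the EmployeeIDs.
managers = employees[["EmployeeID", "FirstName", "LastName"]]
result = employees.merge(
    managers,
    how="left",                # keep employees that have no manager
    left_on="ManagerID",
    right_on="EmployeeID",
    suffixes=("", "_mgr"),     # left names unchanged, right names get "_mgr"
)
result = result.rename(columns={"FirstName_mgr": "ManagerFirstName",
                                "LastName_mgr": "ManagerLastName"})
result = result[["EmployeeID", "FirstName", "MiddleName", "LastName",
                 "ManagerFirstName", "ManagerLastName"]]
print(result)
```

The left join keeps one row per employee as long as EmployeeID is unique. If the real data contains duplicate EmployeeID values, the result would still have repeated rows, so it may be worth checking that first.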