How to left merge table_one with table_two based on the latest month_year column?

Question

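table_one has distinct IDs. I want to left merge table_one with table_two, keeping for each ID only the table_two row with the latest date in the month_year column. I tried the following, but it did not work.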

import datetime
import pandas as pd

# Build a "month_year" key for the previous calendar month, in the same "9_2018" style.
today = datetime.date.today()
first = today.replace(day=1)
lastMonth = first - datetime.timedelta(days=1)
latest_moyr = str(lastMonth.month) + '_' + str(lastMonth.year)

# Keep only the table_two rows for that month, then left merge on ID.
final_df = pd.merge(left=table_one, right=table_two.loc[table_two['month_year'] == latest_moyr], left_on='ID', right_on='ID', how='left')

table_one

ID	Yrs
1001	10
1002	2
1003	5

table_two

ID	sum3	month_year
1001	24.50	2_2013
1002	2.05	4_2013
1003	90.36	5_2013
1001	100	8_2013
1002	122	12_2014
1001	245	9_2018
1003	10.50	7_2011
1002	212	4_2018
1005	5.01	3_2014

I want the result to look like this:

ID	Yrs	sum3	month_year
1001	10	245	9_2018
1002	2	212	4_2018
1003	5	90.36	5_2013

Answer 1

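You have to keep only the latest date for each ID before merging. To do that, convert your month_year column to a proper datetime64, then sort by date and drop the duplicates for each ID, keeping the last one.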

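# df1 is table_one and df2 is table_two from the question.
df3 = df1.merge(
        # Parse month_year into a temporary datetime column, sort by it,
        # keep the last (latest) row per ID, then drop the helper column.
        df2.assign(dt=pd.to_datetime(df2['month_year'], format='%m_%Y'))
           .sort_values('dt').drop_duplicates('ID', keep='last').drop(columns='dt'),
        on='ID', how='left')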

Output:

>>> df3
     ID  Yrs    sum3 month_year
0  1001   10  245.00     9_2018
1  1002    2  212.00     4_2018
2  1003    5   90.36     5_2013
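
For anyone who wants to run this end to end, here is a minimal self-contained sketch using the sample data from the question. It is a variant, not the exact code above: instead of sorting and dropping duplicates, it picks the latest row per ID with groupby and idxmax. The names parsed, latest_idx and latest_rows are only illustrative.

```python
import pandas as pd

# Sample data copied from the question.
table_one = pd.DataFrame({"ID": [1001, 1002, 1003], "Yrs": [10, 2, 5]})
table_two = pd.DataFrame({
    "ID": [1001, 1002, 1003, 1001, 1002, 1001, 1003, 1002, 1005],
    "sum3": [24.50, 2.05, 90.36, 100, 122, 245, 10.50, 212, 5.01],
    "month_year": ["2_2013", "4_2013", "5_2013", "8_2013", "12_2014",
                   "9_2018", "7_2011", "4_2018", "3_2014"],
})

# Parse the "month_year" strings into timestamps so they compare chronologically.
parsed = pd.to_datetime(table_two["month_year"], format="%m_%Y")

# For each ID, find the index label of the row with the latest date.
latest_idx = parsed.groupby(table_two["ID"]).idxmax()
latest_rows = table_two.loc[latest_idx]

# Left merge: every ID in table_one is kept; ID 1005 exists only in table_two and is dropped.
final_df = table_one.merge(latest_rows, on="ID", how="left")
print(final_df)
```

As far as I can tell, the attempt in the question came back empty on the right-hand side because it filters table_two on the previous calendar month relative to today, and none of the sample rows fall in that month, so the left merge has nothing to match.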
