Remove rows that already exist based on two columns

Question

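I want to delete every row whose customerId and fromDate values both match those of an earlier row. For example, rows 1 and 4 are identical, so row 4 should be removed. How can I find the identical rows?

DataFrame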

   customerId    fromDate
0           1  2021-02-22
1           1  2021-03-18
2           1  2021-03-22
3           1        None
4           1  2021-03-18
5           3  2021-02-22
6           3  2021-02-22

Code

import pandas as pd


d = {'customerId': [1, 1, 1, 1, 1, 3, 3],
     'fromDate': ['2021-02-22', '2021-03-18', '2021-03-22', None, '2021-03-18', '2021-02-22', '2021-02-22']
    }
df = pd.DataFrame(data=d)
print(df)

Desired output

   customerId    fromDate
0           1  2021-02-22
1           1  2021-03-18
2           1  2021-03-22
3           1        None
5           3  2021-02-22

# Removed
# 4           1  2021-03-18
# 6           3  2021-02-22

Answer 1

IIUC, you can use drop_duplicates to remove the duplicates:

df.drop_duplicates(inplace=True)
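Because the question is about two specific columns, it may be safer to name them explicitly, in case the real DataFrame has more columns than the example. The following is a minimal sketch using the question's sample data; the variable name deduped is only illustrative, and keep="first" (the default) is spelled out for clarity.

```python
import pandas as pd

# Sample data from the question
df = pd.DataFrame({
    "customerId": [1, 1, 1, 1, 1, 3, 3],
    "fromDate": ["2021-02-22", "2021-03-18", "2021-03-22", None,
                 "2021-03-18", "2021-02-22", "2021-02-22"],
})

# Compare only customerId and fromDate, and keep the first occurrence
# of each (customerId, fromDate) pair. The original index is preserved.
deduped = df.drop_duplicates(subset=["customerId", "fromDate"], keep="first")
print(deduped)
```

With the sample data this should leave the rows with index 0, 1, 2, 3 and 5. If a fresh 0..n-1 index is wanted, reset_index(drop=True) can be chained afterwards.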

Answer 2

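You can use:

df = df.drop_duplicates()

This removes all duplicated rows and keeps the first occurrence of each. Note that drop_duplicates returns a new DataFrame unless inplace=True is passed, so the result has to be assigned. See https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.drop_duplicates.html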
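To answer the "how can I find the identical rows" part of the question directly, DataFrame.duplicated returns a boolean mask that marks the repeats. This is a small sketch on the question's sample data; the names is_repeat, repeats and remaining are illustrative only.

```python
import pandas as pd

# Sample data from the question
df = pd.DataFrame({
    "customerId": [1, 1, 1, 1, 1, 3, 3],
    "fromDate": ["2021-02-22", "2021-03-18", "2021-03-22", None,
                 "2021-03-18", "2021-02-22", "2021-02-22"],
})

# True for every row whose (customerId, fromDate) pair already appeared
# in an earlier row; the first occurrence is marked False.
is_repeat = df.duplicated(subset=["customerId", "fromDate"], keep="first")

repeats = df[is_repeat]      # the rows that would be removed
remaining = df[~is_repeat]   # same rows drop_duplicates would keep

print(repeats)
print(remaining)
```

Inspecting repeats before dropping anything can be a useful check that only the intended rows (index 4 and 6 in the sample) are affected.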

python

dataframe

pandas