How to rank rows by id in pandas (Python)
I have a dataframe like this:
id points1 points2
1 44 53
1 76 34
1 63 66
2 23 34
2 44 56
I want output like this:
id points1 points2 points1_rank points2_rank
1 44 53 3 2
1 76 34 1 3
1 63 66 2 1
2 23 34 2 2
2 44 56 1 1
Basically, I want to groupby('id') and rank the values in each column among the rows that share the same id.
I tried this:
features = ["points1","points2"]
df = pd.merge(df, df.groupby('id')[features].rank().reset_index(), suffixes=["", "_rank"], how='left', on=['id'])
But I get KeyError: 'id'.
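A likely cause of the KeyError, sketched with the sample data from the question: the frame returned by rank() keeps the original row index and only the ranked columns, so after reset_index() there is an 'index' column but no 'id' column to merge on. The variable name ranked is only for illustration.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "id": [1, 1, 1, 2, 2],
    "points1": [44, 76, 63, 23, 44],
    "points2": [53, 34, 66, 34, 56],
})
features = ["points1", "points2"]

# rank() is a transform: same row index as df, only the selected columns.
ranked = df.groupby("id")[features].rank()
print(ranked.columns.tolist())

# reset_index() turns the unnamed row index into a column called "index",
# so the right-hand frame still has no "id" column for merge(on=["id"]).
print(ranked.reset_index().columns.tolist())
```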
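Use join instead of merge, drop the reset_index call, and rename the new columns with add_suffix. The ranked result keeps the original row index, so join can align on it directly: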
features = ["points1","points2"]
df = df.join(df.groupby('id')[features].rank(ascending=False).add_suffix('_rank').astype(int))
print(df)
id points1 points2 points1_rank points2_rank
0 1 44 53 3 2
1 1 76 34 1 3
2 1 63 66 2 1
3 2 23 34 2 2
4 2 44 56 1 1
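For anyone who wants to run this end to end, here is a self-contained sketch of the same approach with the question's sample data. The names ranks and result are only for illustration.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "id": [1, 1, 1, 2, 2],
    "points1": [44, 76, 63, 23, 44],
    "points2": [53, 34, 66, 34, 56],
})
features = ["points1", "points2"]

# Rank within each id, highest value first, then label the new columns.
ranks = (
    df.groupby("id")[features]
      .rank(ascending=False)
      .astype(int)
      .add_suffix("_rank")
)

# join aligns on the row index, which rank() leaves unchanged.
result = df.join(ranks)
print(result)
```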
You need to pass ascending=False to rank:
df.join(df.groupby('id')[['points1', 'points2']].rank(ascending=False).astype(int).add_suffix('_rank'))
+---+----+---------+---------+--------------+--------------+
| | id | points1 | points2 | points1_rank | points2_rank |
+---+----+---------+---------+--------------+--------------+
| 0 | 1 | 44 | 53 | 3 | 2 |
| 1 | 1 | 76 | 34 | 1 | 3 |
| 2 | 1 | 63 | 66 | 2 | 1 |
| 3 | 2 | 23 | 34 | 2 | 2 |
| 4 | 2 | 44 | 56 | 1 | 1 |
+---+----+---------+---------+--------------+--------------+
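One caveat worth checking on real data: the sample has no ties within an id. By default rank() averages tied positions, so astype(int) can truncate a fractional rank such as 2.5. The sketch below uses made-up scores (not from the question) to compare the default with method="min" and method="dense", which always give whole numbers. Which one is right depends on how you want ties handled.

```python
import pandas as pd

# Hypothetical data with a four-way tie, for illustration only.
scores = pd.DataFrame({
    "id": [1, 1, 1, 1, 1],
    "points1": [50, 50, 50, 50, 30],
})
grouped = scores.groupby("id")["points1"]

# Default method="average": the four tied rows share rank (1+2+3+4)/4 = 2.5.
rank_average = grouped.rank(ascending=False)
# method="min": tied rows all get the lowest rank of the tied block.
rank_min = grouped.rank(ascending=False, method="min").astype(int)
# method="dense": like "min", but the next rank follows without a gap.
rank_dense = grouped.rank(ascending=False, method="dense").astype(int)

result = scores.assign(
    rank_average=rank_average,
    rank_min=rank_min,
    rank_dense=rank_dense,
)
print(result)
```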