Replacing numbers based on their order in a pandas column with a specific range

Question

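I need to renumber a pandas integer column according to the sort order of its values. The original column has 50 unique values, so the result must fall in the range 1 to 50. The column contains duplicates, and they must be preserved.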

For example, if I have:

geodf['Areas']
0     38
1     44
2     68
3     63
4     63
5     63
6     63
7     44
8     44
9      7
10    63
11    63
12    63
13    39
14    44

How can I turn it into this across the whole column? (It has more than 200 rows.)

geodf['Areas']
0     2
1     4
2     6
3     5
4     5
5     5
6     5
7     4
8     4
9     1
10    5
11    5
12    5
13    3
14    4

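As you can see, each replacement value depends on where the original number falls in sorted order relative to the other numbers: the smallest value becomes 1, the next distinct value becomes 2, and so on. Is there a way to achieve this?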

Answer 1

You can use the .rank() method with method='dense', as follows:

geodf['Areas'].rank(method='dense').astype(int)

Result:

0     2
1     4
2     6
3     5
4     5
5     5
6     5
7     4
8     4
9     1
10    5
11    5
12    5
13    3
14    4
Name: Areas, dtype: int32
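
For reference, here is a minimal self-contained sketch that reproduces the result above. It assumes a DataFrame named geodf holding only the 15 sample values from the question; the real data has more rows and 50 unique values.

```python
import pandas as pd

# Sample data copied from the question; only the 'Areas' column is reproduced.
geodf = pd.DataFrame(
    {"Areas": [38, 44, 68, 63, 63, 63, 63, 44, 44, 7, 63, 63, 63, 39, 44]}
)

# Dense ranking: equal values share one rank, and ranks increase by exactly 1
# between consecutive distinct values, so there are no gaps.
dense = geodf["Areas"].rank(method="dense").astype(int)

print(dense.tolist())
# The largest rank equals the number of unique values in the column.
print(dense.max(), geodf["Areas"].nunique())
```

With method='dense' the largest rank always equals the number of unique values, which is why a column with 50 unique values ends up in the range 1 to 50.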

If you want to apply the same logic to every column in geodf, you could try something like this:

for col in geodf:    # or replace 'geodf' in this line by a list of the selected columns
    geodf[col] = geodf[col].rank(method='dense').astype(int)
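
As a possible alternative (not part of the original answer), pd.factorize with sort=True should give the same relabeling as integer codes, without going through the float ranks that .rank() produces. The sketch below assumes the same sample data as the question and no missing values; the names relabeled and via_rank are only illustrative.

```python
import pandas as pd

# Sample data copied from the question.
geodf = pd.DataFrame(
    {"Areas": [38, 44, 68, 63, 63, 63, 63, 44, 44, 7, 63, 63, 63, 39, 44]}
)

# factorize(sort=True) assigns 0-based integer codes that follow the order
# of the sorted unique values.
codes, uniques = pd.factorize(geodf["Areas"], sort=True)

# Shift by one so the labels run from 1 to the number of unique values.
relabeled = pd.Series(codes + 1, index=geodf.index, name="Areas")

# Cross-check against the dense-rank approach on this data.
via_rank = geodf["Areas"].rank(method="dense").astype(int)
print(relabeled.equals(via_rank.astype(relabeled.dtype)))
```

Either approach keeps duplicates intact; which one to use is mostly a matter of taste when the column has no missing values.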
