Pandas: Efficient way of creating a new column based on another column (many-to-few mapping)
For example:
df = pd.DataFrame( {"A": [7001, 8001, 9001]} )
equiv = {1:[7001, 8001], 2: [9001]}
df["B"] = df["A"].map(equiv.get)
to end up with:
a, b
7001, 1
8001, 1
9001, 2
I was thinking of something like:
df = pd.DataFrame( {"A": [7001, 8001, 9001]} )
equiv = {1:[7001, 8001], 2: 9001}
df["B"] = df["A"].map(equiv.get)
I do not want to use equiv = {7001:1, 8001:1, 9001:2}, because in my actual dataset I will have many string values mapping to 1 and 2.
You can build an inverted dictionary (each value pointing back to its key) and then map the column through it:
import pandas as pd

df = pd.DataFrame({"A": [7001, 8001, 9001]})
print(df)
      A
0  7001
1  8001
2  9001

# Each key maps to the list of values that belong to it.
equiv = {1: [7001, 8001], 2: [9001]}
# Invert it so that every value points back to its key.
d = {v: k for k, vs in equiv.items() for v in vs}
print(d)
{7001: 1, 8001: 1, 9001: 2}

df["B"] = df["A"].map(d)
print(df)
      A  B
0  7001  1
1  8001  1
2  9001  2
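
The inversion above assumes every value in equiv is a list. A minimal sketch of a more forgiving variant follows, in case some groups are written as a bare value (as with 2: 9001 in the question) or the values are strings. The helper name invert_mapping is hypothetical and not part of pandas; the extra row 1234 is only there to illustrate what happens to a value that belongs to no group.

```python
import pandas as pd


def invert_mapping(groups):
    """Turn {label: [values]} into {value: label} (hypothetical helper)."""
    inverted = {}
    for label, values in groups.items():
        # A bare scalar or a string is treated as a one-element group,
        # so a string is never split into its characters.
        if not isinstance(values, (list, tuple, set)):
            values = [values]
        for value in values:
            inverted[value] = label
    return inverted


equiv = {1: [7001, 8001], 2: 9001}  # note the bare 9001
lookup = invert_mapping(equiv)

df = pd.DataFrame({"A": [7001, 8001, 9001, 1234]})
# Values missing from the lookup (here 1234) become NaN in column B.
df["B"] = df["A"].map(lookup)
print(df)
```

Because one value is unmapped, column B holds NaN for that row and pandas stores the column as floats. Building the lookup once and passing the dictionary to map avoids searching the lists for every row.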