DataFrame column value mapping

I'm new to Python and pandas and want to use as much of pandas' built-in functionality as possible.

import numpy as np
import pandas as pd

data = {'source': ['Iowa', 'New York', 'San Jose', 'Houston', 'Houston'],
        'target': ['New York', 'San Jose', 'Iowa', 'San Jose', 'Arizona']}
print(np.arange(10).reshape((10, 1)))
# Each row is [Source, Target, value]; this reassignment replaces the dict above
data = [['Iowa', 'New York', 1], ['New York', 'San Jose', 1], ['San Jose', 'Iowa', 1], ['Houston', 'San Jose', 1], ['Houston', 'Arizona', 1]]
dataDf = pd.DataFrame(data, columns=['Source', 'Target', 'value'])
print(dataDf)

# I created a list of the unique names
nameIndex = {'name': ['Iowa', 'New York', 'San Jose', 'Houston', 'Arizona'],
             'index': [0, 1, 2, 3, 4]}

# Now I want to replace the Source and Target values (names) with the matching index from nameIndex (0, 1, 2, 3, 4).
# I could use a for loop but want to avoid it, so I am not including loop-based solutions here.

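Here I want to replace the names in the 'source' and 'target' columns with their indices. How can I do this with DataFrame functionality? My expected data is: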

data = { 'source': ['0','1','2','3','3' ],
         'target' :['1', '2', '0', '2', '4']
        }

You can convert nameIndex into a name-to-index dictionary and use .map:

nameIndex = {k: v for k, v in zip(nameIndex["name"], nameIndex["index"])}
dataDf["Source"] = dataDf["Source"].map(nameIndex)
dataDf["Target"] = dataDf["Target"].map(nameIndex)
print(dataDf)

Prints:

   Source  Target  value
0       0       1      1
1       1       2      1
2       2       0      1
3       3       2      1
4       3       4      1
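
As a possible variation on the approach above, here is a minimal sketch that maps both columns in one step and checks for names missing from the lookup. The names rows, name_to_index, cols, mapped, and unmapped are illustrative only and do not come from the original post; the data is copied from the question.

```python
import pandas as pd

# Data copied from the question; `rows` is an illustrative name.
rows = [['Iowa', 'New York', 1], ['New York', 'San Jose', 1],
        ['San Jose', 'Iowa', 1], ['Houston', 'San Jose', 1],
        ['Houston', 'Arizona', 1]]
dataDf = pd.DataFrame(rows, columns=['Source', 'Target', 'value'])

nameIndex = {'name': ['Iowa', 'New York', 'San Jose', 'Houston', 'Arizona'],
             'index': [0, 1, 2, 3, 4]}

# Build the name -> index lookup once.
name_to_index = dict(zip(nameIndex['name'], nameIndex['index']))

# Map both columns in one step; apply passes each column to the lambda as a Series.
cols = ['Source', 'Target']
mapped = dataDf.copy()
mapped[cols] = dataDf[cols].apply(lambda s: s.map(name_to_index))

# Series.map yields NaN for names absent from the lookup, so count them
# before relying on the result.
unmapped = int(mapped[cols].isna().sum().sum())
print(mapped)
print("unmapped cells:", unmapped)
```

If the indices are needed as strings, as in the expected data in the question, calling .astype(str) on the mapped columns should convert them.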