Dataframe column value mapping
I am new to Python and pandas and want to use pandas' built-in functionality as much as possible.
import numpy as np
import pandas as pd

data = {'source': ['Iowa', 'New York', 'San Jose', 'Houston', 'Houston'],
        'target': ['New York', 'San Jose', 'Iowa', 'San Jose', 'Arizona']}
print(np.arange(10).reshape((10, 1)))
data = [['Iowa', 'New York', 1], ['New York', 'San Jose', 1], ['San Jose', 'Iowa', 1], ['Houston', 'San Jose', 1], ['Houston', 'Arizona', 1]]
dataDf = pd.DataFrame(data, columns=['Source', 'Target', 'value'])
print(dataDf)
# I created a list of unique names
nameIndex = {'name': ['Iowa', 'New York', 'San Jose', 'Houston', 'Arizona'],
             'index': [0, 1, 2, 3, 4]}
# Now I want to replace the Source and Target values (names) with the matching index from nameIndex (0, 1, 2, 3, 4)
# I could use a for loop but want to avoid it, so I am not showing loop solutions here
Here I want to replace the names in the 'Source' and 'Target' columns with their index. How can I do this with DataFrame functionality?
My expected data is:
data = { 'source': ['0','1','2','3','3' ],
'target' :['1', '2', '0', '2', '4']
}
You can convert nameIndex into a name-to-index dictionary and use .map:
nameIndex = {k: v for k, v in zip(nameIndex["name"], nameIndex["index"])}
dataDf["Source"] = dataDf["Source"].map(nameIndex)
dataDf["Target"] = dataDf["Target"].map(nameIndex)
print(dataDf)
Prints:
   Source  Target  value
0       0       1      1
1       1       2      1
2       2       0      1
3       3       2      1
4       3       4      1
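
Note that .map gives integers, while the expected data in the question uses strings. Here is a minimal sketch of one way to map both columns in a single step and convert the result to strings. The names lookup, cols, mapped and result are illustrative and not from the original post, and the missing-name check is an added assumption about how you might want unknown names handled (by default .map would silently turn them into NaN).

```python
import pandas as pd

data = [
    ["Iowa", "New York", 1],
    ["New York", "San Jose", 1],
    ["San Jose", "Iowa", 1],
    ["Houston", "San Jose", 1],
    ["Houston", "Arizona", 1],
]
dataDf = pd.DataFrame(data, columns=["Source", "Target", "value"])

nameIndex = {
    "name": ["Iowa", "New York", "San Jose", "Houston", "Arizona"],
    "index": [0, 1, 2, 3, 4],
}

# Build a name -> index lookup as a Series (an alternative to a plain dict).
lookup = pd.Series(nameIndex["index"], index=nameIndex["name"])

# Columns to translate; apply runs Series.map once per column, no explicit loop.
cols = ["Source", "Target"]
mapped = dataDf[cols].apply(lambda col: col.map(lookup))

# Assumption: a name missing from the lookup should be an error, not a silent NaN.
if mapped.isna().any().any():
    raise ValueError("some names are missing from nameIndex")

# Work on a copy so the original frame keeps its names.
result = dataDf.copy()
result[cols] = mapped.astype(str)  # strings, as in the expected data
print(result)
```

If you would rather keep integers, drop the .astype(str) call.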