Pandas: Keep column values based on the values listed in a dictionary and blank out the other columns
I have a DataFrame:
df = pd.DataFrame([["A","X",98,56,1,2,3,4], ["B","Z",79,54,36,3,4,8], ["C","Y",98,56,2,5,6,7],["A","Y",79,54,36,12,13,24], ["B","X",98,56,3,6,7,8], ["C","Z",48,51,85,5,6,5]], columns=["id","key","c1","c2","c3","c4","C5","C6"])
I have a dictionary:
dic = {"X":['c1','c3'],"Y":['c2','c4'],"Z":['c5','c6']}
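Based on the key column of df, I want to look up the columns that dic lists for that key, keep the row's values only in those columns, and blank out the row's values in all other columns.
For example, for rows where key is X, keep the values in c1 and c3 (the columns listed for X in the dictionary) and leave the other columns blank.
Expected output: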
df_out = pd.DataFrame([["A","X",98,"",1,"","",""], ["B","Z","","","","",4,8], ["C","Y","",56,"",5,"",""],["A","Y","",54,"",12,"",""], ["B","X",98,"",3,"","",""], ["C","Z","","","","",6,5]], columns=["id","key","c1","c2","c3","c4","C5","C6"])
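How can I do this?
Use Index.difference to get the columns that are not listed for each key, and set them to empty strings with DataFrame.loc. Note that the dictionary values must match the column names exactly, so 'c5' and 'c6' are written as 'C5' and 'C6' here: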
dic = {"X":['c1','c3'],"Y":['c2','c4'],"Z":['C5','C6']}
for k, v in dic.items():
    # blank every column except the ones listed for this key (and id/key themselves)
    df.loc[df.key == k, df.columns.difference(v + ['id', 'key'])] = ''
print(df)
   id  key  c1  c2  c3  c4  C5  C6
0   A    X  98       1
1   B    Z                   4   8
2   C    Y      56       5
3   A    Y      54      12
4   B    X  98       3
5   C    Z                   6   5
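For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same Index.difference / DataFrame.loc approach. The helper name blank_unlisted and its parameters are hypothetical, introduced only for illustration. The cast to object dtype is an assumption added here: recent pandas versions may warn or raise when an empty string is written into an integer column, and casting first avoids that. The helper also works on a copy, so the original df is left unchanged.

```python
import pandas as pd

df = pd.DataFrame(
    [["A", "X", 98, 56, 1, 2, 3, 4], ["B", "Z", 79, 54, 36, 3, 4, 8],
     ["C", "Y", 98, 56, 2, 5, 6, 7], ["A", "Y", 79, 54, 36, 12, 13, 24],
     ["B", "X", 98, 56, 3, 6, 7, 8], ["C", "Z", 48, 51, 85, 5, 6, 5]],
    columns=["id", "key", "c1", "c2", "c3", "c4", "C5", "C6"],
)
# Column names are case-sensitive, so "C5"/"C6" must match the frame exactly.
dic = {"X": ["c1", "c3"], "Y": ["c2", "c4"], "Z": ["C5", "C6"]}


def blank_unlisted(frame, keep_map, key_col="key", always_keep=("id", "key")):
    """Hypothetical helper: blank every column not listed for a row's key."""
    # Work on a copy, cast to object so '' can sit next to the integers
    # (assumption: mixed int/str columns are acceptable for the result).
    out = frame.copy().astype(object)
    for k, cols in keep_map.items():
        # Columns that are neither listed for this key nor always kept.
        to_blank = out.columns.difference(list(cols) + list(always_keep))
        out.loc[out[key_col] == k, to_blank] = ""
    return out


result = blank_unlisted(df, dic)
print(result)
```

Rows whose key does not appear in the dictionary are left as they are, because no mask ever selects them.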