Combine 2 columns containing comma-separated strings into 1 column in pandas
I have a dataframe
df = pd.DataFrame([["A","a$b,c$d,k$m","h,y,a"], ["B","n$e,d$w,t$y","t,r,s"]], columns=["id","c1","c2"])
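I want to combine each comma-separated element in column c1 with the element at the same position in column c2, joining each pair with an asterisk (*).
Expected output: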
df_out = pd.DataFrame([["A","a$b*h,c$d*y,k$m*a"], ["B","n$e*t,d$w*r,t$y*s"]], columns=["id","c3"])
How can I do this?
You can try the code below.
import pandas as pd
df = pd.DataFrame([["A","a$b,c$d,k$m","h,y,a"], ["B","n$e,d$w,t$y","t,r,s"]], columns=["id","c1","c2"])
def combine_list(a, b):
    # Pair the items positionally and join each pair with '*'
    return ','.join([i + '*' + j for i, j in zip(a, b)])
# Split both columns on commas, then combine them row by row
df['c3'] = df.apply(lambda x: combine_list(x['c1'].split(','), x['c2'].split(',')), axis=1)
df_out = df[["id", "c3"]]
Hope this solves your problem!
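To see what happens for a single row, the split, zip, and join steps can be tried on plain strings without pandas. This is only a minimal sketch: the helper name combine_pairs and its sep and joiner parameters are hypothetical and do not come from the answer above.

```python
# Hypothetical helper: the same per-row logic as combine_list,
# but taking the raw strings instead of pre-split lists.
def combine_pairs(left, right, sep=",", joiner="*"):
    left_items = left.split(sep)    # "a$b,c$d,k$m" -> ["a$b", "c$d", "k$m"]
    right_items = right.split(sep)  # "h,y,a"       -> ["h", "y", "a"]
    # Pair items by position, glue each pair with the joiner,
    # then rebuild a single sep-delimited string.
    return sep.join(a + joiner + b for a, b in zip(left_items, right_items))

row_a = combine_pairs("a$b,c$d,k$m", "h,y,a")
print(row_a)  # a$b*h,c$d*y,k$m*a
```

The result for the first sample row matches the expected c3 value in the question.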
Use a nested list comprehension with DataFrame.pop to extract the columns, zip to pair the values, an f-string to add the *, and finally join:
df['c3'] = [','.join(f'{i}*{j}' for i, j in zip(x.split(','), y.split(',')))
            for x, y in zip(df.pop('c1'), df.pop('c2'))]
print(df)
  id                 c3
0  A  a$b*h,c$d*y,k$m*a
1  B  n$e*t,d$w*r,t$y*s
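One thing to keep in mind with the zip-based pairing is that zip stops at the shorter sequence, so if a row ever had more items in one column than the other, the surplus items would be dropped without any error. The sample data always has matching counts, so this may never matter here. If it could, a length check is one option. The sketch below is pure Python, and the function name combine_pairs_strict is hypothetical.

```python
# Hypothetical strict variant: refuse rows whose item counts differ
# instead of letting zip silently truncate to the shorter list.
def combine_pairs_strict(left, right, sep=",", joiner="*"):
    left_items = left.split(sep)
    right_items = right.split(sep)
    if len(left_items) != len(right_items):
        raise ValueError(
            f"item counts differ: {len(left_items)} vs {len(right_items)}"
        )
    return sep.join(f"{a}{joiner}{b}" for a, b in zip(left_items, right_items))

print(combine_pairs_strict("n$e,d$w,t$y", "t,r,s"))  # n$e*t,d$w*r,t$y*s

try:
    combine_pairs_strict("a$b,c$d", "h,y,a")  # 2 items vs 3 items
except ValueError as exc:
    print(exc)  # item counts differ: 2 vs 3
```

Such a function could be called once per row in place of the inner join expression, at the cost of raising on malformed rows rather than producing a partial result.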