Combine two columns of comma-separated strings into one column in pandas

I have a DataFrame:

df = pd.DataFrame([["A","a$b,c$d,k$m","h,y,a"], ["B","n$e,d$w,t$y","t,r,s"]], columns=["id","c1","c2"])

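I want to combine each comma-separated element in column c1 with the element at the same position in column c2, joining each pair with an asterisk (*).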

Expected output:

df_out = pd.DataFrame([["A","a$b*h,c$d*y,k$m*a"], ["B","n$e*t,d$w*r,t$y*s"]], columns=["id","c3"])

How can I do this?

You can try the code below.

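import pandas as pd

df = pd.DataFrame([["A","a$b,c$d,k$m","h,y,a"], ["B","n$e,d$w,t$y","t,r,s"]], columns=["id","c1","c2"])

def combine_list(a, b):
    # Pair items by position, join each pair with '*', then join the pairs with ','.
    return ','.join(i + '*' + j for i, j in zip(a, b))

# Split both columns on ',' row by row and combine the resulting lists.
df['c3'] = df.apply(lambda x: combine_list(x['c1'].split(','), x['c2'].split(',')), axis=1)
df_out = df[["id", "c3"]]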
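
One caveat worth checking against your real data: zip stops at the shorter of its two inputs, so if a row ever has a different number of items in c1 and c2, the extra items are dropped without any error. The question's sample rows all have matching counts, so this is only an assumption about data not shown. A minimal sketch of a stricter variant follows; combine_strict and its sep and joiner parameters are hypothetical names, not part of pandas.

```python
def combine_strict(c1, c2, sep=",", joiner="*"):
    """Hypothetical helper: pair items by position, but reject mismatched counts."""
    a, b = c1.split(sep), c2.split(sep)
    if len(a) != len(b):
        # Fail loudly instead of letting zip silently truncate.
        raise ValueError(f"{len(a)} items in c1 but {len(b)} in c2")
    return sep.join(i + joiner + j for i, j in zip(a, b))

print(combine_strict("a$b,c$d,k$m", "h,y,a"))  # a$b*h,c$d*y,k$m*a
```

It takes the raw strings rather than pre-split lists, so it could be called as df.apply(lambda x: combine_strict(x['c1'], x['c2']), axis=1).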

Hope this solves your problem!

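Use a nested list comprehension: DataFrame.pop extracts the two columns (and removes them from df), zip pairs the values, an f-string adds the *, and join combines the pairs at the end.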

df['c3'] = [','.join(f'{i}*{j}' for i, j in zip(x.split(','), y.split(',')))
            for x, y in zip(df.pop('c1'), df.pop('c2'))]
print(df)
  id                 c3
0  A  a$b*h,c$d*y,k$m*a
1  B  n$e*t,d$w*r,t$y*s
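
To see what the comprehension does for a single row, here is a plain-Python breakdown with no pandas involved. It uses the values from the first row of the question's DataFrame; the variable names left, right and pairs are only for illustration.

```python
# Values taken from the first row (id "A") of the question's DataFrame.
c1 = "a$b,c$d,k$m"
c2 = "h,y,a"

left = c1.split(",")    # ['a$b', 'c$d', 'k$m']
right = c2.split(",")   # ['h', 'y', 'a']

# zip pairs items by position; the f-string puts '*' between each pair.
pairs = [f"{i}*{j}" for i, j in zip(left, right)]

# join glues the pairs back into one comma-separated string.
c3 = ",".join(pairs)
print(c3)  # a$b*h,c$d*y,k$m*a
```

The outer comprehension in the answer simply repeats these steps for every (c1, c2) pair in the DataFrame.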