Add columns conditionally

I have the following DataFrame:

import pandas as pd

data = {"A1": [22,25,27,35],
        "A2": [23,54,71,52],
        "A3": [21,27,26,31],
        "A1_L1": [21,27,26,31],
        "A1_L2": [21,27,26,31],
        "A2_L1": [21,27,26,31],
        "A3_L1": [21,27,26,31],
        "A3_L2": [21,27,26,31],
        "A3_L3": [21,27,26,31]
        }
df = pd.DataFrame(data, columns = ["A1", "A2","A3","A1_L1","A1_L2","A2_L1","A3_L1","A3_L2","A3_L3"])

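For every column whose name contains an L, I want to automatically add a new column. The new column should hold the difference A - A_L, that is, the matching A column minus the L column.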

For example,

A1_L1_new=A1-A1_L1 
A1_L2_new=A1-A1_L2  
A2_L1_new=A2-A2_L1
A3_L1_new=A3-A3_L1
A3_L2_new=A3-A3_L2
A3_L3_new=A3-A3_L3

Any help would be greatly appreciated.

First, import the regular expression library:

import re

Then get the column names as a list.

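col_names = df.columns.tolist()

  1. For each column name, check whether it contains an L.
  2. If it does, use a regular expression to pull out the A part and the L part.
  3. Build a string from those A and L parts.
  4. Finally, create a column with the name just built, holding the A column minus the corresponding A_L column.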
for i in col_names:
    if 'L' in i:  # only columns whose name contains 'L', e.g. 'A1_L1'
        x = re.findall(r'[A,L][1-9]', i)  # the 'A' and 'L' parts, e.g. ['A1', 'L1']
        foo = x[0] + "_" + x[1] + '_new'  # new column name, e.g. 'A1_L1_new'
        df[foo] = df[x[0]] - df[i]  # base column minus the 'L' column

df
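
If you would rather avoid the regex, here is a minimal sketch of an alternative. It assumes every L column is named <base>_L<n> and that the <base> column exists in the frame. The helper name add_diff_columns and the small demo frame are my own, made up for illustration; they are not part of pandas or of the question.

```python
import pandas as pd

def add_diff_columns(frame):
    """Return a copy of frame with a <name>_new column for every <base>_L<n> column."""
    out = frame.copy()
    for name in frame.columns:
        # "A1_L2".partition("_L") gives ("A1", "_L", "2"); sep is "" if "_L" is absent.
        base, sep, lag = name.partition("_L")
        # Skip columns without an "_L" part or whose base column is missing.
        if not sep or base not in frame.columns:
            continue
        out[name + "_new"] = frame[base] - frame[name]
    return out

# Hypothetical demo data, including multi-digit names.
demo = pd.DataFrame({"A1": [22, 25], "A1_L1": [21, 27],
                     "A10": [5, 5], "A10_L12": [1, 2]})
result = add_diff_columns(demo)
print(result.columns.tolist())
```

Because it splits on "_L" instead of matching single digits, this version should also cope with names such as A10_L12, and it leaves the original frame untouched.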

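In case you are wondering what the regular expression does:

  • [A,L] matches a single character from the set inside the brackets, here A or L. (The comma is not a separator; it is part of the set and would match too, so [AL] is enough.)
  • [1-9] matches a single digit from 1 to 9. (You can use \d instead, which matches any digit, including 0.)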
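
To see the pattern in action, here is a small sketch using only the standard re module. The sample strings and the tighter pattern [AL]\d+ are my own suggestions for illustration, not something the question requires.

```python
import re

# Pattern discussed above: one of 'A', ',' or 'L', followed by one digit 1-9.
original = r"[A,L][1-9]"
# Suggested tighter variant: 'A' or 'L' followed by one or more digits.
tighter = r"[AL]\d+"

simple = re.findall(original, "A1_L2")         # ['A1', 'L2']
comma_hit = re.findall(original, ",5")         # [',5'], the comma is in the set
truncated = re.findall(original, "A10_L12")    # ['A1', 'L1'], extra digits are lost
full = re.findall(tighter, "A10_L12")          # ['A10', 'L12']

print(simple, comma_hit, truncated, full)
```

For the column names in the question both patterns give the same result; the difference only shows up with a 0 or with numbers of more than one digit.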