How to split a column in a dataframe into multiple columns in Python?

The position of "V" is not fixed. The value before "V" is not constant, and it can also be 0. "V" can also be a different letter. I need code for this.

ConcatenatedColumn
0.9147V2020-08-042020-06-092019-11-09
3.2F2020-09-112019-05-052020-10-12

Rate    Indicator  Date 1      Date 2      Date 3
0.9147  V          2020-08-04  2020-06-09  2019-11-09
3.2     F          2020-09-11  2019-05-05  2020-10-12

Here is my solution using Pandas and NumPy:

# Process the dataframe
import pandas as pd
import numpy as np

# Define the columns
rowName = np.arange(0,df.shape[0])
colName = ['Rate','Indicator','Date 1','Date 2','Date 3']

# Create new empty df
df2 = pd.DataFrame(index=rowName, columns=colName)

# Process each concatenated column
for i in range(df.shape[0]):
    # Get concatenated string
    rowText = df.ConcatenatedColumn[i]
    
    # Find rate and indicator
    for j in range(len(rowText)):
        if (rowText[j].isalpha()):           # isalpha is True only for alphabetic characters
            df2.at[i, 'Rate'] = rowText[0:j]
            df2.at[i, 'Indicator'] = rowText[j]
            remStr = rowText[j+1:]
            break                            # stop at the first letter (the indicator)
    
    # Find the 3 dates
    lenDate = 10       # Assuming dates are in YYYY-MM-DD format
    df2.at[i, 'Date 1'] = remStr[0:lenDate]
    df2.at[i, 'Date 2'] = remStr[lenDate:(2*lenDate)]
    df2.at[i, 'Date 3'] = remStr[(2*lenDate):]

Here, df is your concatenated column data:

    ConcatenatedColumn
0   0.9147V2020-08-042020-06-092019-11-09
1   3.2F2020-09-112019-05-052020-10-12

df2 is the split-column output:

    Rate     Indicator  Date 1      Date 2      Date 3
0   0.9147   V          2020-08-04  2020-06-09  2019-11-09
1   3.2      F          2020-09-11  2019-05-05  2020-10-12
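
As an alternative to the explicit loops, the same split can probably be done in one step with a regular expression and `Series.str.extract`. This is only a sketch, not the method used above. It assumes the rate consists of digits and dots, the indicator is a single ASCII letter, and exactly three YYYY-MM-DD dates follow. The sample `df` is built here only so that the snippet runs on its own.

```python
import pandas as pd

# Sample data taken from the question
df = pd.DataFrame({"ConcatenatedColumn": [
    "0.9147V2020-08-042020-06-092019-11-09",
    "3.2F2020-09-112019-05-052020-10-12",
]})

# One capture group per output column:
# digits/dots (rate), a single letter (indicator), then three YYYY-MM-DD dates.
date = r"(\d{4}-\d{2}-\d{2})"
pattern = r"^([\d.]*)([A-Za-z])" + date * 3 + "$"

# str.extract returns one column per capture group; rows that do not match become NaN
df2 = df["ConcatenatedColumn"].str.extract(pattern)
df2.columns = ["Rate", "Indicator", "Date 1", "Date 2", "Date 3"]
```

All extracted values are strings; `pd.to_numeric` and `pd.to_datetime` can convert them afterwards if needed.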

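However, note that I assumed the concatenated column contains exactly 3 dates. If the number of dates varies, my date code can be replaced with a while loop that keeps consuming the remaining string.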
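
A minimal sketch of that while-loop variant is shown below. It assumes every date is exactly 10 characters long (YYYY-MM-DD). The helper name `split_row` and the second sample row, which has only two dates, are hypothetical and exist only for illustration.

```python
import pandas as pd

DATE_LEN = 10  # assumes every date is formatted as YYYY-MM-DD


def split_row(text):
    """Split one concatenated string into rate, indicator and a list of dates."""
    # Locate the first alphabetic character; it is taken to be the indicator.
    for j, ch in enumerate(text):
        if ch.isalpha():
            break
    else:
        raise ValueError(f"no indicator letter found in {text!r}")

    rate, indicator, rem = text[:j], ch, text[j + 1:]

    # Consume the remainder one date at a time until nothing is left.
    dates = []
    while rem:
        dates.append(rem[:DATE_LEN])
        rem = rem[DATE_LEN:]
    return rate, indicator, dates


df = pd.DataFrame({"ConcatenatedColumn": [
    "0.9147V2020-08-042020-06-092019-11-09",
    "3.2F2020-09-112019-05-05",  # hypothetical row with only two dates
]})

rows = []
for text in df["ConcatenatedColumn"]:
    rate, indicator, dates = split_row(text)
    row = {"Rate": rate, "Indicator": indicator}
    row.update({f"Date {k}": d for k, d in enumerate(dates, start=1)})
    rows.append(row)

df2 = pd.DataFrame(rows)  # rows with fewer dates get NaN in the missing columns
```

Building a list of dicts and creating the dataframe once at the end means the number of date columns does not have to be known in advance.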