How to do certain changes with data having alternative ids?

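I am struggling to pivot and reshape some data. My data looks like this:

nickname:Nick   Gavin
nickname:Nick   job:Teacher
nickname:Nick   duties:teaching_math
nickname:Bob    Marcus
nickname:Bob    job:Musician
nickname:Bob    duties:plays_piano

I want to change it to:

Nick Teacher teaching_math
Gavin Teacher teaching_math
Bob Musician plays_piano
Marcus Musician plays_piano

Any help is greatly appreciated!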

import numpy as np
import pandas as pd

#assumption: the data is whitespace-separated with no header row,
#stored in a file called your_data.txt
df = pd.read_csv('your_data.txt', sep=r'\s+', header=None)

#get the names, remove the nickname appendage
df[0] = df[0].str.split(':').str[-1]

#create temp column to get nicknames into another column
df['temp'] = np.where(~df[1].str.contains(':'), df[0], np.nan)

#extract words after the ':'
#(split instead of lstrip, which strips a set of characters, not a prefix)
df[1] = df[1].str.split(':').str[-1].str.strip()

#fill NaN along each row so the temp column holds nickname, job and duties
df = df.ffill(axis=1)

#group by col 0
#combine words
#stack
#split into separate columns
#and drop the index
final = (df
         .groupby(0)
         .agg(lambda x: x.str.cat(sep=','))
         .stack()
         .str.split(',', expand=True)
         .reset_index(drop=True))

final

    0          1           2
0   Marcus  Musician    plays_piano
1   Bob     Musician    plays_piano
2   Gavin   Teacher     teaching_math
3   Nick    Teacher     teaching_math
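
One detail worth checking when removing the `job:` and `duties:` labels: Python's `str.lstrip` (and the pandas `.str.lstrip` wrapper around it) removes any leading characters found in its argument, not a literal prefix. A minimal illustration in plain Python, using a value from the question; `str.removeprefix` assumes Python 3.9 or later:

```python
value = "duties:teaching_math"

# lstrip treats "duties:" as a set of characters, so it keeps stripping
# the leading "t" and "e" of "teaching_math" as well
stripped = value.lstrip("duties:")

# splitting on ":" and keeping the last piece leaves the value intact
after_colon = value.split(":")[-1]

# removeprefix (Python 3.9+) removes only the literal prefix
without_prefix = value.removeprefix("duties:")

print(stripped)        # aching_math
print(after_colon)     # teaching_math
print(without_prefix)  # teaching_math
```

Splitting on the colon also works for rows that carry no label at all, such as `Gavin`, because the last piece is then the whole string.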

Try the code below.

# Python 3; assumes each line is "nickname:<name>", whitespace, then a value
dicts = {}
with open('your_data.txt') as f:
    for line in f:
        if not line.strip():
            continue  # skip blank lines
        key_part, value_part = line.split(None, 1)
        nickname = key_part.split(':')[1]
        # drop an optional "job:" / "duties:" label in front of the value
        value = value_part.split(':')[-1].strip()
        dicts.setdefault(nickname, []).append(value)
for k, v in dicts.items():
    print(k, v)
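
The loop above prints each nickname with its list of values rather than the four rows asked for in the question. A minimal sketch of that last step, assuming every nickname maps to exactly three values in the order name, job, duties; `to_rows` and `sample` are illustrative names, not part of the answer above:

```python
def to_rows(dicts):
    """Return one (person, job, duties) row for each nickname and each real name."""
    rows = []
    for nickname, (name, job, duties) in dicts.items():
        rows.append((nickname, job, duties))
        rows.append((name, job, duties))
    return rows

# hand-written sample with the shape the loop is expected to build
sample = {
    "Nick": ["Gavin", "Teacher", "teaching_math"],
    "Bob": ["Marcus", "Musician", "plays_piano"],
}

for row in to_rows(sample):
    print(" ".join(row))
```

Unpacking `(name, job, duties)` raises a `ValueError` if a nickname has more or fewer than three values, which makes incomplete input visible instead of silently producing short rows.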