Reform pandas dataframe
I have a dataframe:
import pandas

df1 = pandas.DataFrame({
    "text": ["Alice is in ", "Alice is in wonderland.", "Mallory has done the task.", "Mallory has", "Bob is final.", "Mallory has done"],
    "label": ["Seattle", "Portlang", "Gotland", "california", "california", "Portland"],
    "title": ["SA", "SA", "sometitle", "sometitle", "some different title", "sometitle"],
    "version": [1, 2, 4, 1, 2, 3]})
df1
                         text       label                 title  version
0                Alice is in      Seattle                    SA        1
1     Alice is in wonderland.    Portlang                    SA        2
2  Mallory has done the task.     Gotland             sometitle        4
3                 Mallory has  california             sometitle        1
4               Bob is final.  california  some different title        2
5            Mallory has done    Portland             sometitle        3
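For each title, I want to keep the title and the text that correspond to the latest version number, and I also want to keep all of that title's labels in a list.
Thanks a lot.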
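Use df.merge together with GroupBy.agg: aggregate per title to get the maximum version and the list of labels, then merge that back onto the original rows to pick up the matching text: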
In [508]: x = df1.groupby(['title']).agg({'version':'max', 'label':list})
In [516]: df1[['title', 'version', 'text']].merge(x, on=['title', 'version'])
Out[516]:
                  title  version                        text                            label
0                    SA        2     Alice is in wonderland.              [Seattle, Portlang]
1             sometitle        4  Mallory has done the task.  [Gotland, california, Portland]
2  some different title        2               Bob is final.                     [california]
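For reference, the same idea can be written as a standalone script. This is only a sketch of the approach above, not a different method: the names `latest` and `result` are made up for illustration, and `as_index=False` and `sort=False` are my own additions so that `title` stays an ordinary column and the groups keep their order of first appearance.

```python
import pandas as pd

df1 = pd.DataFrame({
    "text": ["Alice is in ", "Alice is in wonderland.", "Mallory has done the task.",
             "Mallory has", "Bob is final.", "Mallory has done"],
    "label": ["Seattle", "Portlang", "Gotland", "california", "california", "Portland"],
    "title": ["SA", "SA", "sometitle", "sometitle", "some different title", "sometitle"],
    "version": [1, 2, 4, 1, 2, 3],
})

# Per title: the highest version number and every label collected into a list.
# as_index=False keeps 'title' as a regular column, so both merge keys are columns.
latest = df1.groupby("title", as_index=False, sort=False).agg(
    {"version": "max", "label": list}
)

# The inner merge keeps only the rows of df1 whose (title, version) pair
# is the latest one, which brings in the matching text.
result = df1[["title", "version", "text"]].merge(latest, on=["title", "version"])
print(result)
```

Note that if two rows of the same title shared the maximum version, the merge would return both of them, so this assumes version numbers are unique within a title.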