How do I convert pandas.core.series.Series back to a DataFrame following a groupby?
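I'm trying to manipulate a DataFrame, and where I was aiming for another DataFrame as output, the result is (unexpectedly) of type pandas.core.series.Series.
For reference, the original DataFrame looks like this -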
   Character     Line
0  Leslie Knope  Hello.
1  Leslie Knope  Hi.
2  Leslie Knope  My name is Leslie Knope, and I work for the Pa...
3  Leslie Knope  Can I ask you a few questions?
4  Leslie Knope  Would you say that you are, "Enjoying yourself...
5  Leslie Knope  I'm gonna put a lot of fun.
6  Child         Ms. Knope, there's a drunk stuck in the slide.
7  Leslie Knope  Sir, this is a children's slide.
8  Leslie Knope  You're not allowed to sleep in here.
9  Extra         What is?
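I want to combine all consecutive rows that have the same Character value. So all of the 'Leslie Knope' lines from "Hello." through "I'm gonna put a lot of fun." would be rolled into one row, the 'Child' line would stay as it is, and then the next two 'Leslie Knope' lines would be merged into one.
Here is the code I used to get there (to some extent):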
df['key'] = (df['Character'] != df['Character'].shift(1)).astype(int).cumsum()
print(df.head(5))
df2 = df.groupby(['key', 'Character'])['Line'].apply(' '.join)
Here is the output of df2 -
key  Character
1    Leslie Knope    Hello. Hi. My name is Leslie Knope, and I work...
2    Child           Ms. Knope, there's a drunk stuck in the slide.
3    Leslie Knope    Sir, this is a children's slide. You're not al...
4    Extra           What is?
5    Leslie Knope    You know, when I first tell people that I work...
210  Ann Perkins     I'm really fired up. You know they say that de...
211  Leslie Knope    Soul sista, soul sista Gonna get your phone, s...
212  Ann Perkins     Yeah.
213  Leslie Knope    Sweet Lady Marmalard
214  Ron Swanson     I've created this office as a symbol of how I ...
Name: Line, Length: 214, dtype: object
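I'd like to get df2 as another DataFrame that collapses the consecutive lines spoken by a character as needed and puts those lines in the Line column. I'm not quite sure what's going on here, since df2 is of type pandas.core.series.Series, so I'd appreciate help with either of the following -
- Another way to collapse consecutive lines spoken by a character
- A way to convert df2 into a DataFrame with key, Character, and Line columns.
Thanks in advance!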
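All you need to do is chain .reset_index() onto the last line.
When you apply a groupby function to a single column (Line), the result is a Series. The columns you grouped by (key and Character) then become the index of that Series.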
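To make the point about the index concrete, here is a minimal sketch on a four-row made-up sample (not the full transcript from the question). It assumes the same column names, Character and Line, and reuses the question's key trick; the variable name grouped is only for illustration.

```python
import pandas as pd

# Small made-up sample in the same shape as the question's data.
df = pd.DataFrame({
    "Character": ["Leslie Knope", "Leslie Knope", "Child", "Leslie Knope"],
    "Line": [
        "Hello.",
        "Hi.",
        "Ms. Knope, there's a drunk stuck in the slide.",
        "Sir, this is a children's slide.",
    ],
})

# Same run-numbering trick as in the question: the key goes up by one
# every time the speaker changes.
df["key"] = (df["Character"] != df["Character"].shift(1)).astype(int).cumsum()

# Selecting a single column before apply gives back a Series.
grouped = df.groupby(["key", "Character"])["Line"].apply(" ".join)
print(type(grouped).__name__)   # the result type
print(grouped.index.names)      # the grouping keys live in the index

# reset_index() moves the index levels back into ordinary columns.
df2 = grouped.reset_index()
print(list(df2.columns))
```

After reset_index(), df2 should be a regular DataFrame whose columns are key, Character and Line.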
Edit: to drop the 'key' column, just append `.drop('key', axis=1)`.
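Putting the pieces together, the whole thing can be written as one chain. This is only a sketch: the helper name collapse_consecutive is hypothetical, the sample rows are made up, and it assumes the columns are called Character and Line as in the question.

```python
import pandas as pd

def collapse_consecutive(df):
    """Join consecutive lines by the same speaker into a single row.

    Hypothetical helper; assumes 'Character' and 'Line' columns.
    """
    # The key increases by one each time the speaker changes.
    key = (df["Character"] != df["Character"].shift()).cumsum()
    return (
        df.assign(key=key)
          .groupby(["key", "Character"])["Line"]
          .apply(" ".join)
          .reset_index()          # Series -> DataFrame
          .drop("key", axis=1)    # the helper column is no longer needed
    )

sample = pd.DataFrame({
    "Character": ["Leslie Knope", "Leslie Knope", "Child", "Leslie Knope"],
    "Line": ["Hello.", "Hi.", "There's a drunk stuck in the slide.", "Sir."],
})
result = collapse_consecutive(sample)
print(result)
```

Because the key is grouped first, the rows come back in their original speaking order, and the two separate 'Leslie Knope' runs stay separate.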