Python syntax that divides multiple returned values into separate rows

Question

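I am using a Google search method to get the top 5 links for each of five animals, and I want to build a DataFrame for each animal (five links per animal). The DataFrame for one animal (Panda) should basically look like the first picture: five rows, where column 1 is Panda and column 2 holds ONE link per row.

But right now it looks like the second picture: only one row, with Panda in column 1 and all five links in a single cell in column 2.

How can I get my code to produce a DataFrame that separates the five links into one cell per row, as in picture #1? Is there Python syntax for this? (I would like to run the code in a for loop, but I get an AttributeError. The code should work for a list of animals and create a separate DataFrame for each animal; Panda is just an example of what one of the animal DataFrames should look like.)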

Answer 1

Creating a new DataFrame for each animal

You can split and explode the DataFrame, then use groupby to create a separate DataFrame for each animal. Here is how.

import pandas as pd
df = pd.DataFrame({'Animal':['Panda', 'Tiger','Monkey'],
                   'Link':['abcde.com, fghijk.com, lmnopq.com, rstuvw.com, xyz.com',
                           'adobe.com, facebook.com, linkedin.com, google.com, citi.com',
                           'amazon.com, bbc.com, cnn.com, fox.com, abc.com'],})

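# Convert each comma-separated cell into multiple rows.
# Note: splitting on ',' alone keeps the space that follows each comma,
# so every link after the first starts with a leading space.
df = (df.set_index(['Animal'])
   .apply(lambda x: x.str.split(',').explode())
   .reset_index())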

# Build a dictionary that maps each animal name to its own DataFrame
d = dict(tuple(df.groupby('Animal')))

# Store the DataFrames in a list, in the dictionary's key order
dfx = list(d.values())

#example
print (type(dfx[1])) #will result in <class 'pandas.core.frame.DataFrame'>

print (dfx[0]) #will print dataframe for Animal = 'Monkey'

print (dfx[1]) #will print dataframe for Animal = 'Panda'

print (dfx[2]) #will print dataframe for Animal = 'Tiger'

The output will be:

The type of each DataFrame in the list dfx is:

<class 'pandas.core.frame.DataFrame'>

dfx[0] will give you:

    Animal        Link
10  Monkey  amazon.com
11  Monkey     bbc.com
12  Monkey     cnn.com
13  Monkey     fox.com
14  Monkey     abc.com

dfx[1] will give you:

  Animal         Link
0  Panda    abcde.com
1  Panda   fghijk.com
2  Panda   lmnopq.com
3  Panda   rstuvw.com
4  Panda      xyz.com

dfx[2] will give you:

  Animal           Link
5  Tiger      adobe.com
6  Tiger   facebook.com
7  Tiger   linkedin.com
8  Tiger     google.com
9  Tiger       citi.com

Note that groupby sorts the group keys alphabetically by default, so Monkey comes first, then Panda, then Tiger.
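If the original animal order matters, or if each per-animal DataFrame should be numbered from 0 as in the question, the grouping step can be adjusted. The following is only a sketch under those assumptions, not part of the original answer: the names `long_df` and `frames` are made up for illustration, and it uses `DataFrame.explode` on the column directly instead of the `set_index`/`apply` chain.

```python
import pandas as pd

df = pd.DataFrame({
    'Animal': ['Panda', 'Tiger', 'Monkey'],
    'Link': ['abcde.com, fghijk.com, lmnopq.com, rstuvw.com, xyz.com',
             'adobe.com, facebook.com, linkedin.com, google.com, citi.com',
             'amazon.com, bbc.com, cnn.com, fox.com, abc.com'],
})

# Turn each comma-separated string into a list, then give each list item its own row
long_df = (df.assign(Link=df['Link'].str.split(','))
             .explode('Link', ignore_index=True))

# sort=False keeps the groups in order of first appearance;
# reset_index(drop=True) renumbers the rows of each frame from 0
frames = {animal: group.reset_index(drop=True)
          for animal, group in long_df.groupby('Animal', sort=False)}

print(list(frames))     # animal names in their original order
print(frames['Tiger'])  # the five Tiger rows, indexed 0 to 4
```

Keeping the frames in a dictionary keyed by animal name means a frame can be looked up as `frames['Panda']` rather than by remembering its position in a list.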

Previous solution: split and explode

Here is how I would do it.

import pandas as pd
df = pd.DataFrame({'Animal':['Panda'],
                   'Link':['abcde.com, fghijk.com, lmnopq.com, rstuvw.com, xyz.com']})
print (df)
df = (df.set_index(['Animal'])
   .apply(lambda x: x.str.split(',').explode())
   .reset_index()) 
print (df)

Original DataFrame:

  Animal                                                    Link
0  Panda  abcde.com, fghijk.com, lmnopq.com, rstuvw.com, xyz.com

Updated DataFrame:

  Animal         Link
0  Panda    abcde.com
1  Panda   fghijk.com
2  Panda   lmnopq.com
3  Panda   rstuvw.com
4  Panda      xyz.com

I did not change any of the code. Here is the same solution applied to multiple records.

import pandas as pd
df = pd.DataFrame({'Animal':['Panda', 'Tiger','Monkey'],
                   'Link':['abcde.com, fghijk.com, lmnopq.com, rstuvw.com, xyz.com',
                           'adobe.com, facebook.com, linkedin.com, google.com, citi.com',
                           'amazon.com, bbc.com, cnn.com, fox.com, abc.com'],})
print (df)
df = (df.set_index(['Animal'])
   .apply(lambda x: x.str.split(',').explode())
   .reset_index()) 
print (df)

Before:

   Animal                                                         Link
0   Panda       abcde.com, fghijk.com, lmnopq.com, rstuvw.com, xyz.com
1   Tiger  adobe.com, facebook.com, linkedin.com, google.com, citi.com
2  Monkey               amazon.com, bbc.com, cnn.com, fox.com, abc.com

After:

    Animal           Link
0    Panda      abcde.com
1    Panda     fghijk.com
2    Panda     lmnopq.com
3    Panda     rstuvw.com
4    Panda        xyz.com
5    Tiger      adobe.com
6    Tiger   facebook.com
7    Tiger   linkedin.com
8    Tiger     google.com
9    Tiger       citi.com
10  Monkey     amazon.com
11  Monkey        bbc.com
12  Monkey        cnn.com
13  Monkey        fox.com
14  Monkey        abc.com
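One detail that is easy to miss in the printed tables: because the strings are split on `','` alone, every link after the first keeps the space that followed its comma. If the links will be used later (for example, passed to a request), it may be worth stripping that whitespace. A minimal sketch, assuming a shortened Panda example and the illustrative names `raw` and `clean`:

```python
import pandas as pd

df = pd.DataFrame({'Animal': ['Panda'],
                   'Link': ['abcde.com, fghijk.com, lmnopq.com']})

# Split on ',' and explode into one row per link
raw = (df.assign(Link=df['Link'].str.split(','))
         .explode('Link', ignore_index=True))
print(raw['Link'].tolist())    # links after the first still start with a space

# Strip the surrounding whitespace from every link
clean = raw.assign(Link=raw['Link'].str.strip())
print(clean['Link'].tolist())  # ['abcde.com', 'fghijk.com', 'lmnopq.com']
```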

Answer 2

If each Link cell contains a single comma-separated string, you can do this:

import pandas as pd

df = pd.DataFrame({'Animal':['Panda'], 'Link':['google.com, apple.com, amazon.com']})
print(f'Before:\n{df}\n')

df['Link'] = df['Link'].apply(lambda x: x.split(','))
df = df.explode('Link')
print(f'After:\n{df}')


#output:
Before:
  Animal                               Link
0  Panda  google.com, apple.com, amazon.com

After:
  Animal         Link
0  Panda   google.com
0  Panda    apple.com
0  Panda   amazon.com

Alternatively, if each cell already contains a list of links, you can drop the split statement above and just explode.
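The list case might look like the sketch below. It assumes the search step already returns a Python list of links per animal; the sample links are reused from the example above. Passing `ignore_index=True` is an optional addition that renumbers the rows 0, 1, 2 instead of repeating the original index 0 as in the "After" output.

```python
import pandas as pd

# Each Link cell already holds a list, so no split step is needed
df = pd.DataFrame({'Animal': ['Panda'],
                   'Link': [['google.com', 'apple.com', 'amazon.com']]})

# One row per list element; ignore_index=True gives a fresh 0..n-1 index
df = df.explode('Link', ignore_index=True)
print(df)
```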
