Make a df out of a df of lists in pandas python

Question

Below is my df in pandas (IPython). I want to count the objects in each list and put the resulting counts into a df with the columns ['sponsor_id', 'list_count_int'].

     sponsor_id
7       [s2474-112, s1543-112, s1262-112, s3676-112, s...
11      [s130-110, s169-110, s589-110, s134-110, s3062...
66      [s918-112, s946-112, s3326-112, s2007-112, s33...
116     [s79-112, s1302-112, s3304-112, s175-112, s76-...
136     [s1619-112, s2475-112, s2507-112, s328-112, s2...
.
.
.

Below is my code. I am trying to use a for loop.

import glob
import pandas as pd


df = pd.concat((pd.read_csv(f, names=['date','bill_id','sponsor_id']) for f in glob.glob('/home/jayaramdas/anaconda3/df/s11?_s_b')))

df.groupby('sponsor_id').apply(lambda x: list(x['bill_id']))



#this is the code for my for loop
df_new = df['sponsor_id'].astype('list').map(lambda x: sum(y for y in ['sponsor_id']))

I get a very long error message. This is the end of it:

/home/jayaramdas/anaconda3/lib/python3.5/site-packages/pandas/core/internals.py in _astype(self, dtype, copy, raise_on_error, values, klass, mgr, **kwargs)
    443 
    444         # astype processing
--> 445         dtype = np.dtype(dtype)
    446         if self.dtype == dtype:
    447             if copy:

TypeError: data type "list" not understood

Answer 1

I think you have int values in the sponsor_id column. So you can apply len only to the values that are of type list, and set the count for the other values (int) to 1:

print(df)
                                     sponsor_id
0  [s2474-112, s1543-112, s1262-112, s3676-112]
1                          [s130-110, s169-110]
2                                           102

df['count'] = df['sponsor_id'].apply(lambda x: len(x) if isinstance(x, list) else 1)
print(df)
                                     sponsor_id  count
0  [s2474-112, s1543-112, s1262-112, s3676-112]      4
1                          [s130-110, s169-110]      2
2                                           102      1
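
For reference, the same idea as a self-contained sketch. The sample data is made up to mirror the example above, the helper name count_items is hypothetical, and the result is stored in list_count_int (the column name asked for in the question) instead of count. The traceback in the question also suggests why astype('list') fails: pandas hands the string to np.dtype, and 'list' is not a NumPy dtype, so no cast is attempted here.

```python
import pandas as pd

# Made-up sample data mirroring the example above (an assumption, not the asker's real data).
df = pd.DataFrame({
    "sponsor_id": [
        ["s2474-112", "s1543-112", "s1262-112", "s3676-112"],
        ["s130-110", "s169-110"],
        102,
    ]
})


def count_items(value):
    """Return the length of a list, or 1 for any non-list value."""
    return len(value) if isinstance(value, list) else 1


# Apply the helper to every cell; no astype() call is needed.
df["list_count_int"] = df["sponsor_id"].apply(count_items)
print(df)
```

If the column is known to contain only lists, df['sponsor_id'].apply(len) should be enough on its own.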
