List comprehension with dataframe condition; ValueError: Item wrong length

I am trying to use a list comprehension to build a list of DataFrames, where each item I append is DataFrame[condition == True]. However, I get a ValueError:

list_of_dataframes = [df0[(df0['Names'].values == my_list_of_names[i])] for i in range(len(my_list_of_names))]

File "/home/josep/anaconda3/lib/python3.7/site-packages/pandas/core/frame.py", line 2986, in __getitem__ return self._getitem_bool_array(key)

File "/home/josep/anaconda3/lib/python3.7/site-packages/pandas/core/frame.py", line 3033, in _getitem_bool_array "Item wrong length %d instead of %d." % (len(key), len(self.index))

ValueError: Item wrong length 233 instead of 234.

For a list comprehension, the syntax is as follows:

new_list = []
for i in old_list:
    if filter(i):
        new_list.append(expression(i))

is rewritten as: new_list = [expression(i) for i in old_list if filter(i)]
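
As a concrete illustration of that equivalence, here is a minimal sketch. The names `is_even` and `square` are made up for this example and simply stand in for the `filter` and `expression` placeholders.

```python
# Hypothetical stand-ins for the filter(i) and expression(i) placeholders.
def is_even(i):
    return i % 2 == 0


def square(i):
    return i * i


old_list = [1, 2, 3, 4, 5, 6]

# Explicit loop form.
loop_result = []
for i in old_list:
    if is_even(i):
        loop_result.append(square(i))

# Equivalent list comprehension.
comp_result = [square(i) for i in old_list if is_even(i)]

print(loop_result == comp_result)  # both lists are [4, 16, 36]
```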

So, given that my for loop is:

import pandas as pd

# The names to filter on, held in a one-column DataFrame
my_list_of_names = pd.DataFrame({'0': ['Jou', 'Lara']})
d = {'Names': ['John', 'Lara', 'Ari', 'Jou'], 'col2': [1, 2, 2, 2], 'col3': [1, 2, 3, 4], 'col4': [2, 1, 1, 1], 'col5': [2, 1, 0, 0], 'col6': [2, 1, 3, 1]}
df0 = pd.DataFrame(data=d)

list_of_dataframes = []
for i in range(len(my_list_of_names)):
    df_i = df0[(df0['Names'].values ==
                  my_list_of_names.values[i])]
    list_of_dataframes.append(df_i)

it can be written as:

list_of_dataframes = [df0[(df0['Names'].values == 
 my_list_of_names.values[i])] for i in range(len(my_list_of_names))]

And this works perfectly fine. However, if I try to simplify my code by changing my_list_of_names from a DataFrame to a list:

my_list_of_names2 = ['Jou', 'Lara']  # IS A LIST
list_of_df = [df0[(df0['Names'].values ==
                   my_list_of_names2[measure])
                  ] for measure in range(len(my_list_of_names2))]

it raises a ValueError:

runcell(7, '~/sample.py') Traceback (most recent call last):

File "~/sample.py", line 263, in for measure in range(len(my_list_of_names2))]

File "~/sample.py", line 263, in for measure in range(len(my_list_of_names2))]

File "/home/josep/anaconda3/lib/python3.7/site-packages/pandas/core/frame.py", line 2986, in __getitem__ return self._getitem_bool_array(key)

File "/home/josep/anaconda3/lib/python3.7/site-packages/pandas/core/frame.py", line 3033, in _getitem_bool_array "Item wrong length %d instead of %d." % (len(key), len(self.index))

ValueError: Item wrong length 233 instead of 234.
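
For reference, pandas raises this error whenever a boolean mask passed to `df[...]` does not have exactly one entry per row. The sketch below reproduces it on a toy frame. The names `df_toy`, `good_mask`, and `bad_mask` are invented for the illustration, and it does not claim to identify what makes the mask one element short in the real data.

```python
import numpy as np
import pandas as pd

df_toy = pd.DataFrame({'Names': ['John', 'Lara', 'Ari', 'Jou']})

# A mask with one entry per row selects rows as expected.
good_mask = (df_toy['Names'] == 'Lara').to_numpy()
filtered = df_toy[good_mask]

# A mask that is one element short cannot be matched to the rows.
bad_mask = np.array([False, True, False])
try:
    df_toy[bad_mask]
    error_message = None
except ValueError as err:
    error_message = str(err)

print(len(good_mask), len(df_toy))
print(error_message)
```

Printing the length of the mask next to `len(df0)` just before the failing line is a quick way to see which operand has the unexpected length.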

Note: the real list and DataFrame are different, but for the sake of this question I thought shorter ones would be easier.

This may not be the optimal solution, but I believe it solves your immediate example.

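list_of_dataframes = []
# Iterate over the names themselves; my_list_of_names2 is the plain list from the question
for name in my_list_of_names2:
    df_i = df0[df0['Names'] == name]
    list_of_dataframes.append(df_i)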
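
The same loop can also be folded back into a list comprehension by iterating over the names themselves rather than over indices. A minimal sketch, reusing a trimmed copy of the toy data from the question and assuming the names are held in the plain list `my_list_of_names2`:

```python
import pandas as pd

# Trimmed copy of the toy data from the question.
d = {'Names': ['John', 'Lara', 'Ari', 'Jou'], 'col2': [1, 2, 2, 2]}
df0 = pd.DataFrame(data=d)
my_list_of_names2 = ['Jou', 'Lara']

# Comparing the Series itself (not .values) to a scalar yields a mask
# with one entry per row of df0, sharing df0's index.
list_of_dataframes = [df0[df0['Names'] == name] for name in my_list_of_names2]

for name, df_i in zip(my_list_of_names2, list_of_dataframes):
    print(name, len(df_i))
```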