Drop NaN from DataFrame rows in pandas

Question

How can I use pandas to drop the NaN values from a DataFrame so that I get a list of lists containing only the actual values from each row?

**Input**
A |B |C 
-----------
x |y | NA
x |NA| NA
X |Y | NA

**Output**

[[x,y],
 [x],
 [X,Y]
]

I tried applying a function to each row:

dataset.apply(lambda row: row[pd.notna(row)],axis=0).to_numpy()

array([["Belkin 325VA UPS Surge Protector, 6'",
        'Master Caster Door Stop, Large Neon Orange',
        'Easy-staple paper', 'Polycom VVX 310 VoIP phone',
        'Acco Banker\'s Clasps, 5 3/4"-Long',
        'Verbatim 25 GB 6x Blu-ray Single Layer Recordable Disc, 1/Pack',
        'Fellowes Advanced Computer Series Surge Protectors',
        'GBC DocuBind 200 Manual Binding Machine',
        'Tenex Personal Project File with Scoop Front Design, Black',
        'Avery Binding System Hidden Tab Executive Style Index Sets',
        'High Speed Automatic Electric Letter Opener', nan, nan, nan,
        nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan,
        nan, nan, nan, nan, nan, nan, nan, nan, nan, nan],

Can you explain the best way to do this?

Answer 1

You can try:

out=df.agg(lambda x:list(x.dropna()),axis=1).tolist()
# apply() can be used in place of agg() here
# If you need an array instead of a list, replace tolist() with the .values attribute or the to_numpy() method
out=df.agg(lambda x:list(x.dropna()),axis=1).values

Output of out:

[['x', 'y'], ['x'], ['X', 'Y']]
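
For reference, here is a minimal self-contained sketch of the same idea. The DataFrame below is only a reconstruction of the table in the question, so the column names and values are assumptions. It also suggests why the attempt in the question kept its NaN values: with axis=0 the function receives each column rather than each row, and the filtered columns are realigned on the original index, which reintroduces NaN. Passing axis=1 hands each row to the function instead.

```python
import numpy as np
import pandas as pd

# Assumed reconstruction of the question's input table
df = pd.DataFrame({
    "A": ["x", "x", "X"],
    "B": ["y", np.nan, "Y"],
    "C": [np.nan, np.nan, np.nan],
})

# axis=1 passes each row (as a Series) to the function;
# dropna() removes the missing entries before converting to a list
out = df.apply(lambda row: row.dropna().tolist(), axis=1).tolist()
print(out)

# A plain-Python alternative that skips apply() entirely
out_loop = [[v for v in row if pd.notna(v)] for row in df.to_numpy()]
print(out_loop)
```

Both variants should give the same nested list. The list comprehension may be easier to read and avoids the overhead of building a Series for every row, though that difference probably only matters on large frames.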

python

numpy

pandas

na