How to filter all the rows that contain "isolated" NaN values in a column in Python

Question

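I have a column in a pandas DataFrame in which some rows have NaN values.

I want to select the rows that satisfy these conditions:
- they are NaN values;
- their immediate neighbors (the row directly before and the row directly after) are non-null values.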

For example, I want to select the row with this NaN value:
Input:

index | col

...
1 | 1344
2 | NaN
3 | 532
...

Expected output:
2 | NaN

But I do not want to select these NaN values (because each one is either followed by a NaN value or comes right after another NaN value):

index | col

...
1 | 1344
2 | NaN
3 | NaN
4 | 532
...

Any help would be appreciated.

Thanks!

Answer 1

Below I show how to do this with an example. Series.notna + Series.cumsum build a key that puts each non-null value and the run of NaN values that follows it into the same group, and groupby uses that key. With transform('size') followed by le(2) you get a Boolean Series that is False in the groups that hold more than one NaN (a group with a single NaN has size 2: the non-null value plus the NaN). The AND of this Boolean Series with df['col2'].isna() is the mask we need for Boolean indexing: it selects the rows that are NaN but not part of a consecutive run.

import numpy as np
import pandas as pd

df = pd.DataFrame({'col1': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
                   'col2': [np.nan, 2, 3, np.nan, np.nan, 6, np.nan, 8, 9, np.nan]})
print(df)
   col1  col2
0     1   NaN
1     2   2.0
2     3   3.0
3     4   NaN
4     5   NaN
5     6   6.0
6     7   NaN
7     8   8.0
8     9   9.0
9    10   NaN
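
The grouping key described above can be inspected step by step. The following is a minimal sketch on the same col2 values; the names key, size and steps are only illustrative and are not part of the answer's code.

```python
import numpy as np
import pandas as pd

col2 = pd.Series([np.nan, 2, 3, np.nan, np.nan, 6, np.nan, 8, 9, np.nan])

# Each non-null value starts a new group; the NaN values after it inherit its key.
key = col2.notna().cumsum()

# Size of the group each row belongs to (the leading non-null value is counted too).
size = col2.groupby(key).transform('size')

# Side-by-side view of the intermediate values (illustrative helper frame).
steps = pd.DataFrame({'col2': col2, 'key': key, 'size': size})
print(steps)
```

Rows 3 and 4 share key 2 with the non-null value in row 2, so their group size is 3 and they are rejected by le(2); row 6 shares key 3 with row 5, giving size 2, so it passes.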

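# False for groups that hold more than one NaN
# (a group is one non-null value plus the NaN values that follow it)
mask_repeat_NaN = df.groupby(df['col2'].notna().cumsum())['col2'].transform('size').le(2)
# Keep only the NaN rows of the groups that passed the size check
mask = mask_repeat_NaN & df['col2'].isna()
df_filtered = df[mask]
print(df_filtered)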

   col1  col2
0     1   NaN
6     7   NaN
9    10   NaN
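
One edge case is worth noting: if the column starts with exactly two NaN values, that leading group has no non-null value in it, its size is 2, and the size check lets both rows through. As an alternative sketch (not the method above), the neighbors can be compared directly with Series.shift. The function name isolated_nan_mask is hypothetical, and the sketch assumes that a NaN in the first or last row counts as isolated when its only neighbor is non-null, as in the output above.

```python
import numpy as np
import pandas as pd

def isolated_nan_mask(s: pd.Series) -> pd.Series:
    """Hypothetical helper: True where s is NaN and no adjacent row is NaN."""
    is_nan = s.isna()
    # fill_value=False treats the missing neighbor at either end as non-NaN
    prev_nan = is_nan.shift(1, fill_value=False)
    next_nan = is_nan.shift(-1, fill_value=False)
    return is_nan & ~prev_nan & ~next_nan

df = pd.DataFrame({'col1': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
                   'col2': [np.nan, 2, 3, np.nan, np.nan, 6, np.nan, 8, 9, np.nan]})
df_filtered = df[isolated_nan_mask(df['col2'])]
print(df_filtered)

# A column that starts with two NaN values: only the NaN at position 3 is isolated.
leading = pd.Series([np.nan, np.nan, 1, np.nan, 2])
print(isolated_nan_mask(leading).tolist())
```

On the example frame this selects the same rows (index 0, 6 and 9) as the groupby version.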

