Interpolate empty values in a data frame series using pandas

Question

I have a list of numbers that I have assigned to a data frame series, shown below.

 [0.0,
 4.98,
 10.68,
 17.12,
 23.56,
 23.56,
 23.56,
 23.56,
 50.82,
 50.82,
 50.82,
 50.82,
 50.82,
 50.82,
 50.82,
 50.82,
 50.82,
 50.82,
 117.84,
 117.84,
 117.84,
 117.84,
 117.84,
 117.84,
 117.84,
 159.9,
 159.9,
 171.79,
 171.79,
 171.79,
 190.28,
 190.28,
 204.07,
 210.31,
 215.97,
 222.58]

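I want to drop all the duplicates and interpolate the resulting missing (NaN) values from the existing, non-repeated numbers in the list.

After I call drop_duplicates, this is what I get

I then ran df.interpolate(method='linear'), but I got my original list of numbers back and no missing values were interpolated. Any ideas that could help? A sample of my code is below: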

 dlist = [...]
 df = pd.DataFrame(dlist)
 df.drop_duplicates()
 df.interpolate(method='linear')

Thanks a lot.

Answer 1

Try this:

import pandas as pd

a = pd.Series(yourlist)
a[a.duplicated()] = None  # repeated values become NaN
a = a.interpolate(method='linear')
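
For context, here is a minimal sketch of why the code in the question seems to return the original numbers. It assumes a shortened, made-up sample rather than the full list, and the variable names (sample, unchanged, deduped, filled) are only for illustration. drop_duplicates returns a new object instead of modifying df, and even when its result is kept, the repeated rows are removed rather than turned into NaN, so interpolate has no gaps to fill.

```python
import pandas as pd

# Shortened, made-up sample (not the full list from the question).
sample = [0.0, 4.98, 4.98, 4.98, 10.68]
df = pd.DataFrame(sample)

# drop_duplicates returns a new DataFrame; the result is discarded here,
# so df still holds every original row.
df.drop_duplicates()
unchanged = df.interpolate(method='linear')  # no NaN present, nothing to fill

# Keeping the result removes the repeated rows instead of marking them NaN,
# so there are still no gaps for interpolate to work on.
deduped = df.drop_duplicates()

# Marking the repeats as missing keeps their positions in the series,
# which gives interpolate something to fill.
s = pd.Series(sample)
s[s.duplicated()] = None
filled = s.interpolate(method='linear')

print(unchanged[0].tolist())
print(len(deduped))
print(filled.round(2).tolist())
```

The key difference is that the masking approach keeps the series at its original length, so the interpolated values land where the repeats used to be.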

Another solution:

Replace the duplicated values with missing values using Series.duplicated with Series.mask, and then use Series.interpolate:

import pandas as pd

s = pd.Series(dlist)  # dlist is the list from the question
s = s.mask(s.duplicated()).interpolate(method='linear')

print(s.head(10))
0     0.000
1     4.980
2    10.680
3    17.120
4    23.560
5    30.375
6    37.190
7    44.005
8    50.820
9    57.522
dtype: float64
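
One thing to keep in mind: Series.duplicated flags every later occurrence of a value, not only consecutive repeats. That makes no difference for the sorted list in the question, but if the data could return to an earlier value, a possible variation is to mask only consecutive repeats by comparing each value with the previous one via Series.shift. This is a sketch under that assumption, with made-up sample values and illustrative names (vals, by_value, by_run):

```python
import pandas as pd

# Made-up sample in which the value 1.0 comes back after a different value.
vals = pd.Series([1.0, 1.0, 3.0, 1.0, 1.0, 5.0])

# duplicated() masks every later occurrence of 1.0, including the one at
# index 3 that starts a new run.
by_value = vals.mask(vals.duplicated()).interpolate(method='linear')

# Comparing with the previous element masks only consecutive repeats,
# so the first value of each run is kept.
by_run = vals.mask(vals.eq(vals.shift())).interpolate(method='linear')

print(by_value.tolist())
print(by_run.tolist())
```

Which of the two is appropriate depends on whether a value that reappears later should count as a real measurement or as a repeat.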

python

interpolation

pandas