How can I get the n closest to zero values in a pandas Series?
How can I get the n values closest to 0, similar to how nsmallest() returns the n smallest values? For example, with
import pandas as pd
series = pd.Series([-1.0,-0.75,-0.5,-0.25,0.25,0.5,0.75,1.0])
series
0 -1.00
1 -0.75
2 -0.50
3 -0.25
4 0.25
5 0.50
6 0.75
7 1.00
dtype: float64
and n = 4, I would like to get the following:
0 -0.25
1 0.25
2 -0.50
3 0.50
dtype: float64
If performance matters, use Series.abs with Series.argsort to get the positions, keep the first n of them, and select by Series.iloc:
n = 4
# argsort returns the positions that sort the absolute values; keep the first n
series = series.iloc[series.abs().argsort()[:n]]
print(series)
3 -0.25
4 0.25
2 -0.50
5 0.50
dtype: float64
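The same idea can be wrapped in a small helper. This is only a minimal sketch, not part of the answer above: the function name `n_closest_to_zero` is hypothetical, and it assumes that calling NumPy's `argsort` with `kind="stable"` is acceptable, so that values with equal absolute value keep their original order.

```python
import numpy as np
import pandas as pd


def n_closest_to_zero(s: pd.Series, n: int) -> pd.Series:
    """Return the n values of s with the smallest absolute value (hypothetical helper)."""
    # Positions that sort the absolute values in ascending order;
    # a stable sort keeps tied values in their original order.
    positions = np.argsort(s.abs().to_numpy(), kind="stable")[:n]
    # Select by position, so the original index labels are preserved.
    return s.iloc[positions]


series = pd.Series([-1.0, -0.75, -0.5, -0.25, 0.25, 0.5, 0.75, 1.0])
closest = n_closest_to_zero(series, 4)
print(closest.index.tolist())  # [3, 4, 2, 5]
print(closest.tolist())        # [-0.25, 0.25, -0.5, 0.5]
```

Unlike the snippet above, the helper does not overwrite `series`, so the full data stays available for later steps.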
Finally, if a default index is needed:
n = 4
series = series.iloc[series.abs().argsort()[:n]].reset_index(drop=True)
print(series)
0 -0.25
1 0.25
2 -0.50
3 0.50
dtype: float64
Performance:
series = pd.Series([-1.0,-0.75,-0.5,-0.25,0.25,0.5,0.75,1.0] * 10000)
n = 4000
series = series.iloc[series.abs().argsort()[:n]]
print(series)
In [114]: %timeit series.iloc[series.abs().argsort()[:n]]
794 µs ± 19.4 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
In [115]: %timeit series.loc[series.abs().nsmallest(n).index]
2.09 ms ± 34.7 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
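For readers without IPython, a comparison like the one above could be reproduced with the standard `timeit` module. This is a rough sketch under stated assumptions: the function names `by_argsort` and `by_nsmallest` and the repeat count are hypothetical, and the numbers it prints depend on the machine and the pandas version, so they will not match the figures quoted above.

```python
import timeit

import pandas as pd

series = pd.Series([-1.0, -0.75, -0.5, -0.25, 0.25, 0.5, 0.75, 1.0] * 10000)
n = 4000


def by_argsort(s, n):
    # argsort of the absolute values, then selection by position
    return s.iloc[s.abs().argsort()[:n]]


def by_nsmallest(s, n):
    # n smallest absolute values, then selection by index label
    return s.loc[s.abs().nsmallest(n).index]


repeats = 20  # arbitrary choice for this sketch
for func in (by_argsort, by_nsmallest):
    seconds = timeit.timeit(lambda: func(series, n), number=repeats) / repeats
    print(f"{func.__name__}: {seconds * 1e3:.3f} ms per call")
```

Both functions take the full 80,000-element series as input on every call, since neither reassigns `series`.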
Use loc, abs and nsmallest:
series.loc[series.abs().nsmallest(4).index]
3 -0.25
4 0.25
2 -0.50
5 0.50
dtype: float64
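Because a value and its negation have the same absolute value, ties at the cutoff are common with this kind of data. `Series.nsmallest` accepts a `keep` parameter that controls how such ties are handled. The sketch below is an illustration only; n = 3 is chosen just to force a tie at the cutoff, and the variable names are hypothetical.

```python
import pandas as pd

series = pd.Series([-1.0, -0.75, -0.5, -0.25, 0.25, 0.5, 0.75, 1.0])

# Default keep="first": exactly n labels are returned, and the earliest
# occurrence wins when several values tie at the cutoff.
first = series.loc[series.abs().nsmallest(3).index]
print(first)

# keep="all": every value tied with the n-th smallest is kept,
# so the result can contain more than n values.
with_ties = series.loc[series.abs().nsmallest(3, keep="all").index]
print(with_ties)

print(len(first), len(with_ties))  # 3 4
```

Note that `loc` selects by label, so this approach assumes the index labels are unique; with repeated labels, each selected label would pull in every row that carries it.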