Running through an appended data frame to locate the first positive number in Python

Question

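I have an appended series built with pandas, which I will call S. For some number of i, each S[i] has 50 data points, which I will index by j.

I want to run through every i for a given j (e.g. j=1), find where the first positive number S[i][1] occurs, and record what that number is. So the output I am looking for is an i x 2 data frame, where [i,1] records the j for each i and [i,2] records what the positive number is.

Preferably, I would like a vectorized version, like sapply/apply in R.

I hope the description makes sense, and I hope someone can help me!

Below is an example with i=4 and j=6.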

S[0]:
2013-01-02_59   -0.004739
2013-01-02_61   +0.002435
2013-01-02_74   -0.004772
2013-01-02_75   -0.004772
2013-01-02_77   -0.002452
2013-01-02_78   -0.009423

S[1]:
2013-01-02_60   -0.007048
2013-01-02_62   -0.002435
2013-01-02_75   +0.004772
2013-01-02_76   -0.002446
2013-01-02_78   +0.007114
2013-01-02_79   -0.004772

S[2]: 
2013-01-02_61   -0.004739
2013-01-02_63   +0.002435
2013-01-02_76   -0.002446
2013-01-02_77   -0.004772
2013-01-02_79   -0.002452
2013-01-02_80   +0.002446

S[3]: 
2013-01-02_62   -0.004739
2013-01-02_64   +0.002435
2013-01-02_77   -0.004772
2013-01-02_78   +0.009423
2013-01-02_80   -0.000121
2013-01-02_81   -0.004772

My desired output for this example is:

Output:
NA    NA
1     +0.002435
2     +0.004772
4     +0.009423
2     +0.007114
3     +0.002446

The first row of the output is NA because the first data point is never positive in any S[i].
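For reference, the expected output above can be reproduced from the example values with a short NumPy sketch. It assumes the four series are aligned by position (the timestamp labels differ between them, so they are dropped here) and that the first output column is the 1-based number of the series in which the first positive value appears. The names `positive`, `has_positive`, `first_i` and `result` are illustrative only.

```python
import numpy as np
import pandas as pd

# Values copied from the example: one row per series S[i], one column per data point.
S = np.array([
    [-0.004739,  0.002435, -0.004772, -0.004772, -0.002452, -0.009423],
    [-0.007048, -0.002435,  0.004772, -0.002446,  0.007114, -0.004772],
    [-0.004739,  0.002435, -0.002446, -0.004772, -0.002452,  0.002446],
    [-0.004739,  0.002435, -0.004772,  0.009423, -0.000121, -0.004772],
])

positive = S > 0                      # boolean mask with the same shape as S
has_positive = positive.any(axis=0)   # does this data point ever become positive?
first_i = positive.argmax(axis=0)     # 0-based row of the first True (0 if there is none)

result = pd.DataFrame({
    # 1-based position to match the expected output; NaN where nothing is positive
    "i": np.where(has_positive, first_i + 1, np.nan),
    "value": np.where(has_positive, S[first_i, np.arange(S.shape[1])], np.nan),
})
print(result)
```

Because `argmax` returns 0 both when the first row is positive and when no row is, the separate `has_positive` mask is what distinguishes the NA case.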

Answer 1

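The following identifies the index and value of the first positive value in each series, and inserts np.nan where a series has no positive value. Some sample data: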

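import numpy as np
import pandas as pd

# Build 10 series of 50 uniform random values in [-1, 1) and place them side by side.
df = pd.concat([pd.Series(data=np.random.uniform(-1, 1, 50), name=i) for i in range(10)], axis=1)

# Transpose so that each row is one series and each column is one data point.
df = df.transpose()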

<class 'pandas.core.frame.DataFrame'>
Int64Index: 10 entries, 0 to 9
Data columns (total 50 columns):
0     10 non-null float64
1     10 non-null float64
2     10 non-null float64
3     10 non-null float64
4     10 non-null float64
5     10 non-null float64
....
45    10 non-null float64
46    10 non-null float64
47    10 non-null float64
48    10 non-null float64
49    10 non-null float64
dtypes: float64(50)

Using the following to make row 3 entirely negative, so that it has no positive value:

df.loc[3, :] = -1

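# For each row: the (index, value) of the first positive entry, or (last column label, NaN) if there is none.
tmp = df.apply(lambda x: pd.DataFrame({'value': x[x > 0]}).reset_index().iloc[0] if not x[x > 0].empty else (x.index[-1], np.nan), axis=1)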

This gives an (index, value) pair in the columns for each original series i, with i itself given by the row index:

   index     value
0      1  0.608962
1      2  0.487893
2      1  0.850135
3     49       NaN
4      1  0.870091
5      2  0.469713
6      1  0.331851
7      0  0.036980
8      0  0.387298
9      3  0.723645
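Since the question asks for a vectorized version, here is a minimal sketch of one way to get the same kind of table without a per-row lambda. It is an alternative to the `apply` call above, not the answer's own method, and it differs in one respect: a row with no positive value gets NaN in both columns instead of the last column label. The seed and the names `mask`, `has_positive` and `first_col` are assumptions made for illustration.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)  # arbitrary seed, only for reproducibility
df = pd.DataFrame(rng.uniform(-1, 1, size=(10, 50)))
df.loc[3, :] = -1               # force one row with no positive value

values = df.to_numpy()
mask = values > 0
has_positive = mask.any(axis=1)   # rows that contain at least one positive value
first_col = mask.argmax(axis=1)   # position of the first True per row (0 if none)

tmp = pd.DataFrame(
    {
        # column label of the first positive entry, NaN if the row has none
        "index": np.where(has_positive, df.columns.to_numpy()[first_col], np.nan),
        # the first positive value itself, NaN if the row has none
        "value": np.where(has_positive, values[np.arange(len(df)), first_col], np.nan),
    },
    index=df.index,
)
print(tmp)
```

The whole computation runs on the underlying array at once, which tends to matter more as the number of series grows.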
