Finding chunks of consecutive non-zero values in a column
I have a DataFrame that looks like this:
DateTime Rev
8/9/2019 4:00 -1976
8/9/2019 6:00 -1953
8/9/2019 7:00 -1838
8/9/2019 8:00 0
8/9/2019 11:00 0
8/9/2019 12:00 0
8/9/2019 13:00 -1945
8/9/2019 14:00 -2045
8/9/2019 15:00 -1976
8/9/2019 16:00 0
8/9/2019 17:00 0
I want to find the indices of all the non-zero chunks, i.e. I need to extract two separate DataFrames from the one above:
DateTime Rev
8/9/2019 4:00 -1976
8/9/2019 6:00 -1953
8/9/2019 7:00 -1838
and
DateTime Rev
8/9/2019 13:00 -1945
8/9/2019 14:00 -2045
8/9/2019 15:00 -1976
I am not sure how to approach this efficiently.
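The idea is to create a unique group id for each run of consecutive negative values, then build a list of DataFrames in a list comprehension: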
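# True where Rev is negative
m = df['Rev'].lt(0)
# number each run of equal values, then keep only the negative rows
mask = m.ne(m.shift()).cumsum()[m]
# rows missing from mask get a NaN key and are dropped by groupby
dfs = [g for i, g in df.groupby(mask)]
print (dfs)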
[ DateTime Rev
0 8/9/2019 4:00 -1976
1 8/9/2019 6:00 -1953
2 8/9/2019 7:00 -1838, DateTime Rev
6 8/9/2019 13:00 -1945
7 8/9/2019 14:00 -2045
8 8/9/2019 15:00 -1976]
print (dfs[0])
DateTime Rev
0 8/9/2019 4:00 -1976
1 8/9/2019 6:00 -1953
2 8/9/2019 7:00 -1838
print (dfs[1])
DateTime Rev
6 8/9/2019 13:00 -1945
7 8/9/2019 14:00 -2045
8 8/9/2019 15:00 -1976
Details:
print (mask)
0 1
1 1
2 1
6 3
7 3
8 3
Name: Rev, dtype: int32
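The mask can be read one step at a time. Here is a minimal sketch on the same Rev values; the intermediate names changed and run_id are illustrative only and do not appear in the answer:

```python
import pandas as pd

rev = pd.Series([-1976, -1953, -1838, 0, 0, 0, -1945, -2045, -1976, 0, 0],
                name="Rev")

m = rev.lt(0)               # True where the value is negative
changed = m.ne(m.shift())   # True on the first row of every run, negative or not
run_id = changed.cumsum()   # running count of run starts: 1,1,1,2,2,2,3,3,3,4,4
mask = run_id[m]            # keep the ids of the negative runs only

print(mask.tolist())        # [1, 1, 1, 3, 3, 3]
```

The ids skip from 1 to 3 because the run of zeros in between received id 2 before it was filtered out.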
Check the groups:
df['g'] = mask
print (df)
DateTime Rev g
0 8/9/2019 4:00 -1976 1.0
1 8/9/2019 6:00 -1953 1.0
2 8/9/2019 7:00 -1838 1.0
3 8/9/2019 8:00 0 NaN
4 8/9/2019 11:00 0 NaN
5 8/9/2019 12:00 0 NaN
6 8/9/2019 13:00 -1945 3.0
7 8/9/2019 14:00 -2045 3.0
8 8/9/2019 15:00 -1976 3.0
9 8/9/2019 16:00 0 NaN
10 8/9/2019 17:00 0 NaN
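Since the question asks for the indices of each chunk, the same mask can also give the first and last index label of every run. This is a self-contained sketch that assumes the default RangeIndex shown above; the name bounds is hypothetical:

```python
import pandas as pd

df = pd.DataFrame({
    "DateTime": ["8/9/2019 4:00", "8/9/2019 6:00", "8/9/2019 7:00",
                 "8/9/2019 8:00", "8/9/2019 11:00", "8/9/2019 12:00",
                 "8/9/2019 13:00", "8/9/2019 14:00", "8/9/2019 15:00",
                 "8/9/2019 16:00", "8/9/2019 17:00"],
    "Rev": [-1976, -1953, -1838, 0, 0, 0, -1945, -2045, -1976, 0, 0],
})

m = df["Rev"].lt(0)
mask = m.ne(m.shift()).cumsum()[m]

# (first, last) index label of each run of negative values
bounds = [(int(g.index[0]), int(g.index[-1])) for _, g in df.groupby(mask)]
print(bounds)  # [(0, 2), (6, 8)]
```

With these bounds, df.loc[start:end] returns the same chunk as the corresponding entry of dfs, because .loc slicing includes both endpoints.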