Formatting rows that satisfy conditions Pandas Python

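I am trying to format the data in input.csv so that it returns the indexes of the rows that satisfy the conditions in Indexes. I want the code to print the index of every row that meets all of the values defined under #intervals. Only the first row, with element [ 2. 2. 30.], satisfies the limits defined by the Indexes variable. Essentially, I want to print all rows that satisfy the condition: if column['MxU'] >= MxU and column['SNPT'] >= SNPT ..... and column['MxD'] >= MxD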

The input.csv file:

element,LNPT,SNPT,NLP,NSP,TNT,TPnL,MxPnL,MnPnL,MxU,MxD
[ 2.  2. 30.],0,0,4,4,8,-0.1,-0.0,-0.1,17127,-3
[ 2.  2. 40.],0,0,2,2,4,0.0,-0.0,-0.0,17141,-3
[ 2.  2. 50.],0,0,2,2,4,0.0,-0.0,-0.0,17139,-3
[ 2.  2. 60.],2,0,6,6,12,0.5,2.3,-1.9,17015,-3
[ 2.  2. 70.],1,0,4,4,8,0.3,0.3,-0.0,17011,-3

Code:

import pandas as pd

df = pd.read_csv('input.csv')

#intervals
MxU= 17100
SNPT= 1
NLP= 3
MnPnL= -0.1
MxD= 0

#variables used for formatting
Indexes = [MxU,SNPT,NLP,MnPnL,MxD]
#all columns in csv listed
columns = ['LNPT', 'SNPT', 'NLP', 'MxPnL', 'NSP', 'TNT', 'TPnL', 'MnPnL','MxU','MxD']

def intersect() :#function for 
    for i in columns:
        if str(Indexes[i]) in columns[i]:
            for k in Indexes:
                formating = df[columns] >= Indexes[k]
        
    
intersect() #calling function

Expected output:

row: 1 element:[ 2.  2. 30.]

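If I understand you correctly, you can store the per-column thresholds in a dictionary, then loop over the columns you want to check and compare each one against its entry in that dictionary.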
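As a minimal sketch of that explicit loop: the snippet below rebuilds the question's data from an inline string (an assumption made only so the example is self-contained; in practice you would read input.csv), and the names `columns_to_check` and `matching_rows` are illustrative, not from the original post.

```python
import io

import pandas as pd

# Inline copy of input.csv so the sketch runs on its own (assumption).
csv_text = """element,LNPT,SNPT,NLP,NSP,TNT,TPnL,MxPnL,MnPnL,MxU,MxD
[ 2.  2. 30.],0,0,4,4,8,-0.1,-0.0,-0.1,17127,-3
[ 2.  2. 40.],0,0,2,2,4,0.0,-0.0,-0.0,17141,-3
[ 2.  2. 50.],0,0,2,2,4,0.0,-0.0,-0.0,17139,-3
[ 2.  2. 60.],2,0,6,6,12,0.5,2.3,-1.9,17015,-3
[ 2.  2. 70.],1,0,4,4,8,0.3,0.3,-0.0,17011,-3
"""
df = pd.read_csv(io.StringIO(csv_text))

# Lower bound for each column, keyed by column name.
intervals = {'MxU': 17100, 'SNPT': 1, 'NLP': 3, 'MnPnL': -0.1, 'MxD': 0}

# Hypothetical choice of columns to check; any subset of the keys works.
columns_to_check = ['NLP', 'MnPnL']

# Start with every row selected, then narrow the mask one column at a time.
mask = pd.Series(True, index=df.index)
for col in columns_to_check:
    mask &= df[col] >= intervals[col]

matching_rows = df.index[mask].tolist()
print(matching_rows)
```

Keying the thresholds by column name avoids the positional lookups (`Indexes[i]`, `columns[i]`) that break in the question's code, because each comparison pairs a column with its own bound.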

You can use np.logical_and together with reduce (np.logical_and.reduce) to simplify the loop.

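import numpy as np
import pandas as pd

df = pd.read_csv('input.csv')

# lower bound for each column, keyed by column name
intervals = {
    'MxU': 17100,
    'SNPT': 1,
    'NLP': 3,
    'MnPnL': -0.1,
    'MxD': 0
}

# columns to check against their lower bounds
columns = ['NLP', 'MnPnL']

# True only for rows where every checked column meets its bound
mask = np.logical_and.reduce([df[col] >= intervals[col] for col in columns])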

Then use boolean indexing to select the desired rows:

df_ = df.loc[mask, 'element']
# print(df_)

0    [ 2.  2. 30.]
4    [ 2.  2. 70.]
Name: element, dtype: object
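To get output shaped like the question's "row: 1 element:..." line, one option is to iterate over the selected Series. The sketch below assumes the row number is the 1-based position (index + 1), rebuilds the data from an inline string so it runs on its own, and uses a hypothetical helper named `matching_elements`.

```python
import io

import numpy as np
import pandas as pd

# Inline copy of input.csv so the sketch runs on its own (assumption).
csv_text = """element,LNPT,SNPT,NLP,NSP,TNT,TPnL,MxPnL,MnPnL,MxU,MxD
[ 2.  2. 30.],0,0,4,4,8,-0.1,-0.0,-0.1,17127,-3
[ 2.  2. 40.],0,0,2,2,4,0.0,-0.0,-0.0,17141,-3
[ 2.  2. 50.],0,0,2,2,4,0.0,-0.0,-0.0,17139,-3
[ 2.  2. 60.],2,0,6,6,12,0.5,2.3,-1.9,17015,-3
[ 2.  2. 70.],1,0,4,4,8,0.3,0.3,-0.0,17011,-3
"""
df = pd.read_csv(io.StringIO(csv_text))

intervals = {'MxU': 17100, 'SNPT': 1, 'NLP': 3, 'MnPnL': -0.1, 'MxD': 0}


def matching_elements(frame, thresholds, columns):
    """Return the 'element' values of rows meeting every listed lower bound."""
    mask = np.logical_and.reduce(
        [frame[col] >= thresholds[col] for col in columns]
    )
    return frame.loc[mask, 'element']


# Same two columns as in the answer above.
subset = matching_elements(df, intervals, ['NLP', 'MnPnL'])

# Assumption: "row" in the expected output means the 1-based position.
lines = [f"row: {idx + 1} element:{elem}" for idx, elem in subset.items()]
print("\n".join(lines))

# Checking all five thresholds at once on this sample data.
all_five = matching_elements(df, intervals, list(intervals))
print(len(all_five))
```

On this sample data, checking all five thresholds selects no rows, since every SNPT value is 0 and every MxD value is -3. The thresholds or the comparison direction for those columns may need adjusting before the first row alone is returned.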