Formatting rows that satisfy conditions in Pandas (Python)
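I am trying to filter the data in input.csv so that the code returns the indexes of the rows that satisfy the conditions in Indexes. I want it to print the index of every row whose values meet all of the thresholds defined under #intervals. Only the first row, with element [ 2. 2. 30.], satisfies the limits defined by the Indexes variable. Essentially, I want to print every row that satisfies: if column['MxU'] >= MxU and column['SNPT'] >= SNPT ... and column['MxD'] >= MxD
input.csv file: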
element,LNPT,SNPT,NLP,NSP,TNT,TPnL,MxPnL,MnPnL,MxU,MxD
[ 2. 2. 30.],0,0,4,4,8,-0.1,-0.0,-0.1,17127,-3
[ 2. 2. 40.],0,0,2,2,4,0.0,-0.0,-0.0,17141,-3
[ 2. 2. 50.],0,0,2,2,4,0.0,-0.0,-0.0,17139,-3
[ 2. 2. 60.],2,0,6,6,12,0.5,2.3,-1.9,17015,-3
[ 2. 2. 70.],1,0,4,4,8,0.3,0.3,-0.0,17011,-3
Code:
import pandas as pd

df = pd.read_csv('input.csv')
# intervals
MxU = 17100
SNPT = 1
NLP = 3
MnPnL = -0.1
MxD = 0
# variables used for formatting
Indexes = [MxU, SNPT, NLP, MnPnL, MxD]
# all columns in csv listed
columns = ['LNPT', 'SNPT', 'NLP', 'MxPnL', 'NSP', 'TNT', 'TPnL', 'MnPnL', 'MxU', 'MxD']
def intersect():  # function for
    for i in columns:
        if str(Indexes[i]) in columns[i]:
            for k in Indexes:
                formating = df[columns] >= Indexes[k]
intersect()  # calling function
Expected output:
row: 1 element:[ 2. 2. 30.]
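If I understand you correctly, you can store the per-column thresholds in a dictionary, then loop over the columns you want to check and compare each one against its entry in that dictionary.
You can use np.logical_and.reduce to simplify the loop.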
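import numpy as np

# Lower bound for each column of interest
intervals = {
    'MxU': 17100,
    'SNPT': 1,
    'NLP': 3,
    'MnPnL': -0.1,
    'MxD': 0
}
# Columns to check; a row is kept only if every listed column is >= its bound
columns = ['NLP', 'MnPnL']
mask = np.logical_and.reduce([df[col] >= intervals[col] for col in columns])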
Then use boolean indexing to select the desired rows:
df_ = df.loc[mask, 'element']
# print(df_)
0 [ 2. 2. 30.]
4 [ 2. 2. 70.]
Name: element, dtype: object
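For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same approach. It assumes the sample rows from the question are pasted into a string instead of being read from input.csv; the names CSV_TEXT and rows_meeting are made up for illustration and are not part of pandas or the original code.

```python
import io

import numpy as np
import pandas as pd

# Sample rows copied from the question's input.csv (assumption: inlined here
# so the example runs without a file on disk).
CSV_TEXT = """element,LNPT,SNPT,NLP,NSP,TNT,TPnL,MxPnL,MnPnL,MxU,MxD
[ 2. 2. 30.],0,0,4,4,8,-0.1,-0.0,-0.1,17127,-3
[ 2. 2. 40.],0,0,2,2,4,0.0,-0.0,-0.0,17141,-3
[ 2. 2. 50.],0,0,2,2,4,0.0,-0.0,-0.0,17139,-3
[ 2. 2. 60.],2,0,6,6,12,0.5,2.3,-1.9,17015,-3
[ 2. 2. 70.],1,0,4,4,8,0.3,0.3,-0.0,17011,-3
"""

# Lower bound for each column, as in the answer above.
intervals = {'MxU': 17100, 'SNPT': 1, 'NLP': 3, 'MnPnL': -0.1, 'MxD': 0}


def rows_meeting(df, thresholds, columns):
    """Hypothetical helper: keep rows where every listed column >= its bound."""
    mask = np.logical_and.reduce([df[col] >= thresholds[col] for col in columns])
    return df.loc[mask]


df = pd.read_csv(io.StringIO(CSV_TEXT))

# Check only two of the columns, as in the answer.
subset = rows_meeting(df, intervals, ['NLP', 'MnPnL'])
for idx, element in subset['element'].items():
    # idx is the 0-based DataFrame index; add 1 for a 1-based row number.
    print(f"row: {idx + 1} element:{element}")

# Check all five thresholds at once.
everything = rows_meeting(df, intervals, list(intervals))
print(len(everything))
```

With the sample data this prints rows 1 and 5 for the two-column check. Applying all five thresholds at once selects no rows, because no SNPT value reaches 1 and no MxD value reaches 0, so the thresholds or the list of columns may need adjusting to get the single row shown in the expected output.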