Grab value from range of values

I have this DataFrame:

df = pd.DataFrame({
    'm': [16, 23, 24, 26, 17, 18],
    'A': ['23-24', '23-24', '23-24', '23-24', '16-18', '20']
})

I need to create a column ['AC'] that takes a value from column ['A'] based on column ['m'], as follows:

if col['m'] equals col['A'], take the value from col['m']
if col['m'] <= the first value of the range in col['A'], take the first value of the range
if col['m'] lies inside the range shown in col['A'], take the value from col['m']
if col['m'] >= the last value of the range in col['A'], take the last value of the range

This is the desired result:

m     A     AC
16  23-24   23
23  23-24   23
24  23-24   24
26  23-24   24
17  16-18   17
18    20    20

I tried the code below, but it didn't work:

df[['a1','a2','a3']] = df['A'].str.split('-',expand=True)
df['a1']= df['a1'].replace([None],np.nan)
df['a1']= df['a1'].replace(np.nan,df.A)
df['a2']= df['a2'].replace([None],np.nan)
df['a2']= df['a2'].replace(np.nan,df.A)
df['a3']= df['a3'].replace([None],np.nan)
df['a3']= df['a3'].replace(np.nan,df.A)


df['a1']= pd.to_numeric(df['a1'], errors='coerce')
df['a2']= pd.to_numeric(df['a2'], errors='coerce')
df['a3']= pd.to_numeric(df['a3'], errors='coerce')

conds = [
(df['m']<= df['a1']),   
(df['m']> df['a1']),
(df['m']> df['a1'])&(df['m']< df['a2'])]

choices = [df['a1'],df['a2'],df['a3']]

df['AC'] = np.select(conds, choices)

I think this should work. Further conditions can be added to check whether r1 and r2 exist.

import numpy as np

def get_AC(x, y):
    #x: value of column m
    #y: value of column A
    r1 = int(y.split("-")[0])
    r2 = int(y.split("-")[-1])

    if x == r1 == r2:
        return x
    elif x <= r1:
        return r1
    elif x > r1 and x < r2:
        return x
    elif x >= r2:
        return r2

df["AC"] = np.vectorize(get_AC)(df["m"], df["A"])
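
One caveat: according to the NumPy documentation, `np.vectorize` is a convenience wrapper around a Python loop, not a performance tool. As a rough sketch, the same logic can be written as a plain list comprehension. `clamp_to_range` is a hypothetical helper name, and the sketch assumes every entry of `A` is either a single integer or `low-high` with `low <= high`.

```python
import pandas as pd


def clamp_to_range(x, y):
    # y is "low-high", or a single value such as "20" (then low == high)
    parts = y.split("-")
    low, high = int(parts[0]), int(parts[-1])
    # Push x up to low or down to high; values inside the range pass through
    return min(max(x, low), high)


df = pd.DataFrame({
    "m": [16, 23, 24, 26, 17, 18],
    "A": ["23-24", "23-24", "23-24", "23-24", "16-18", "20"],
})
df["AC"] = [clamp_to_range(x, y) for x, y in zip(df["m"], df["A"])]
print(df)
```

Writing the rule as `min(max(x, low), high)` covers all four cases from the question in one expression, so no `if`/`elif` chain is needed.
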

import pandas as pd


def create_AC(row):
    # A is "lower-upper"; a single value such as "20" serves as both bounds
    parts = row.A.split("-")
    lower = int(parts[0])
    upper = int(parts[-1])
    if row.m <= lower:
        return lower
    elif row.m >= upper:
        return upper
    else:
        return row.m


df = pd.DataFrame({
    'm': [16, 23, 24, 26, 17, 18],
    'A': ['23-24', '23-24', '23-24', '23-24', '16-18', '20']
})

df["AC"] = df.apply(create_AC, axis=1)

You're on the right track. A small change to your code is enough to get what you need:

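# Split "low-high" into two float columns; a single value such as "20" leaves a2 as NaN
df[['a1','a2']] = df['A'].str.split('-',expand=True).fillna(value=np.nan).astype(float)
# Treat a single value as the range "20-20" so the upper-bound condition also matches
df['a2'] = df['a2'].fillna(df['a1'])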

conditions = [df["m"]==df["a1"], 
              df["m"]<=df["a1"], 
              (df["m"]>df["a1"]) & (df["m"]<df["a2"]),
              df["m"] >= df["a2"]]

choices = [df["m"], df["a1"], df["m"], df["a2"]]
df["AC"] = np.select(conditions, choices).astype(int)

df = df[["m", "A", "AC"]]
>>> df
    m      A  AC
0  16  23-24  23
1  23  23-24  23
2  24  23-24  24
3  26  23-24  24
4  17  16-18  17
5  18     20  20
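
The order of the conditions matters here because `np.select` takes, for each element, the choice belonging to the first condition that is true. The sketch below uses made-up arrays (not the question's DataFrame) to show how a broad condition placed early can shadow a later, more specific one, in the spirit of the attempt in the question.

```python
import numpy as np

# Illustrative values only: one range 16-18 and three values of m
m = np.array([15, 17, 20])
a1 = np.array([16, 16, 16])
a2 = np.array([18, 18, 18])

# "m > a1" is checked before the in-range test, so 17 is sent to a2
shadowed = np.select(
    [m <= a1, m > a1, (m > a1) & (m < a2)],
    [a1, a2, m],
)

# Mutually exclusive conditions: below, above, strictly inside
ordered = np.select(
    [m <= a1, m >= a2, (m > a1) & (m < a2)],
    [a1, a2, m],
)

print(shadowed.tolist())
print(ordered.tolist())
```

In the first call the in-range value 17 is replaced by the upper bound, while the second call keeps it.
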

Your code was close. See my edits and the simple logic.

df=pd.DataFrame({'m': [16, 23, 24, 26, 17, 18],
                 'A': ['23-24', '23-24', '23-24', '23-24', '16-18', '20']
                 })
df[['a1','a2']] = df['A'].str.split("-", expand=True)
mask = df['a2'].isnull()
df.loc[mask, 'a2'] = df.loc[mask, 'a1']

# fix dtypes
df['a1']= pd.to_numeric(df['a1'], errors='coerce')
df['a2']= pd.to_numeric(df['a2'], errors='coerce')

# logic
df['AC'] = df[(df.m >= df.a1) & (df.m <= df.a2)].m

mask = (df.m < df.a1)
df.loc[mask, 'AC'] = df.loc[mask].a1

mask = (df.m > df.a2)
df.loc[mask, 'AC'] = df.loc[mask].a2
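
Taken together, the rules in the question amount to clamping `m` into the interval given by `A`. A minimal vectorized sketch of that idea follows, assuming every entry of `A` is either a single non-negative integer or `low-high`. The regular expression and the names `bounds`, `low` and `high` are illustrative choices, not taken from the code above.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "m": [16, 23, 24, 26, 17, 18],
    "A": ["23-24", "23-24", "23-24", "23-24", "16-18", "20"],
})

# Capture "low" and an optional "-high"; the second group is NaN for a single value
bounds = df["A"].str.extract(r"^(\d+)(?:-(\d+))?$").astype(float)
low = bounds[0]
high = bounds[1].fillna(low)  # "20" is treated as the range 20-20

# Clamp each m into its own [low, high] interval, element by element
df["AC"] = np.clip(df["m"].to_numpy(), low.to_numpy(), high.to_numpy()).astype(int)
print(df)
```

This avoids the helper columns and the separate masks, and the final cast keeps `AC` as integers.
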