How to filter a list based on ascending values?
I have the following 3 lists:
minimal_values = ['0,32', '0,35', '0,45']
maximal_values = ['0,78', '0,85', '0,72']
my_list = [
['Morocco', 'Meat', '190,00', '0,15'],
['Morocco', 'Meat', '189,90', '0,32'],
['Morocco', 'Meat', '189,38', '0,44'],
['Morocco', 'Meat', '188,94', '0,60'],
['Morocco', 'Meat', '188,49', '0,78'],
['Morocco', 'Meat', '187,99', '0,70'],
['Spain', 'Meat', '190,76', '0,10'],
['Spain', 'Meat', '190,16', '0,20'],
['Spain', 'Meat', '189,56', '0,35'],
['Spain', 'Meat', '189,01', '0,40'],
['Spain', 'Meat', '188,13', '0,75'],
['Spain', 'Meat', '187,95', '0,85'],
['Italy', 'Meat', '190,20', '0,11'],
['Italy', 'Meat', '190,10', '0,31'],
['Italy', 'Meat', '189,32', '0,45'],
['Italy', 'Meat', '188,61', '0,67'],
['Italy', 'Meat', '188,01', '0,72'],
['Italy', 'Meat', '187,36', '0,55']]
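I am trying to filter my_list so that a row is kept only if its index [-1] lies between the corresponding values in minimal_values and maximal_values. These values are the minimum and maximum per country. I am also doing a subtraction in the list. So for Morocco I only want the rows where index [-1] is between 0,32 and 0,78, and so on. The problem is that after 0,78 the value drops to 0,70, which means that row also satisfies the if statement.
Note: the values at index [-1] in my_list first ascend and then descend. I only want the rows in the ascending part, not the rows in the descending part. I am not sure how to solve this.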
This is my code:
price = 500
# Convert values to float.
minimal_values = [float(i.replace(',', '.')) for i in minimal_values]
maximal_values = [float(i.replace(',', '.')) for i in maximal_values]
# Collect all unique countries in a list.
countries = list(set(country[0] for country in my_list))
results = []
for l in my_list:
    i = countries.index(l[0])
    if minimal_values[i] <= float(l[-1].replace(',', '.')) <= maximal_values[i]:
        new_index_2 = price - float(l[-2].replace(',', '.'))
        l[-2] = new_index_2
        results.append(l)
print(results)
This is my current output:
[['Morocco', 'Meat', '189.90', '0,32'],
['Morocco', 'Meat', 310.62, '0,44'],
['Morocco', 'Meat', 311.06, '0,60'],
['Morocco', 'Meat', 311.51, '0,78'],
['Morocco', 'Meat', 312.01, '0,70'],
['Spain', 'Meat', 310.44, '0,35'],
['Spain', 'Meat', 310.99, '0,40'],
['Spain', 'Meat', 311.87, '0,75'],
['Spain', 'Meat', '312.05', '0,85'],
['Italy', 'Meat', 310.68, '0,45'],
['Italy', 'Meat', 311.39, '0,67'],
['Italy', 'Meat', 311.99, '0,72'],
['Italy', 'Meat', 312.64, '0,55']]
This is my desired output:
[['Morocco', 'Meat', '189.90', '0,32'],
['Morocco', 'Meat', 310.62, '0,44'],
['Morocco', 'Meat', 311.06, '0,60'],
['Morocco', 'Meat', 311.51, '0,78'],
['Spain', 'Meat', 310.44, '0,35'],
['Spain', 'Meat', 310.99, '0,40'],
['Spain', 'Meat', 311.87, '0,75'],
['Spain', 'Meat', '312.05', '0,85'],
['Italy', 'Meat', 310.68, '0,45'],
['Italy', 'Meat', 311.39, '0,67'],
['Italy', 'Meat', 311.99, '0,72']]
*****Pandas answers are also welcome
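minimal_values = [float(i.replace(',', '.')) for i in minimal_values]
maximal_values = [float(i.replace(',', '.')) for i in maximal_values]
countries_largest = {}  # largest value seen so far, per country
filtered_list = []
for row in my_list:
    country_name = row[0]
    value = float(row[-1].replace(',','.'))
    # Skip rows whose value is below the largest value already seen for this country.
    if country_name in countries_largest and value < countries_largest[country_name]:
        continue
    countries_largest[country_name] = value
    # The limits are looked up by the number of countries seen so far,
    # so the rows of each country must be contiguous and in the same order as the limits.
    if not (minimal_values[len(countries_largest)-1] <= value <= maximal_values[len(countries_largest)-1]):
        continue
    filtered_list.append(row)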
[['Morocco', 'Meat', '189,90', '0,32'],
['Morocco', 'Meat', '189,38', '0,44'],
['Morocco', 'Meat', '188,94', '0,60'],
['Morocco', 'Meat', '188,49', '0,78'],
['Spain', 'Meat', '189,56', '0,35'],
['Spain', 'Meat', '189,01', '0,40'],
['Spain', 'Meat', '188,13', '0,75'],
['Spain', 'Meat', '187,95', '0,85'],
['Italy', 'Meat', '189,32', '0,45'],
['Italy', 'Meat', '188,61', '0,67'],
['Italy', 'Meat', '188,01', '0,72']]
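The lookup above picks the limits by counting how many countries have been seen so far, which only lines up when each country's rows are contiguous. A minimal sketch of the same idea with the limits keyed by country name, assuming the limits are listed in order of first appearance (the names rows, mins, maxs, limits and kept are hypothetical, and the rows are a shortened, deliberately interleaved sample):

```python
# Hypothetical, interleaved sample rows; limits follow order of first appearance.
rows = [
    ['Morocco', 'Meat', '189,90', '0,32'],
    ['Spain', 'Meat', '189,56', '0,35'],
    ['Morocco', 'Meat', '188,49', '0,78'],
    ['Morocco', 'Meat', '187,99', '0,70'],
    ['Spain', 'Meat', '190,76', '0,10'],
]
mins = [0.32, 0.35]
maxs = [0.78, 0.85]

# dict.fromkeys keeps first-appearance order (guaranteed from Python 3.7).
names = list(dict.fromkeys(r[0] for r in rows))
limits = dict(zip(names, zip(mins, maxs)))

largest = {}  # largest value seen so far, per country
kept = []
for r in rows:
    name = r[0]
    value = float(r[-1].replace(',', '.'))
    # Skip values below the running maximum of this country.
    if value < largest.get(name, float('-inf')):
        continue
    largest[name] = value
    lo, hi = limits[name]
    if lo <= value <= hi:
        kept.append(r)

print(kept)
```

Keying by name means the result no longer depends on how many distinct countries happen to have been seen when a row is processed.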
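Note that your code has a problem, because the order of the elements in countries is not necessarily the same as the order of the countries in my_list. It is easier to handle the countries while processing the list, noting when the country name changes. You can then add a flag to the loop that indicates that processing of the country is finished (when the current value is less than the previous value) and, if so, ignore the remaining values for that country: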
# Convert values to float.
minimal_values = [float(i.replace(',', '.')) for i in minimal_values]
maximal_values = [float(i.replace(',', '.')) for i in maximal_values]
# Track the current country while walking the list (price comes from the question's code).
results = []
finished_country = -1
country_index = -1
last_country = ''
last_value = 0  # initialised so the first comparison below cannot raise a NameError
for l in my_list:
    country = l[0]
    if country != last_country:
        country_index += 1
        last_country = country
    value = float(l[-1].replace(',', '.'))
    if finished_country == country_index or value < minimal_values[country_index]:
        last_value = 0
        continue
    if value < last_value:
        # The values started to fall: ignore the rest of this country.
        finished_country = country_index
    elif value <= maximal_values[country_index]:
        new_index_2 = price - float(l[-2].replace(',', '.'))
        l[-2] = new_index_2
        results.append(l)
    last_value = value
Output for the sample data:
[
['Morocco', 'Meat', 310.1, '0,32'],
['Morocco', 'Meat', 310.62, '0,44'],
['Morocco', 'Meat', 311.06, '0,60'],
['Morocco', 'Meat', 311.51, '0,78'],
['Spain', 'Meat', 310.44, '0,35'],
['Spain', 'Meat', 310.99, '0,40'],
['Spain', 'Meat', 311.87, '0,75'],
['Spain', 'Meat', 312.05, '0,85'],
['Italy', 'Meat', 310.68, '0,45'],
['Italy', 'Meat', 311.39, '0,67'],
['Italy', 'Meat', 311.99, '0,72']
]
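The flag in the loop above stops at the first drop, which is not quite the same as only skipping rows that fall below the largest value seen so far. The two rules agree on the sample data but differ when a series rises again after a dip. A minimal sketch on hypothetical values (the function names skip_decreases and stop_at_first_drop are made up for this illustration):

```python
def skip_decreases(values):
    """Keep each value that is not lower than the largest value seen so far."""
    kept, largest = [], float('-inf')
    for v in values:
        if v >= largest:
            kept.append(v)
            largest = v
    return kept


def stop_at_first_drop(values):
    """Keep values only up to the first value lower than its predecessor."""
    kept = []
    for v in values:
        if kept and v < kept[-1]:
            break
        kept.append(v)
    return kept


values = [0.40, 0.60, 0.50, 0.70]  # hypothetical series that rises again after a dip
print(skip_decreases(values))      # [0.4, 0.6, 0.7]
print(stop_at_first_drop(values))  # [0.4, 0.6]
```

Which rule is wanted depends on whether a later recovery should count as part of the ascending section; the question only shows data where both give the same rows.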
A pandas solution:
import pandas as pd
import numpy as np
# create input dataframe
my_list = [
['Morocco', 'Meat', '190,00', '0,15'],
['Morocco', 'Meat', '189,90', '0,32'],
['Morocco', 'Meat', '189,38', '0,44'],
['Morocco', 'Meat', '188,94', '0,60'],
['Morocco', 'Meat', '188,49', '0,78'],
['Morocco', 'Meat', '187,99', '0,70'],
['Spain', 'Meat', '190,76', '0,10'],
['Spain', 'Meat', '190,16', '0,20'],
['Spain', 'Meat', '189,56', '0,35'],
['Spain', 'Meat', '189,01', '0,40'],
['Spain', 'Meat', '188,13', '0,75'],
['Spain', 'Meat', '187,95', '0,85'],
['Italy', 'Meat', '190,20', '0,11'],
['Italy', 'Meat', '190,10', '0,31'],
['Italy', 'Meat', '189,32', '0,45'],
['Italy', 'Meat', '188,61', '0,67'],
['Italy', 'Meat', '188,01', '0,72'],
['Italy', 'Meat', '187,36', '0,55']]
# Note: applymap is deprecated since pandas 2.1 in favour of DataFrame.map.
dfi = pd.DataFrame(my_list).applymap(lambda x: x.replace(',', '.'))
dfi[[2, 3]] = dfi[[2, 3]].astype(float)
print(dfi)
# 0 1 2 3
# 0 Morocco Meat 190.00 0.15
# 1 Morocco Meat 189.90 0.32
# 2 Morocco Meat 189.38 0.44
# 3 Morocco Meat 188.94 0.60
# 4 Morocco Meat 188.49 0.78
# 5 Morocco Meat 187.99 0.70
# 6 Spain Meat 190.76 0.10
# 7 Spain Meat 190.16 0.20
# 8 Spain Meat 189.56 0.35
# 9 Spain Meat 189.01 0.40
# 10 Spain Meat 188.13 0.75
# 11 Spain Meat 187.95 0.85
# 12 Italy Meat 190.20 0.11
# 13 Italy Meat 190.10 0.31
# 14 Italy Meat 189.32 0.45
# 15 Italy Meat 188.61 0.67
# 16 Italy Meat 188.01 0.72
# 17 Italy Meat 187.36 0.55
# create df_filter with country and min_v, max_v
minimal_values = ['0,32', '0,35', '0,45']
maximal_values = ['0,78', '0,85', '0,72']
minimal_values = [float(i.replace(',', '.')) for i in minimal_values]
maximal_values = [float(i.replace(',', '.')) for i in maximal_values]
df_filter = pd.DataFrame(list(zip(dfi[0].unique().tolist(),
minimal_values,
maximal_values)))
df_filter.columns = [0, 'min_v', 'max_v']
print(df_filter)
# 0 min_v max_v
# 0 Morocco 0.32 0.78
# 1 Spain 0.35 0.85
# 2 Italy 0.45 0.72
# merge dfi and df_filter
dfm = pd.merge(dfi, df_filter, on=0, how='left')
print(dfm)
# 0 1 2 3 min_v max_v
# 0 Morocco Meat 190.00 0.15 0.32 0.78
# 1 Morocco Meat 189.90 0.32 0.32 0.78
# 2 Morocco Meat 189.38 0.44 0.32 0.78
# 3 Morocco Meat 188.94 0.60 0.32 0.78
# 4 Morocco Meat 188.49 0.78 0.32 0.78
# 5 Morocco Meat 187.99 0.70 0.32 0.78
# 6 Spain Meat 190.76 0.10 0.35 0.85
# 7 Spain Meat 190.16 0.20 0.35 0.85
# 8 Spain Meat 189.56 0.35 0.35 0.85
# 9 Spain Meat 189.01 0.40 0.35 0.85
# 10 Spain Meat 188.13 0.75 0.35 0.85
# 11 Spain Meat 187.95 0.85 0.35 0.85
# 12 Italy Meat 190.20 0.11 0.45 0.72
# 13 Italy Meat 190.10 0.31 0.45 0.72
# 14 Italy Meat 189.32 0.45 0.45 0.72
# 15 Italy Meat 188.61 0.67 0.45 0.72
# 16 Italy Meat 188.01 0.72 0.45 0.72
# 17 Italy Meat 187.36 0.55 0.45 0.72
# filter min_v <= column 3 <= max_v
cond = dfm[3].ge(dfm.min_v) & dfm[3].le(dfm.max_v)
dfm = dfm[cond].copy()
# drop rows where column 3 is lower than in the previous row of the same country
cond = dfm.groupby(0)[3].diff() < 0
dfo = dfm.loc[~cond, [0,1,2,3]].reset_index(drop=True)
# output result
price = 500
dfo[2] = price - dfo[2]
print(dfo)
# 0 1 2 3
# 0 Morocco Meat 310.10 0.32
# 1 Morocco Meat 310.62 0.44
# 2 Morocco Meat 311.06 0.60
# 3 Morocco Meat 311.51 0.78
# 4 Spain Meat 310.44 0.35
# 5 Spain Meat 310.99 0.40
# 6 Spain Meat 311.87 0.75
# 7 Spain Meat 312.05 0.85
# 8 Italy Meat 310.68 0.45
# 9 Italy Meat 311.39 0.67
# 10 Italy Meat 311.99 0.72
Given:
minimal_values = ['0,32', '0,35', '0,45']
maximal_values = ['0,78', '0,85', '0,72']
my_list = [
['Morocco', 'Meat', '190,00', '0,15'],
['Morocco', 'Meat', '189,90', '0,32'],
['Morocco', 'Meat', '189,38', '0,44'],
['Morocco', 'Meat', '188,94', '0,60'],
['Morocco', 'Meat', '188,49', '0,78'],
['Morocco', 'Meat', '187,99', '0,70'],
['Spain', 'Meat', '190,76', '0,10'],
['Spain', 'Meat', '190,16', '0,20'],
['Spain', 'Meat', '189,56', '0,35'],
['Spain', 'Meat', '189,01', '0,40'],
['Spain', 'Meat', '188,13', '0,75'],
['Spain', 'Meat', '187,95', '0,85'],
['Italy', 'Meat', '190,20', '0,11'],
['Italy', 'Meat', '190,10', '0,31'],
['Italy', 'Meat', '189,32', '0,45'],
['Italy', 'Meat', '188,61', '0,67'],
['Italy', 'Meat', '188,01', '0,72'],
['Italy', 'Meat', '187,36', '0,55']]
First, since we will be using it a lot, let's write a small conversion routine to standardize what 'float' means in your case:
def conv(s):
    try:
        return float(s.replace(',','.'))
    except ValueError:
        return s
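As a quick check of what this routine does, here is a minimal sketch that repeats the definition and calls it on a few strings taken from the sample data. Note that it only accepts strings: passing a float would raise AttributeError, because floats have no replace method and only ValueError is caught.

```python
def conv(s):
    # Turn a decimal-comma string into a float; return other strings unchanged.
    try:
        return float(s.replace(',', '.'))
    except ValueError:
        return s


print(conv('0,32'))     # 0.32
print(conv('189,90'))   # 189.9
print(conv('Morocco'))  # Morocco
```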
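Now, it appears that your two lists of strings, minimal_values and maximal_values, map to the minimum and maximum values by country. If so, your use of countries = list(set(country[0] for country in my_list)) will not work, because sets are arbitrarily ordered in all versions of Python.
If you have Python 3.6+, you can do: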
countries = list({}.fromkeys(country[0] for country in my_list))
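because dictionaries preserve insertion order in Python 3.6+ (an implementation detail of CPython 3.6, guaranteed by the language from 3.7). Assuming you want something that works on all versions of Python, you can instead do: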
def uniqs_in_order(li):
    seen=set()
    return [e for e in li if not (e in seen or seen.add(e))]
    # Python 3.6+: return list({}.fromkeys(li))
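The comprehension relies on set.add returning None: the first time an element is met, e in seen is False and seen.add(e) evaluates to None, so the element is kept and recorded in one step. A minimal sketch comparing both approaches on a hypothetical list of names:

```python
def uniqs_in_order(li):
    seen = set()
    # set.add returns None, so the "or" is falsy only on the first sighting of e.
    return [e for e in li if not (e in seen or seen.add(e))]


names = ['Morocco', 'Morocco', 'Spain', 'Italy', 'Spain', 'Italy']  # hypothetical input
print(uniqs_in_order(names))       # ['Morocco', 'Spain', 'Italy']
print(list(dict.fromkeys(names)))  # ['Morocco', 'Spain', 'Italy']
```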
Now you can create a country: tuple mapping of the min/max values for each country:
mapping={k:(min_, max_) for k,min_,max_ in
         zip(uniqs_in_order([sl[0] for sl in my_list]),
             [conv(s) for s in minimal_values],
             [conv(s) for s in maximal_values])}
>>> mapping
{'Morocco': (0.32, 0.78), 'Spain': (0.35, 0.85), 'Italy': (0.45, 0.72)}
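Now, finally, we can filter. Since you only want to take values that:
- are within the min and max for the country; and
- stop once the values for the country are no longer ascending,
we can use groupby from itertools to split the list of lists by country and perform those two tests: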
from itertools import groupby

filt=[]
price = 500
for k,v in groupby(my_list, key=lambda sl: sl[0]):
    section=list(v)
    for i, row in enumerate(section):
        # Stop with this country as soon as a value is lower than the previous one.
        if i and conv(row[-1])<conv(section[i-1][-1]):
            break
        if mapping[row[0]][0]<=conv(row[-1])<=mapping[row[0]][1]:
            row[-2]=price-conv(row[-2])
            filt.append(row)
>>> filt
[['Morocco', 'Meat', 310.1, '0,32'],
['Morocco', 'Meat', 310.62, '0,44'],
['Morocco', 'Meat', 311.06, '0,60'],
['Morocco', 'Meat', 311.51, '0,78'],
['Spain', 'Meat', 310.44, '0,35'],
['Spain', 'Meat', 310.99, '0,40'],
['Spain', 'Meat', 311.87, '0,75'],
['Spain', 'Meat', 312.05, '0,85'],
['Italy', 'Meat', 310.68, '0,45'],
['Italy', 'Meat', 311.39, '0,67'],
['Italy', 'Meat', 311.99, '0,72']]
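One thing to be aware of: the loop above overwrites row[-2] inside my_list, so running it a second time on the same list fails when a float reaches the string conversion. A minimal sketch of a variant that builds new rows instead, under a few assumptions: the function name filter_rising and its parameters are hypothetical, the limits are passed as a country-to-(min, max) dict like mapping above, and rounding to two decimals is a choice made here rather than something the question asks for.

```python
from itertools import groupby


def filter_rising(rows, limits, price):
    """Return new rows from the rising, in-range part of each country's run."""
    def num(s):
        return float(s.replace(',', '.'))

    out = []
    for country, group in groupby(rows, key=lambda r: r[0]):
        lo, hi = limits[country]
        previous = None
        for row in group:
            value = num(row[-1])
            if previous is not None and value < previous:
                break  # the values started to fall: done with this country
            previous = value
            if lo <= value <= hi:
                # Build a new row rather than overwriting row[-2] in place.
                out.append(row[:-2] + [round(price - num(row[-2]), 2), row[-1]])
    return out


rows = [
    ['Morocco', 'Meat', '190,00', '0,15'],
    ['Morocco', 'Meat', '189,90', '0,32'],
    ['Morocco', 'Meat', '188,49', '0,78'],
    ['Morocco', 'Meat', '187,99', '0,70'],
]
limits = {'Morocco': (0.32, 0.78)}
result = filter_rising(rows, limits, 500)
print(result)
```

Because the input rows are left untouched, the function can be called repeatedly, for example with different prices.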