How do I find the Max or Min value from this dataset in Python?
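I'm working with an online dataset of global life expectancy, and I'm trying to find the maximum and minimum values in the life_expectancy column.
Here is the dataset: https://ourworldindata.org/spanish-flu-largest-influenza-pandemic-in-history
This is what I ended up with after trying mathematical equations and max() and min(), as suggested in other posts.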
with open('data/life-expectancy.csv') as life_expectancy:
    next(life_expectancy)
    for data in life_expectancy:
        clean_data = data.strip()
        split_data = clean_data.split(',')
        entity = split_data[0]
        code = split_data[1]
        year = split_data[2]
        expectancy = float(split_data[3])
print(f'The overall max life expectancy is: {max(split_data[3])}')
print(f'The overall min life expectancy is: {min(split_data[3])}')
What else should I add to actually get the correct result?
Current output:
The overall max life expectancy is: 9
The overall min life expectancy is: .
You want to build up lists as you loop, and then take the min/max of those lists afterwards.
with open('data/life-expectancy.csv') as life_expectancy:
    next(life_expectancy)
    entities = []
    codes = []
    years = []
    expectancies = []
    for data in life_expectancy:
        clean_data = data.strip()
        split_data = clean_data.split(',')
        entities.append(split_data[0])
        codes.append(split_data[1])
        years.append(split_data[2])
        expectancies.append(float(split_data[3]))
print(f'The overall max life expectancy is: {max(expectancies)}')
print(f'The overall min life expectancy is: {min(expectancies)}')
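As a side note on why the question's code printed 9 and a dot: max() and min() were called on a single string (the last row's fourth field), so they compared its characters rather than numbers. The short sketch below illustrates this; the value "61.49" is a hypothetical stand-in for whatever the last row contained, and the list of strings is likewise illustrative.

```python
# Hypothetical stand-in for split_data[3] from the last row that was read.
raw_value = "61.49"

# On a string, max()/min() iterate over the characters and compare code points.
largest_char = max(raw_value)
smallest_char = min(raw_value)
print(largest_char)   # 9
print(smallest_char)  # .

# Converting each field to float and collecting the results gives a numeric comparison.
values = [float(v) for v in ["61.49", "86.751", "17.76"]]
print(max(values))  # 86.751
print(min(values))  # 17.76
```

The dot sorts below every digit, which is why it shows up as the "minimum" of a decimal number stored as text.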
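You aren't doing anything with the data you are iterating over.
Once you store the data in a list, we can use min and max on the dataset. Using key with a lambda, we can make sure the result includes all the relevant data for that row, rather than storing only the max value.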
with open('life-expectancy.csv') as life_expectancy:
    next(life_expectancy)
    ## Create an empty list
    output = []
    for data in life_expectancy:
        clean_data = data.strip()
        split_data = clean_data.split(',')
        entity = split_data[0]
        code = split_data[1]
        year = split_data[2]
        expectancy = float(split_data[3])
        ## Append to the list
        output.append([entity, code, year, expectancy])

max_life = max(output, key=lambda x: x[3])
min_life = min(output, key=lambda x: x[3])
#['Monaco', 'MCO', '2019', 86.751]
#['Iceland', 'ISL', '1882', 17.76]
print(f'The overall max life expectancy is {max_life[3]} in {max_life[0]}')
print(f'The overall min life expectancy is {min_life[3]} in {min_life[0]}')
#The overall max life expectancy is 86.751 in Monaco
#The overall min life expectancy is 17.76 in Iceland
For readability, you could store the data as a list of dicts by modifying the following lines:
output.append({'entity': entity, 'code': code, 'year': year, 'expectancy': expectancy})
max_life = max(output, key=lambda x: x['expectancy'])
min_life = min(output, key=lambda x: x['expectancy'])
print(f'The overall max life expectancy is {max_life["expectancy"]} in {max_life["entity"]}')
print(f'The overall min life expectancy is {min_life["expectancy"]} in {min_life["entity"]}')
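The list-of-dicts idea can also be sketched with the standard library's csv.DictReader, which builds one dict per row and copes with quoted fields that contain commas. This is an alternative to the manual split(',') used above, not what the answers themselves did. The header names and the "Exampleland" row are assumptions made up for illustration, and io.StringIO stands in for the real file so the snippet runs on its own.

```python
import csv
import io

# Made-up sample standing in for life-expectancy.csv; header names are assumed.
sample = io.StringIO(
    "Entity,Code,Year,Life expectancy\n"
    "Monaco,MCO,2019,86.751\n"
    "Iceland,ISL,1882,17.76\n"
    "Exampleland,EXL,2000,50.0\n"  # hypothetical row, not real data
)

rows = []
for row in csv.DictReader(sample):
    # DictReader yields strings, so convert the numeric column before comparing.
    row["Life expectancy"] = float(row["Life expectancy"])
    rows.append(row)

max_life = max(rows, key=lambda r: r["Life expectancy"])
min_life = min(rows, key=lambda r: r["Life expectancy"])
print(f'The overall max life expectancy is {max_life["Life expectancy"]} in {max_life["Entity"]}')
print(f'The overall min life expectancy is {min_life["Life expectancy"]} in {min_life["Entity"]}')
```

With a real file, the StringIO object would be replaced by open('life-expectancy.csv', newline=''), and the column names would need to match the file's actual header.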