Print the average for each year from a txt file
For an assignment, I need to print the average rating for each year. I have the following:
A text file with more than 2000 lines:
Unit 42;2017;7.0
Love Your Garden;2011;8.0
Limmy's Show;2010;8.3
Nazi Megastructures;2013;8.0
Omniscient;2020;6.3
Green Frontier;2019;7.4
Los Briceño;2019;8.4
Aftermath;2014;
Sugar;2006;
Beyond Stranger Things;2017;
Men on a Mission;2018;
Click for Murder;2017;
As you can see, some movies have no rating, so those need to be ignored.
Now I need to output it like this:
2000: 1,1111
2001: 2,2222
etc up until 2020
I wrote the following code to extract the right parts from the txt file.
This is what I tried:
file = open("tv_shows.txt", "r", encoding='utf8')
#content = file.read()
result = {}
for line in file:
    year, number = line.split(';')[1], line.split(';')[2]
    if len(number) < 3:
        continue
    year = int(year)
    number = float(number)
    try:
        result[year].append(number)
    except KeyError:
        result[year] = [number]
for k, v in sorted(result.items()):
    print('{}: {:.4f}'.format(k, sum(v) / len(v)))
It gives me this, which is much better, but it raises a new question: how can I remove the extra zeros from the averages?
2000: 7.7000
2001: 7.4000
2002: 7.1000
2003: 7.0091
2004: 7.6667
2005: 7.7333
2006: 7.2579
2007: 7.5080
2008: 7.1630
2009: 7.3884
2010: 7.3904
2011: 7.3507
2012: 7.0787
2013: 7.0418
2014: 7.2427
2015: 7.2462
2016: 7.1730
2017: 7.1478
2018: 7.0034
2019: 7.1191
2020: 6.8130
If you are allowed to use pandas, then:
import pandas as pd

df = pd.read_csv("tv_shows.txt", delimiter=";", header=None,
                 names=['name', 'year', 'rating'])
df = df.dropna()  # drop the rows that have no rating
df.groupby(['year'])['rating'].mean().reset_index()
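How about keeping a dictionary whose keys are the years and whose values are lists of that year's ratings? Fill the dictionary as you loop (don't forget to convert str to float). Then, at the end, you can average each list.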
If you are not allowed to use pandas:
file = open("tv_shows.txt", "r", encoding='utf8')
years = {}
for a in file:
    _, year, number = a.split(';')
    if len(number) < 3:
        continue
    year = int(year)
    number = float(number)
    if year not in years:
        years[year] = []  # Add a new list to the years dict
    years[year].append(number)  # Append the current number to the correct list.
avgyears = {}
for year, numberlist in years.items():
    # iterate over the dict, find the mean of each list
    avgyears[year] = sum(numberlist) / len(numberlist)
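The same dictionary-of-lists idea can also be sketched with collections.defaultdict, which removes the need for the "if year not in years" check. This version splits from the right with str.rsplit, so a title that happens to contain a semicolon would not break the unpacking. The function name average_by_year and the inline sample lines are made up for illustration; they are not part of the original code.

```python
from collections import defaultdict


def average_by_year(lines):
    """Return {year: mean rating}, skipping lines that have no rating."""
    ratings = defaultdict(list)
    for line in lines:
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        # Split from the right so a ';' inside the title does no harm.
        _, year, rating = line.rsplit(";", 2)
        if not rating.strip():
            continue  # no rating on this line
        ratings[int(year)].append(float(rating))
    return {year: sum(vals) / len(vals) for year, vals in ratings.items()}


# Hypothetical sample in the same format as the file in the question.
sample = [
    "Unit 42;2017;7.0",
    "Green Frontier;2019;7.4",
    "Los Briceño;2019;8.4",
    "Aftermath;2014;",
    "Click for Murder;2017;",
]
for year, avg in sorted(average_by_year(sample).items()):
    print(f"{year}: {avg:.4f}")
```

For the real file, an open file object can be passed in place of the list, since iterating over a file yields its lines.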
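The question was edited while I was writing this answer. The revised question is: "How can I remove the extra zeros from the averages?"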
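The extra zeros are there because you asked Python to format the number to four decimal places. To remove zeros from the right side of a string, you can simply use str.rstrip():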
for year, numberlist in years.items():
    # iterate over the dict, find the mean of each list
    avgyears[year] = sum(numberlist) / len(numberlist)
    # strip trailing zeros, then a decimal point left dangling ("7." -> "7")
    num = f"{avgyears[year]:.4f}".rstrip("0").rstrip(".")
    print(f"{year}: {num}")
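As a rough sketch, the stripping can be wrapped in a small helper that also drops a decimal point left dangling when every decimal is zero, and can optionally swap in the decimal comma shown in the desired output of the question. The name format_average and its parameters places and decimal_sep are hypothetical, chosen only for this example.

```python
def format_average(value, places=4, decimal_sep="."):
    """Format value with at most `places` decimals and no trailing zeros."""
    text = f"{value:.{places}f}"
    if "." in text:
        # Strip zeros first, then a decimal point left with nothing after it.
        text = text.rstrip("0").rstrip(".")
    return text.replace(".", decimal_sep)


# Made-up averages, only to show the three cases.
for year, avg in [(2000, 7.7), (2003, 7.00909), (2018, 7.0)]:
    print(f"{year}: {format_average(avg)}")
```

The "." check matters when places is 0: without it, a value such as 70 would lose its final zero. Passing decimal_sep="," would give output in the "2000: 1,1111" style.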