How to read a CSV file and extract the most frequent value in it using statistics.mode() in Python?

Question

#!/bin/python3
import csv
import statistics
def read_cvs():
    with open('hw_25000.csv', 'r') as csv_rf:
        cvs_reader = csv.DictReader(csv_rf)
        for line in cvs_reader:
            print(line[' "Height(Inches)"'])
read_cvs()

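I have this code, which reads my file and prints the height values, but I'm not sure how to use statistics.mode() to print the most common height value.

The CSV file is available at https://people.sc.fsu.edu/~jburkardt/data/csv/csv.html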

Answer 1

Try this:

heights = [float(line[' "Height(Inches)"']) for line in cvs_reader]  # collect every value first
print(statistics.mode(heights))

Answer 2

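The header in this file has an extra space before each column name. It can be removed by passing the skipinitialspace=True option. Also, the CSV reader reads everything as strings, so the values need to be converted to floats.

Try the following: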
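
To see the effect of that leading space, here is a minimal sketch that uses a small in-memory sample instead of the real file. The two data rows are made up for the demonstration, and header_names is a hypothetical helper; only the header layout is meant to mirror hw_25000.csv.

```python
import csv
import io

# Made-up sample mimicking the header layout of hw_25000.csv
# (a space after each comma, column names in double quotes).
SAMPLE = '"Index", "Height(Inches)", "Weight(Pounds)"\n1, 65.78, 112.99\n2, 71.52, 136.49\n'

def header_names(skip_space):
    """Return the column names DictReader sees for the sample."""
    reader = csv.DictReader(io.StringIO(SAMPLE), skipinitialspace=skip_space)
    return reader.fieldnames

print(header_names(False))  # names keep the leading space and the quotes
print(header_names(True))   # clean names such as 'Height(Inches)'

# Values are still strings either way, so convert before doing arithmetic.
row = next(csv.DictReader(io.StringIO(SAMPLE), skipinitialspace=True))
height = float(row['Height(Inches)'])
```

Without the option, the key has to be written exactly as the reader sees it, which is why the code in the question uses ' "Height(Inches)"'.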

import csv
import statistics

def read_cvs():
    heights = []
    
    with open('hw_25000.csv', 'r') as csv_rf:
        cvs_reader = csv.DictReader(csv_rf, skipinitialspace=True)
        
        for line in cvs_reader:
            heights.append(float(line['Height(Inches)']))

    print(statistics.mode(heights))
    
read_cvs()

For your example CSV file, this gives:

70.04724

A shorter version is:

def read_cvs():
    with open('hw_25000.csv', 'r') as csv_rf:
        cvs_reader = csv.DictReader(csv_rf, skipinitialspace=True)
        print(statistics.mode(float(line['Height(Inches)']) for line in cvs_reader))
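
One caveat with measured values like these: several heights may tie for most frequent. On Python 3.8 and later, statistics.mode() returns the first of the tied values it encounters, while older versions raise StatisticsError. Here is a minimal sketch of listing every tied value with statistics.multimode() (Python 3.8+), using a short made-up list rather than the real column:

```python
import statistics
from collections import Counter

# Made-up heights for illustration; 68.5 and 70.1 both appear twice.
heights = [68.5, 70.1, 66.0, 70.1, 68.5]

# multimode() returns all values sharing the highest count,
# in the order they first appear in the data.
print(statistics.multimode(heights))

# Counter shows how often each value occurs, most frequent first.
counts = Counter(heights)
print(counts.most_common(2))
```

With the real file, the same calls could be applied to the heights list built in read_cvs() above.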

python

csv

python-3.x