Python: get the min, max and 95th percentile from a CSV

Question

I have a .csv file produced by perfmon. It contains 6000 records that look like this:

(PDH-CSV 4.0) (SA Pacific Standard Time)(300),"\server1\PhysicalDisk(_Total)\% Disk Read Time","\server1\PhysicalDisk(_Total)\% Disk Write Time"
10/30/2017 15:00:15.568," "," "
10/30/2017 15:00:30.530,"25.763655942362824","130.21748494987176"
10/30/2017 15:00:45.518,"25.591636684958058","135.81093813384427"

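I need to get the minimum, maximum and 95th percentile of columns 1 and 2. However, as a beginner, I can't get past the first hurdle, which is converting each value to a number: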

import csv
sum = 0
fila = 0

with open('datos_header.csv') as csvfile:
    leercsv = csv.reader(csvfile, delimiter = ',')
    csvfile.__next__()
    for col in leercsv:
        col1 = (col[1])
        subtot = float(col1 * 4)
#        fila = fila + 1
#        sum = col1 + float(col)

#tot = sum / fila
    print(subtot)

and I get:

Traceback (most recent call last):
  File "", line 10, in
ValueError: could not convert string to float:

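I have tried:
- removing the header
- removing every non-numeric character such as / or : with a regular expression
- removing the blank values

That said:

Apart from the error, do you think my approach to getting the min, max and 95th percentile is on the right track?
If so, what do I need to change in my code to convert the strings to floats?
If not, could you help?

Thanks!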

Answer 1

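You have to check the string before converting it to a float. You could try:

for col in leercsv:
    col1 = col[1]
    # Skip blank cells such as " " (a bare `if col1:` would let them through),
    # and convert to float before multiplying
    if col1.strip():
        subtot = float(col1) * 4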
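
To illustrate why the conversion has to come before the multiplication, here is a small sketch using one value from the sample rows. The variable names (`cell`, `repeated`, `scaled`, `error_message`) are made up for this illustration and are not part of the original code.

```python
# One numeric cell taken from the sample rows in the question.
cell = "25.763655942362824"

# Multiplying a str repeats the text; it does not scale the number.
repeated = cell * 2
print(repeated)  # the same digits written twice, which float() cannot parse

# Convert first, then multiply.
scaled = float(cell) * 4
print(scaled)

# The first data row holds " " in both columns. A string containing only
# spaces is still truthy, and float() rejects it with a ValueError.
blank = " "
print(bool(blank))  # True, so `if blank:` alone does not filter it out

try:
    float(blank * 4)
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(error_message)
```

This matches the traceback in the question: the failing call is the one on the blank first row, before any numeric row is reached.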

A more robust solution:

for col in leercsv:
    col1 = col[1]
    try:
        subtot = float(col1) * 4
    except ValueError:
        continue  # skip cells that are not numeric, e.g. " "
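
Once the blank cells are skipped, the min, max and 95th percentile the question asks for can be computed per column. The following is only a sketch under some assumptions: the helper names (`parse_column`, `percentile_nearest_rank`, `summarize`) are hypothetical, the percentile uses the nearest-rank definition (other definitions interpolate and can give slightly different values), and the embedded `SAMPLE` is just the three rows shown in the question.

```python
import csv
import io
import math

# The header and rows quoted in the question (raw string keeps the backslashes).
SAMPLE = r'''(PDH-CSV 4.0) (SA Pacific Standard Time)(300),"\server1\PhysicalDisk(_Total)\% Disk Read Time","\server1\PhysicalDisk(_Total)\% Disk Write Time"
10/30/2017 15:00:15.568," "," "
10/30/2017 15:00:30.530,"25.763655942362824","130.21748494987176"
10/30/2017 15:00:45.518,"25.591636684958058","135.81093813384427"
'''


def parse_column(rows, index):
    """Return the floats found in column `index`, skipping blank or non-numeric cells."""
    values = []
    for row in rows:
        try:
            values.append(float(row[index]))
        except (ValueError, IndexError):
            continue  # e.g. the " " placeholders in the first perfmon row
    return values


def percentile_nearest_rank(values, pct):
    """Nearest-rank percentile: the smallest value with at least pct% of the data at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def summarize(csvfile, columns=(1, 2)):
    """Map each column index to a (min, max, 95th percentile) tuple."""
    reader = csv.reader(csvfile)
    next(reader)  # skip the header row
    rows = list(reader)
    stats = {}
    for index in columns:
        values = parse_column(rows, index)
        if values:  # a column with no numeric cells is left out
            stats[index] = (min(values), max(values),
                            percentile_nearest_rank(values, 95))
    return stats


stats = summarize(io.StringIO(SAMPLE))
for index, (low, high, p95) in stats.items():
    print(f"column {index}: min={low} max={high} p95={p95}")
```

For the real file, the same function can be called as `with open('datos_header.csv', newline='') as csvfile: summarize(csvfile)`. Collecting the values in a list first, instead of keeping a running sum, is what makes the percentile possible, since it needs the sorted data.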
