Lua - How to analyse a .csv export to show the highest, lowest and average values etc

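Using Lua, I am downloading a .csv file and then taking the first and last lines, so that I can visually validate the time period from the start and end date/times they contain.

I would also like to go through the values and create a variety of variables, e.g. the highest, lowest and average value reported during that period.

The .csv is formatted as follows: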

created_at,entry_id,field1,field2,field3,field4,field5,field6,field7,field8
2021-04-16 20:18:11 UTC,6097,17.5,21.1,20,20,19.5,16.1,6.7,15.10
2021-04-16 20:48:11 UTC,6098,17.5,21.1,20,20,19.5,16.3,6.1,14.30
2021-04-16 21:18:11 UTC,6099,17.5,21.1,20,20,19.6,17.2,5.5,14.30
2021-04-16 21:48:11 UTC,6100,17.5,21,20,20,19.4,17.9,4.9,13.40
2021-04-16 22:18:11 UTC,6101,17.5,20.8,20,20,19.1,18.5,4.4,13.40
2021-04-16 22:48:11 UTC,6102,17.5,20.6,20,20,18.7,18.9,3.9,12.40
2021-04-16 23:18:11 UTC,6103,17.5,20.4,19.5,20,18.4,19.2,3.5,12.40

And my code to get the first and last lines is as follows:

print("Part 1")
print("Start : check 2nd and last row of csv")
local ctr = 0
local i = 0
local csvfilename = "/home/pi/shared/feed12hr.csv"

-- first pass: count the lines (io.lines opens and closes the file itself)
for _ in io.lines(csvfilename) do ctr = ctr + 1 end
print("......  Count : Number of lines downloaded = " .. ctr)

-- second pass: pick out the 2nd line (first data row) and the last line
local linenumbera = 2
local linenumberb = ctr
for line in io.lines(csvfilename) do
  i = i + 1
  if i == linenumbera then
    secondline = line
    print("......  2nd Line is = " .. secondline)
  end
  if i == linenumberb then
    lastline = line
    print("......  Last line is = " .. lastline)
  end
end
print("End : Extracted 2nd and last row of csv")

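But I am now planning to select a column, ideally by name (as I'd like to be able to use this against other .csv exports with a similar structure), and get the .csv into a table/array...

I found an option for that here - Csv file to a Lua table and access the lines as new table or function()

See below..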


#!/usr/bin/lua

print("Part 2") 
print("Start : Convert .csv to table")

local csvfilename = "/home/pi/shared/feed12hr.csv"
local csv = io.open(csvfilename, "r")

local items = {}                      -- Store our values here
local headers = {}                    -- 
local first = true
for line in csv:gmatch("[^\n]+") do
  if first then                       -- this is to handle the first line and capture our headers.
    local count = 1
    for header in line:gmatch("[^,]+") do 
      headers[count] = header
      count = count + 1
    end
    first = false                     -- set first to false to switch off the header block
  else
    local name
    local i = 2                       -- We start at 2 because we wont be increment for the header
    for field in line:gmatch("[^,]+") do
      name = name or field            -- check if we know the name of our row
      if items[name] then             -- if the name is already in the items table then this is a field
        items[name][headers[i]] = field -- assign our value at the header in the table with the given name.
        i = i + 1
      else                            -- if the name is not in the table we create a new index for it
        items[name] = {}
      end
    end
  end
end

print("End : .csv now in table/array structure")

But I get the following error message:

pi@raspberrypi:/ $ lua home/pi/Documents/csv_to_table.lua
Part 2
Start : Convert .csv to table
lua: home/pi/Documents/csv_to_table.lua:12: attempt to call method 'gmatch' (a nil value)
stack traceback:
        home/pi/Documents/csv_to_table.lua:12: in main chunk
        [C]: ?
pi@raspberrypi:/ $

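Any ideas? I can confirm that the .csv file exists.

Once everything is (hopefully) in a table, I want to be able to generate a list of variables based on the information in a chosen column, which I can then use and send in a push notification or email (I already have the code for that).

The following is what I've been able to create so far, but I'd appreciate any/all help with doing more analysis of the values in the chosen column, so I can see things like the highest, lowest and average values, etc.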

print("Part 3") 
print("Start : Create .csv analysis values/variables")

local total = 0 
local count = 0
for name, item in pairs(items) do
  for field, value in pairs(item) do
    if field == "cabin" then
      print(field .. " = " .. value)
      total = total + value
      count = count + 1
    end
  end
end

local average = total / count
-- round down to two decimal places
local roundupdown = math.floor(average * 100) / 100
print(count)
print(total)
print(total/count)
print(roundupdown)

print("End : analysis values/variables created")

io.open returns a file handle on success, not a string.

Hence

local csv = io.open(csvfilename, "r")
--...
  for line in csv:gmatch("[^\n]+") do
--...

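will raise an error: gmatch is a string method, and a file handle has no such method.

You need to read the file into a string first.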
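A minimal sketch of that first option, assuming the file is small enough to hold in memory. The helper name readAll is made up for illustration; file:read("*a") is the standard call that returns the rest of the file as one string, on which gmatch then works. So that the snippet runs on its own, it writes a small sample to a temporary file; in the question's script you would pass csvfilename instead.

```lua
-- Hypothetical helper: read a whole file into one string.
local function readAll(path)
  local file, err = io.open(path, "r")
  if not file then
    return nil, err                 -- io.open returns nil plus a message on failure
  end
  local content = file:read("*a")   -- "*a" reads everything up to end of file
  file:close()
  return content
end

-- Sample data written to a temporary file so the sketch is self-contained.
local samplePath = os.tmpname()
local out = assert(io.open(samplePath, "w"))
out:write("created_at,entry_id,field1\n")
out:write("2021-04-16 20:18:11 UTC,6097,17.5\n")
out:write("2021-04-16 20:48:11 UTC,6098,17.5\n")
out:close()

local csv = assert(readAll(samplePath))
os.remove(samplePath)

local lines = {}
for line in csv:gmatch("[^\n]+") do  -- csv is a string now, so gmatch exists
  lines[#lines + 1] = line
end

print("Number of lines = " .. #lines)
```

Checking the result of io.open before using it also answers the "does the file exist" doubt directly, because a failed open comes back as nil plus an error message.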

Alternatively, you can use file:lines(...) or io.lines to iterate over the lines of a file, as you already do in your code.

local csv = io.open(csvfilename, "r")
if csv then
  for line in csv:lines() do
-- ...

You're iterating over the file more often than necessary.
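As a sketch of what a single pass could look like for Part 1 of the question: one loop can count the lines and remember the second and the last one at the same time. The function name scanLines is hypothetical, and the temporary sample file merely stands in for feed12hr.csv.

```lua
-- Hypothetical helper: one pass over the file, returning the line count,
-- the second line (the first data row) and the last line.
local function scanLines(path)
  local count, second, last = 0, nil, nil
  for line in io.lines(path) do      -- io.lines closes the file when the loop ends
    count = count + 1
    if count == 2 then second = line end
    last = line                      -- overwritten each time; ends up as the last line
  end
  return count, second, last
end

-- Sample data written to a temporary file so the sketch is self-contained.
local samplePath = os.tmpname()
local out = assert(io.open(samplePath, "w"))
out:write("created_at,entry_id,field1\n")
out:write("2021-04-16 20:18:11 UTC,6097,17.5\n")
out:write("2021-04-16 20:48:11 UTC,6098,17.5\n")
out:write("2021-04-16 21:18:11 UTC,6099,17.5\n")
out:close()

local count, secondLine, lastLine = scanLines(samplePath)
os.remove(samplePath)

print("Number of lines = " .. count)
print("2nd line  = " .. secondLine)
print("Last line = " .. lastLine)
```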

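Edit:

This is how you could fill a data table while calculating the maximum of each column on the fly. This assumes you always have valid lines! A proper solution should validate the data.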

-- prepare a table to store the minima and maxima in
local colExtrema = {min = {}, max = {}}
local rows = {}
-- open the file (csvfilename as defined in the question)
local csvFile = assert(io.open(csvfilename, "r"))
-- go over the file linewise
for line in csvFile:lines() do
  -- split the line into 3 parts
  local timeStamp, id, dataStr = line:match("([^,]+),(%d+),(.*)")
  -- lines that do not match (such as the header line) are skipped
  if timeStamp then
    -- create a row container
    local row = {timeStamp = timeStamp, id = id, data = {}}
    -- fill the row data
    for val in dataStr:gmatch("[%d%.]+") do
      -- convert the matched text to a number before storing and comparing it
      local num = tonumber(val)
      table.insert(row.data, num)
      local col = #row.data
      -- find the biggest value so far
      -- our initial value is the smallest number possible
      local oldMax = colExtrema.max[col] or -math.huge
      -- store the bigger value as the new maximum
      colExtrema.max[col] = math.max(num, oldMax)
    end
    -- insert row data
    table.insert(rows, row)
  end
end
csvFile:close()
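To get closer to what the question asks for (a column chosen by name, with its highest, lowest and average value), here is one possible sketch. The helper names splitLine and columnStats are made up for illustration, and the sample rows are the first three data rows from the question. It assumes a plain CSV with no quoted or empty fields, since the pattern "[^,]+" skips empty fields and would shift the columns.

```lua
-- Hypothetical helper: split one CSV line on commas (no quoted/empty fields).
local function splitLine(line)
  local fields = {}
  for field in line:gmatch("[^,]+") do
    fields[#fields + 1] = field
  end
  return fields
end

-- Hypothetical helper: min, max, average and count for a column given by name.
local function columnStats(path, columnName)
  local file, err = io.open(path, "r")
  if not file then return nil, err end

  -- Find the position of the wanted column in the header line.
  local colIndex
  for i, header in ipairs(splitLine(file:read("*l") or "")) do
    if header == columnName then colIndex = i end
  end
  if not colIndex then
    file:close()
    return nil, "no column named " .. columnName
  end

  local min, max = math.huge, -math.huge
  local total, count = 0, 0
  for line in file:lines() do
    local value = tonumber(splitLine(line)[colIndex])
    if value then                    -- rows without a numeric value are skipped
      min = math.min(min, value)
      max = math.max(max, value)
      total = total + value
      count = count + 1
    end
  end
  file:close()

  if count == 0 then
    return nil, "no numeric values in column " .. columnName
  end
  return {min = min, max = max, average = total / count, count = count}
end

-- Sample data written to a temporary file so the sketch is self-contained.
local samplePath = os.tmpname()
local out = assert(io.open(samplePath, "w"))
out:write("created_at,entry_id,field1,field2,field3,field4,field5,field6,field7,field8\n")
out:write("2021-04-16 20:18:11 UTC,6097,17.5,21.1,20,20,19.5,16.1,6.7,15.10\n")
out:write("2021-04-16 20:48:11 UTC,6098,17.5,21.1,20,20,19.5,16.3,6.1,14.30\n")
out:write("2021-04-16 21:18:11 UTC,6099,17.5,21.1,20,20,19.6,17.2,5.5,14.30\n")
out:close()

local stats = assert(columnStats(samplePath, "field6"))
os.remove(samplePath)

print(string.format("min = %.2f, max = %.2f, average = %.2f (%d values)",
                    stats.min, stats.max, stats.average, stats.count))
```

Because the column is looked up in the header line, the same function should work on other exports with a similar layout; only the name passed in changes.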