awk only works on copied data, why?
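I have a slightly simplified awk script for the purpose described here:
Append multiple header information fields to a file until the next header is found
The awk script only works on the data after I copy/paste it into a new file. For example, if I redirect the output of head into a new file, awk still does not work; it only works once I copy/paste the file's contents into a new file.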
`head -40 file.csv > output.csv`
Here is the awk:
`awk -F, '/"Serial No."/ {sn = $2}
/"Location:"/ {loc = $2}
/"([0-9]{1,2}\/){2}[0-9]{4} [0-9]{2}:[0-9]{2}"/ {$0 = loc FS sn FS $0}1' file.csv > master1.csv`
If I copy/paste the data and compare it with the original, the output says that every line differs but does not show where. The diff between the head output and the copy/pasted file is:
`diff trap4_top.csv trap4_again.csv
1,25c1,24
< "Serial No.","0700000036022821"
< "Location:","LS_trap_2c"
< "High temperature limit (�C)",20
< "Low temperature limit (�C)",0
< "Date - Time","Temperature (�C)"
< "5/28/2015 08:00",24.0
< "5/28/2015 10:00",29.5
< "5/28/2015 12:00",28.0
< "5/28/2015 14:00",28.5
< "5/28/2015 16:00",27.0
< "5/28/2015 18:00",24.5
< "5/28/2015 20:00",23.0
< "5/28/2015 22:00",22.5
< "5/29/2015 00:00",21.5
< "5/29/2015 02:00",21.0
< "5/29/2015 04:00",20.0
< "5/29/2015 06:00",20.0
< "5/29/2015 08:00",24.5
< "5/29/2015 10:00",26.0
< "5/29/2015 12:00",27.5
< "5/29/2015 14:00",30.0
< "5/29/2015 16:00",29.0
< "5/29/2015 18:00",25.5
< "5/29/2015 20:00",23.5
< "5/29/2015 22:00",23.0
---
> "Serial No.","0700000036022821"
> "Location:","LS_trap_2c"
> "High temperature limit (°C)",20
> "Low temperature limit (°C)",0
> "Date - Time","Temperature (°C)"
> "5/28/2015 08:00",24.0
> "5/28/2015 10:00",29.5
> "5/28/2015 12:00",28.0
> "5/28/2015 14:00",28.5
> "5/28/2015 16:00",27.0
> "5/28/2015 18:00",24.5
> "5/28/2015 20:00",23.0
> "5/28/2015 22:00",22.5
> "5/29/2015 00:00",21.5
> "5/29/2015 02:00",21.0
> "5/29/2015 04:00",20.0
> "5/29/2015 06:00",20.0
> "5/29/2015 08:00",24.5
> "5/29/2015 10:00",26.0
> "5/29/2015 12:00",27.5
> "5/29/2015 14:00",30.0
> "5/29/2015 16:00",29.0
> "5/29/2015 18:00",25.5
> "5/29/2015 20:00",23.5`
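I can see special characters in the diff, but I don't know whether they are involved, or how to remove them other than by copy/pasting, which is what I have done so far.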
head trap4.csv | cat -vte
gives:
"Serial No.","0700000036022821"^M$
"Location:","LS_trap_2c"^M$
"High temperature limit (M-0C)",20^M$
"Low temperature limit (M-0C)",0^M$
"Date - Time","Temperature (M-0C)"^M$
"5/28/2015 08:00",24.0^M$
"5/28/2015 10:00",29.5^M$
"5/28/2015 12:00",28.0^M$
"5/28/2015 14:00",28.5^M$
"5/28/2015 16:00",27.0^M$
OK, as I suspected, your input file has DOS line endings, i.e. `\r` or `^M` (as shown above).
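To make the effect of those carriage returns concrete, here is a minimal sketch. The helper name `count_cr_lines` and the inline sample data are illustrative assumptions, not part of the original post. It counts the records that end in `\r`, and shows that with `-F,` the carriage return stays attached to the last field, so it is carried along whenever `$2` is stored and reused.

```shell
#!/bin/sh
# Hypothetical helper: count the input records that end with a carriage return.
count_cr_lines() {
    awk '/\r$/ { n++ } END { print n + 0 }' "$@"
}

# Two CRLF-terminated lines and one LF-terminated line.
printf 'a,b\r\nc,d\r\ne,f\n' | count_cr_lines
# prints 2

# The CR is part of the last field: 12 visible characters plus the CR.
printf '"Location:","LS_trap_2c"\r\n' | awk -F, '{ print length($2) }'
# prints 13
```

A non-zero count from the first command is a strong hint that the file came from a DOS/Windows tool.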
You should convert your input file to Unix line endings by running:
dos2unix file.csv
Otherwise you can strip the carriage returns in the pipeline:
head -40 file.csv | sed 's/\r//' | awk ...
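If neither `dos2unix` nor an extra `sed` stage is convenient, the carriage return can also be removed inside awk itself. The following is only a sketch under a few assumptions: the function name `tag_rows` is made up for illustration, the sample rows are copied from the question, and the date pattern is loosened to `[0-9]+` so that it does not depend on interval-expression support in the local awk.

```shell
#!/bin/sh
# Hypothetical helper: prefix every data row with the current location and
# serial number, after stripping a trailing carriage return from each record.
tag_rows() {
    awk -F, '
        { sub(/\r$/, "") }              # drop the DOS CR; awk re-splits the fields
        /"Serial No\."/ { sn = $2 }     # remember the serial number
        /"Location:"/   { loc = $2 }    # remember the location
        /^"[0-9]+\/[0-9]+\/[0-9]+ [0-9]+:[0-9]+"/ { $0 = loc FS sn FS $0 }
        1                               # print every (possibly modified) record
    ' "$@"
}

# Feed a small CRLF sample through the helper on stdin.
printf '"Serial No.","0700000036022821"\r\n"Location:","LS_trap_2c"\r\n"5/28/2015 08:00",24.0\r\n' | tag_rows
```

Because `sub()` runs as the first rule, every later rule sees a record without the `\r`, so the stored `sn` and `loc` values no longer carry a carriage return into the output line.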