How to ignore any particular column data from a csv file using the linux cli?

Question

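I have 9 columns, c1 c2 c3 c4 c5 c6 c7 c8 c9, and I want to fetch the values of c1 c2 c3 c4 c5 and c9.

The columns contain data in the CSV format below. How can I do this from the CLI on Linux? Please help.

Sample data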

123,B006195,T,O,INDIVIDUAL,25^5820200^,2018-04-25,13,NEW
12,C06195,T,O,INDIVIDUAL,25^5820200^,2018-04-25,13,NEW
12345,B00619,T,O,IND,25^5820200^,2018-04-25,13,OLD

I have tried using cat file.csv | awk '{print ,,,,}' > newfile

Answer 1

I am not sure what you mean by "cat the value of c1 c2 c3 c4 c5 and c9", but if you only want to filter those columns, you can use the following awk command:

awk 'BEGIN{OFS=FS=","}{print $1,$2,$3,$4,$5,$9}' sample.csv

Input:

more sample.csv 
c1,c2,c3,c4,c5,c6,c7,c8,c9
123,B006195,T,O,INDIVIDUAL,25^5820200^,2018-04-25,13,NEW
12,C06195,T,O,INDIVIDUAL,25^5820200^,2018-04-25,13,NEW
12345,B00619,T,O,IND,25^5820200^,2018-04-25,13,OLD

Output:

awk 'BEGIN{OFS=FS=","}{print $1,$2,$3,$4,$5,$9}' sample.csv 
c1,c2,c3,c4,c5,c9
123,B006195,T,O,INDIVIDUAL,NEW
12,C06195,T,O,INDIVIDUAL,NEW
12345,B00619,T,O,IND,OLD

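Explanation:

You define , as the field separator for both input and output (BEGIN{OFS=FS=","}), then for each line you print only the columns you want to display ({print $1,$2,$3,$4,$5,$9}). After that, you can redirect the output to a new csv file.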
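
The redirection step could look like the sketch below. The names sample.csv and newfile.csv are assumptions for illustration, and the sample input is recreated in a temporary directory only so that the snippet runs on its own.

```shell
# Work in a throwaway directory so no existing file is overwritten.
workdir=$(mktemp -d)

# Recreate the sample input from the question (file name is an assumption).
cat > "$workdir/sample.csv" <<'EOF'
c1,c2,c3,c4,c5,c6,c7,c8,c9
123,B006195,T,O,INDIVIDUAL,25^5820200^,2018-04-25,13,NEW
12,C06195,T,O,INDIVIDUAL,25^5820200^,2018-04-25,13,NEW
12345,B00619,T,O,IND,25^5820200^,2018-04-25,13,OLD
EOF

# Keep columns 1 to 5 and 9, and write the result to a new file.
awk 'BEGIN{OFS=FS=","}{print $1,$2,$3,$4,$5,$9}' "$workdir/sample.csv" > "$workdir/newfile.csv"

# Show what was written.
cat "$workdir/newfile.csv"
```

Note that this simple comma split does not understand quoted CSV fields that themselves contain commas; the sample data has none, so it is safe here.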

If you think awk is overkill for this task, you can also just use the cut command (-d',' defines , as the delimiter and -f... specifies the fields to keep):

$ cut -d',' -f1,2,3,4,5,9 sample.csv
c1,c2,c3,c4,c5,c9
123,B006195,T,O,INDIVIDUAL,NEW
12,C06195,T,O,INDIVIDUAL,NEW
12345,B00619,T,O,IND,OLD
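
Since the goal is really to ignore c6 to c8, the same cut call can also be written with a field range, or inverted with --complement. This is only a sketch: --complement is a GNU coreutils extension, so the second form assumes GNU cut (the usual cut on Linux), and the temporary file stands in for sample.csv.

```shell
# Recreate the sample input in a temporary file (stand-in for sample.csv).
sample=$(mktemp)
cat > "$sample" <<'EOF'
c1,c2,c3,c4,c5,c6,c7,c8,c9
123,B006195,T,O,INDIVIDUAL,25^5820200^,2018-04-25,13,NEW
12,C06195,T,O,INDIVIDUAL,25^5820200^,2018-04-25,13,NEW
12345,B00619,T,O,IND,25^5820200^,2018-04-25,13,OLD
EOF

# Portable form: list the fields to keep, using a range for 1 through 5.
by_range=$(cut -d',' -f1-5,9 "$sample")

# GNU-only form: name the fields to drop and invert the selection.
by_complement=$(cut -d',' --complement -f6-8 "$sample")

printf '%s\n' "$by_range"
rm -f "$sample"
```

Both variables should hold the same four lines; the --complement form is convenient when the file has many columns and only a few need to be dropped.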

Answer 2

The following solution may help you. You provide the field numbers in an awk variable named fields, and the script prints those fields.

awk -F, -v fields="1,2,3,4,5,9" 'BEGIN{num=split(fields, array,",")} {for(i=1;i<=num;i++){printf("%s%s",$array[i],i==num?ORS:OFS)}}' OFS=,   Input_file

Adding a non-one-liner form of the solution too.

awk -F, -v fields="1,2,3,4,5,9" '
BEGIN{
  num=split(fields, array,",")}
{
  for(i=1;i<=num;i++){
    printf("%s%s",$array[i],i==num?ORS:OFS)}}
' OFS=,   Input_file

Explanation of the above code:

awk -F, -v fields="1,2,3,4,5,9" '              ##Set the field separator to a comma with -F and store the wanted field numbers in the variable fields.
BEGIN{                                         ##The BEGIN section runs once, before any line of Input_file is read.
  num=split(fields, array,",")}                ##split() breaks fields on commas into the array named array and returns the number of elements, stored in num.
{
  for(i=1;i<=num;i++){                         ##Loop with i running from 1 to num.
    printf("%s%s",$array[i],i==num?ORS:OFS)}}  ##$array[i] is the field of the current line whose number is array[i]; print it followed by ORS (newline) if i equals num, otherwise by OFS (comma).
' OFS=,  Input_file                            ##Set OFS to a comma and name the Input_file to read.
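
If the column list changes often, the same awk program can be wrapped in a small shell function. This is a minimal sketch, not part of the original answer: the name select_fields is hypothetical, and OFS is passed with -v instead of as a trailing assignment, which has the same effect here.

```shell
# select_fields is a hypothetical helper name used only for illustration.
# Usage: select_fields "1,2,3,4,5,9" file.csv
select_fields() {
  awk -F, -v OFS=, -v fields="$1" '
    BEGIN { num = split(fields, array, ",") }   # array[1..num] holds the wanted field numbers
    {
      # Print each wanted field, with OFS between fields and ORS after the last one.
      for (i = 1; i <= num; i++)
        printf("%s%s", $array[i], (i == num ? ORS : OFS))
    }' "$2"
}

# Passing "-" as the file name makes awk read standard input.
# Fields come out in the order listed, so this prints: NEW,123
printf '%s\n' '123,B006195,T,O,INDIVIDUAL,25^5820200^,2018-04-25,13,NEW' | select_fields "9,1" -
```

Because the loop follows the order of the fields variable, this form can also reorder columns, which cut cannot do.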
