How can I ignore particular columns of a CSV file using the Linux CLI?
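I have 9 columns, c1 c2 c3 c4 c5 c6 c7 c8 c9, and I want to cat the values of c1 c2 c3 c4 c5 and c9.
The columns contain data in the CSV format shown below. How can I do this from the CLI on Linux? Please help.
Sample data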
123,B006195,T,O,INDIVIDUAL,25^5820200^,2018-04-25,13,NEW
12,C06195,T,O,INDIVIDUAL,25^5820200^,2018-04-25,13,NEW
12345,B00619,T,O,IND,25^5820200^,2018-04-25,13,OLD
I tried cat file.csv | awk '{print $1,$2,$3,$4,$5,$9}' > newfile
I'm not sure what you mean by "cat the value of c1 c2 c3 c4 c5 and c9", but if you only want to filter those columns, you can use the following awk command:
awk 'BEGIN{OFS=FS=","}{print $1,$2,$3,$4,$5,$9}' sample.csv
Input:
more sample.csv
c1,c2,c3,c4,c5,c6,c7,c8,c9
123,B006195,T,O,INDIVIDUAL,25^5820200^,2018-04-25,13,NEW
12,C06195,T,O,INDIVIDUAL,25^5820200^,2018-04-25,13,NEW
12345,B00619,T,O,IND,25^5820200^,2018-04-25,13,OLD
Output:
awk 'BEGIN{OFS=FS=","}{print $1,$2,$3,$4,$5,$9}' sample.csv
c1,c2,c3,c4,c5,c9
123,B006195,T,O,INDIVIDUAL,NEW
12,C06195,T,O,INDIVIDUAL,NEW
12345,B00619,T,O,IND,OLD
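Explanation:
BEGIN{OFS=FS=","} defines , as the field separator for both input (FS) and output (OFS). For each line, {print $1,$2,$3,$4,$5,$9} then prints only the columns you want to keep. Redirect the output to a new csv file to save the result.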
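To make the role of the field separator concrete, here is a minimal sketch that runs the attempt from the question next to the corrected command. It assumes the three sample rows are stored in sample.csv in the current directory and reuses the output name newfile from the question; both file names are only illustrative.

```shell
# Recreate the sample rows from the question in sample.csv.
cat > sample.csv <<'EOF'
123,B006195,T,O,INDIVIDUAL,25^5820200^,2018-04-25,13,NEW
12,C06195,T,O,INDIVIDUAL,25^5820200^,2018-04-25,13,NEW
12345,B00619,T,O,IND,25^5820200^,2018-04-25,13,OLD
EOF

# No field separator given: awk splits on whitespace, so the whole row lands
# in $1 and $2..$9 are empty. The unwanted columns are still printed.
awk '{print $1,$2,$3,$4,$5,$9}' sample.csv

# FS="," splits each row on commas; OFS="," joins the printed fields with
# commas again. The result is redirected to a new file.
awk 'BEGIN{OFS=FS=","}{print $1,$2,$3,$4,$5,$9}' sample.csv > newfile

cat newfile
```

The only difference between the two awk calls is the BEGIN block, which is probably what the original attempt was missing.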
If you think awk is overkill for this task, you can also just use the cut command (-d',' sets , as the delimiter and -f... lists the fields to keep):
$ cut -d',' -f1,2,3,4,5,9 sample.csv
c1,c2,c3,c4,c5,c9
123,B006195,T,O,INDIVIDUAL,NEW
12,C06195,T,O,INDIVIDUAL,NEW
12345,B00619,T,O,IND,OLD
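Two variations on the cut command may be worth knowing; this is a sketch, not part of the original answer. The field list accepts ranges, so -f1-5,9 is equivalent to -f1,2,3,4,5,9. And since the question is phrased as ignoring columns, GNU cut can invert the selection with --complement. That option is a GNU coreutils extension, so the second command assumes GNU cut is installed (the usual case on Linux).

```shell
# Recreate sample.csv with the header row used in the answer above.
cat > sample.csv <<'EOF'
c1,c2,c3,c4,c5,c6,c7,c8,c9
123,B006195,T,O,INDIVIDUAL,25^5820200^,2018-04-25,13,NEW
12,C06195,T,O,INDIVIDUAL,25^5820200^,2018-04-25,13,NEW
12345,B00619,T,O,IND,25^5820200^,2018-04-25,13,OLD
EOF

# Keep fields 1 through 5 and field 9, written as a range.
cut -d',' -f1-5,9 sample.csv

# GNU cut only: name the fields to drop (6 through 8) and keep the rest.
cut -d',' --complement -f6-8 sample.csv
```

Both commands should print the same four lines as the cut example above.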
The following solution may help you. You supply the field numbers in an awk variable named fields, and it prints those fields.
awk -F, -v fields="1,2,3,4,5,9" 'BEGIN{num=split(fields, array,",")} {for(i=1;i<=num;i++){printf("%s%s",$array[i],i==num?ORS:OFS)}}' OFS=, Input_file
Adding a non-one-liner form of the solution too.
awk -F, -v fields="1,2,3,4,5,9" '
BEGIN{
num=split(fields, array,",")}
{
for(i=1;i<=num;i++){
printf("%s%s",$array[i],i==num?ORS:OFS)}}
' OFS=, Input_file
Explanation of the above code:
awk -F, -v fields="1,2,3,4,5,9" ' ##Setting the field separator to a comma with -F. Setting a variable named fields to the numbers of the fields we need.
BEGIN{ ##Starting the BEGIN section, which awk executes before reading Input_file.
num=split(fields, array,",")} ##Using split to break the variable fields into an array named array; num holds the number of elements in the array.
{
for(i=1;i<=num;i++){ ##Starting a for loop in which the variable i runs from 1 up to the value of num.
printf("%s%s",$array[i],i==num?ORS:OFS)}} ##array[i] is a field number, so $array[i] prints that field's value from the current line. If i equals num, a newline (ORS) follows; otherwise the output separator OFS (a comma here).
' OFS=, Input_file ##Setting OFS to a comma and mentioning the Input_file name here.
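Because the field list lives in a variable, the same awk program can be wrapped in a small shell function and reused with different lists. The sketch below assumes a POSIX shell; the function name pick_fields and the file name sample.csv are hypothetical and not part of the original answer.

```shell
# pick_fields is a hypothetical helper name used only for this sketch.
# $1: comma-separated field numbers, $2: input CSV file.
pick_fields() {
    awk -F, -v OFS=, -v fields="$1" '
    BEGIN { num = split(fields, array, ",") }   # array[1..num] = field numbers
    {
        for (i = 1; i <= num; i++)
            # Print field number array[i]; end the line after the last one.
            printf("%s%s", $(array[i]), i == num ? ORS : OFS)
    }' "$2"
}

# Sample rows from the question.
cat > sample.csv <<'EOF'
123,B006195,T,O,INDIVIDUAL,25^5820200^,2018-04-25,13,NEW
12,C06195,T,O,INDIVIDUAL,25^5820200^,2018-04-25,13,NEW
12345,B00619,T,O,IND,25^5820200^,2018-04-25,13,OLD
EOF

pick_fields 1,2,3,4,5,9 sample.csv   # the columns asked for in the question
pick_fields 9,1 sample.csv           # fields come out in the order listed
```

Unlike cut, which always emits fields in file order, this loop follows the order of the list, so it can also reorder columns.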