Perl To Parse Whitespace Separated Columns
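I have a large text file that contains three columns, each separated by four spaces. I need a Perl script that reads this text file and writes columns #1 and #2 to a new text file, with each column wrapped in quotes and the columns separated by commas.
The text file with the three columns contains data like the following: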
9a2ba3c0580b5f3799ad9d6f487b2d3 /folder1/folder2/folder3/folder4/folder5/folder6/folder7_name_PC/images/filename.jpg HOST
I want the output to look like this:
"9a2ba3c0580b5f3799ad9d6f487b2d38","/folder1/folder2/folder3/folder4/folder5/folder6/folder7_name_PC/images/filename.jpg"
Here is the code for reference:
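#!/usr/bin/perl
use strict;
use warnings;

# Input file: first command-line argument, or filename.txt by default.
my $defaultFileName = defined $ARGV[0] ? $ARGV[0] : "filename.txt";
die "Could not find file: $defaultFileName" unless -f $defaultFileName;
open my $fh, '<', $defaultFileName or die "Could not open $defaultFileName: $!";
while (my $line = <$fh>) {
    # Split on whitespace, then print the first two columns quoted and comma-separated.
    my @tmpData = split ' ', $line;
    printf "\"%s\",\"%s\"\n", $tmpData[0], $tmpData[1];
}
close $fh;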
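It's as simple as a one-liner:
perl -lane 'print join ",", map qq("$_"), @F[0, 1]'
-l strips the newline from each input line and appends one to every print
-n reads the input line by line
-a splits each line on whitespace into the @F array
@F[0, 1] is an array slice that extracts the first two elements of the @F array
map wraps each element in double quotes
join inserts commas between them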
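For readers who find the switches terse, the one-liner can be spelled out as an ordinary script. This is only a sketch of roughly what `-lane` does, not the exact code Perl generates; the helper name `first_two_quoted` and the shortened sample line are made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Longhand sketch of: perl -lane 'print join ",", map qq("$_"), @F[0, 1]'
# first_two_quoted is a hypothetical helper name, not part of the answer above.
sub first_two_quoted {
    my ($line) = @_;
    chomp $line;                  # -l: remove the trailing newline from the input
    my @F = split ' ', $line;     # -a: split on runs of whitespace into @F
    # Slice the first two fields, wrap each in double quotes, join with commas.
    return join ",", map qq("$_"), @F[0, 1];
}

# Shortened, made-up sample line; a real script would loop over <> instead (-n).
my @sample = ("abc123    /folder1/images/filename.jpg    HOST\n");

# -l also appends a newline to every print.
print first_two_quoted($_), "\n" for @sample;
```

A line with fewer than two fields would trigger "uninitialized value" warnings here, so real input may need a length check on `@F` first.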
This can also be done with awk:
>>cat test
9a2ba3c0580b5f3799ad9d6f487b2d3 /folder1/folder2/folder3/folder4/folder5/folder6/folder7_name_PC/images/filename.jpg HOST
9a2ba3c0580b5f3799ad9d6f487b2d3 /folder1/folder2/folder3/folder4/folder5/folder6/folder7_name_PC/images/filename.jpg HOST
9a2ba3c0580b5f3799ad9d6f487b2d3 /folder1/folder2/folder3/folder4/folder5/folder6/folder7_name_PC/images/filename.jpg HOST
9a2ba3c0580b5f3799ad9d6f487b2d3 /folder1/folder2/folder3/folder4/folder5/folder6/folder7_name_PC/images/filename.jpg HOST
Output:
>>awk '{FS=" "}{print "\""$1"\",""\""$2"\",""\""$3"\"" }' test
"9a2ba3c0580b5f3799ad9d6f487b2d3","/folder1/folder2/folder3/folder4/folder5/folder6/folder7_name_PC/images/filename.jpg","HOST"
"9a2ba3c0580b5f3799ad9d6f487b2d3","/folder1/folder2/folder3/folder4/folder5/folder6/folder7_name_PC/images/filename.jpg","HOST"
"9a2ba3c0580b5f3799ad9d6f487b2d3","/folder1/folder2/folder3/folder4/folder5/folder6/folder7_name_PC/images/filename.jpg","HOST"
"9a2ba3c0580b5f3799ad9d6f487b2d3","/folder1/folder2/folder3/folder4/folder5/folder6/folder7_name_PC/images/filename.jpg","HOST"
>>awk '{FS=" "}{print "\""$1"\",""\""$2"\",""\""$3"\"" }' test > output.txt
output.txt will then contain the output shown above.
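Both answers split on any whitespace, which would break a path that itself contains a single space. Since the question says the columns are separated by four spaces, a more defensive variant might split only on runs of two or more spaces and double any embedded quote characters, as CSV readers commonly expect. The sketch below is one possible approach under those assumptions; `csv_first_two` is a hypothetical name.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# csv_first_two is a hypothetical helper. Assumption: columns are separated by
# two or more spaces, so a single space inside a path is kept as data.
sub csv_first_two {
    my ($line) = @_;
    chomp $line;
    my @cols = split / {2,}/, $line;
    my @out;
    for my $i (0, 1) {
        # Treat a missing column as an empty string instead of warning.
        my $value = defined $cols[$i] ? $cols[$i] : '';
        $value =~ s/"/""/g;       # CSV convention: double any embedded quote
        push @out, qq("$value");
    }
    return join ",", @out;
}

# Made-up sample line with a space inside the path.
print csv_first_two("abc123    /my docs/filename.jpg    HOST\n"), "\n";
```

Unlike `split ' '`, this pattern does not skip leading spaces, so a line that starts with two or more spaces would yield an empty first column.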