Sed back referencing

Question

I have a text file, mountainList.txt, with the following contents:

      Brasstown Bald, (summit),4784,feet,Union County
Rabun Bald, (summit),4696,feet,Rabun County
Dick's Knob, (summit),4620,feet,Rabun County
              Hightower Bald, (summit),4568,feet,Towns County
Wolfpen Ridge, (ridge high point),4561,feet,Towns and Union Counties 
     Blood Mountain, (summit),4458,feet,Union County
Tray Mountain, (summit), 4430,feet,Towns County
          Grassy Ridge, (ridge high point),4420,feet,Rabun County
Slaughter Mountain, (summit),4338,feet,Union County
Double Spring Knob, (summit),4280,feet,Rabun County
Coosa Bald, (summit),4280,feet,Union County

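I am trying to use back references to extract the mountain name and the county (e.g. Brasstown Bald, Union County). I have an expression, but it does not work correctly: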

sed -E 's/(.+, )(.+),(\w+ Count[yies]+)/\1\3/' mountainList.txt

It does what I want, but only for the first line. Can someone explain why?

Answer 1

Running your command, it seems to work:

$ sed -E 's/(.+, )(.+),(\w+ Count[yies]+)/\1\3/' mountainList.txt
      Brasstown Bald, Union County
Rabun Bald, Rabun County
Dick's Knob, Rabun County
              Hightower Bald, Towns County
Wolfpen Ridge, (ridge high point),4561,feet,Towns and Union Counties 
     Blood Mountain, Union County
Tray Mountain, (summit), Towns County
          Grassy Ridge, Rabun County
Slaughter Mountain, Union County
Double Spring Knob, Rabun County
Coosa Bald, Union County

I am using this version of sed:

$ sed --version
sed (GNU sed) 4.4
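
To see why the Tray Mountain and Wolfpen Ridge lines come out differently in the output above, the same expression can be run on single lines. This is only a sketch: it assumes GNU sed (for \w) and a replacement of \1\3, which is what the output implies. The variable names are made up for illustration.

```shell
# Assumed replacement \1\3; \w is a GNU sed extension.
script='s/(.+, )(.+),(\w+ Count[yies]+)/\1\3/'

# Typical line: exactly one ", " and a single word before "County".
rabun=$(printf '%s\n' 'Rabun Bald, (summit),4696,feet,Rabun County' | sed -E "$script")

# Extra space after the second comma: the greedy (.+, ) extends to it,
# so "(summit), " ends up inside group 1.
tray=$(printf '%s\n' 'Tray Mountain, (summit), 4430,feet,Towns County' | sed -E "$script")

# "Towns and Union Counties": no comma is followed by a single word and
# then " Count", so nothing matches and the line is left unchanged.
wolfpen=$(printf '%s\n' 'Wolfpen Ridge, (ridge high point),4561,feet,Towns and Union Counties' | sed -E "$script")

printf '%s\n' "$rabun" "$tray" "$wolfpen"
```

The last two results match the corresponding lines in the output above, which suggests that the expression is applied to every line and that the differences come from the input itself.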

Answer 2

This might work for you (GNU sed):

sed -r 's/^\s*([^,]*),.*,.*,.*,(.*)\s*$/\1, \2/' file

This captures the first and last fields, using , as the delimiter.
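
As a quick check, the field-based approach can be tried on an indented line and on the two lines that the question's expression did not fully reduce. This is a sketch that assumes GNU sed (for \s) and a replacement of \1, \2. The variable names are illustrative only.

```shell
# Assumed replacement \1, \2; \s is a GNU sed extension.
script='s/^\s*([^,]*),.*,.*,.*,(.*)\s*$/\1, \2/'

# Leading whitespace is consumed by ^\s* before the first capture group.
indented=$(printf '%s\n' '      Brasstown Bald, (summit),4784,feet,Union County' | sed -r "$script")

# The stray space after the second comma no longer matters, because the
# middle fields are skipped by counting commas.
tray=$(printf '%s\n' 'Tray Mountain, (summit), 4430,feet,Towns County' | sed -r "$script")

# A multi-word last field is captured whole.
wolfpen=$(printf '%s\n' 'Wolfpen Ridge, (ridge high point),4561,feet,Towns and Union Counties' | sed -r "$script")

printf '%s\n' "$indented" "$tray" "$wolfpen"
```

Because the pattern relies on comma positions and not on the wording of the last field, it does not depend on the county name being a single word.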

Answer 3

The data is structured, so an awk solution works as well:

$ awk -F, '{ sub(/^ */,"",$1); print $1,"-",$5 }' input.txt
Brasstown Bald - Union County
Rabun Bald - Rabun County
Dick's Knob - Rabun County
Hightower Bald - Towns County
Wolfpen Ridge - Towns and Union Counties
Blood Mountain - Union County
Tray Mountain - Towns County
Grassy Ridge - Rabun County
Slaughter Mountain - Union County
Double Spring Knob - Rabun County
Coosa Bald - Union County
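
If the comma-separated form from the question is wanted, a variant along the following lines might do. It is a hypothetical sketch, not part of the command above: it uses $NF for the last field, trims spaces and tabs from both ends of the two fields, and joins them with ", ". The variable names are invented for the example.

```shell
# Hypothetical variant: trim both ends of the first and last fields,
# then join them with ", ".
trim_and_join='{
  gsub(/^[ \t]+|[ \t]+$/, "", $1)
  gsub(/^[ \t]+|[ \t]+$/, "", $NF)
  print $1 ", " $NF
}'

# Leading indentation is removed from the first field.
brasstown=$(printf '%s\n' '      Brasstown Bald, (summit),4784,feet,Union County' | awk -F, "$trim_and_join")

# A trailing space on the last field is removed as well.
wolfpen=$(printf '%s\n' 'Wolfpen Ridge, (ridge high point),4561,feet,Towns and Union Counties ' | awk -F, "$trim_and_join")

printf '%s\n' "$brasstown" "$wolfpen"
```

Using $NF instead of a fixed field number means the last field is picked up even if a line happens to contain a different number of commas.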
