Delete a string between 2 delimiters on possibly different lines

Question

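Background: I have a configuration file that stores values in variations of this format:

(The following is an example using fictional data.)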

'names': { "john", "jeff", "stewie", "amy", "emily" }

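Some formatting details:

'names' and :
"{" and "john"
There is always whitespace between list members (for example, between "john" and "jeff")
There may or may not be a space between "emily" and "}"

The elements of this list may be separated by newlines instead of spaces. For example, this is also acceptable: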

'names': { "john",
           "jeff",
           "stewie",
           "amy",
           "emily"
         }

So is this:

    'names': { "john", "jeff", "stewie",
               "amy", "emily" }

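The functionality I want to create: I want to remove "amy" from the list named 'names'.

I have been trying to do this with sed, but I am open to using bash, awk, cut, or some combination of them.

If the elements of the list were all on one line, this would be easy: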

/bin/sed -i "/names/ s/ ${element}//" $f

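(where $element contains "amy" and $f contains the name of the file I am editing)

But the possibility of the list spanning multiple lines has me stumped.

Any ideas?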

Answer 1

Let us consider this input file, which includes all three cases:

$ cat file
'names': { "john", "jeff", "stewie", "amy", "emily" }
'names': { "john",
           "jeff",
           "stewie",
           "amy",
           "emily"
         }
'names': { "john", "jeff", "stewie",
               "amy", "emily" }

Now, let's apply this sed command to remove amy:

$ sed '/names/{:a;/}/!{N;b a}; s/"amy",[[:space:]]*//}' file
'names': { "john", "jeff", "stewie", "emily" }
'names': { "john",
           "jeff",
           "stewie",
           "emily"
         }
'names': { "john", "jeff", "stewie",
               "emily" }

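How it works

/names/

Whenever a line contains names, we start executing the commands inside the braces. All other lines pass through unchanged.

:a; /}/! {N;b a}

Once we have a line containing names, we read in additional lines until we have one that contains a closing brace. This gets the complete names assignment into the pattern space at once, even if it is spread over multiple lines.

In more detail, :a is a label. /}/! is a condition: if the pattern space does not contain }, the statements N; b a are executed. N reads in the next line and appends it to the pattern space. b a jumps (branches) back to label a. Thus, the loop continues until the complete assignment, from names to }, is in the pattern space.

s/"amy",[[:space:]]*//}

With the complete names assignment in sed's pattern space, we look for "amy", and any whitespace that follows it, and delete them.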
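
The walk-through above can also be read as a multi-line sed script with one command per line. This is only a sketch of the same command: the wrapper name remove_amy is hypothetical, and it reads the configuration text on standard input instead of from a file.

```shell
# remove_amy: hypothetical helper name, not part of the original answer.
# Reads the config text on stdin and writes the edited text to stdout.
remove_amy() {
    sed '
        # Act only on a line that mentions names.
        /names/ {
            :a
            # No closing brace yet: append the next input line and loop.
            /}/! {
                N
                b a
            }
            # The whole assignment is now in the pattern space.
            s/"amy",[[:space:]]*//
        }
    '
}

# Example call on a list that is spread over three lines.
printf '%s\n' "'names': { \"john\"," '           "amy",' '           "emily" }' | remove_amy
```

It would be called as remove_amy < file. Because [[:space:]] also matches the newline that N inserted, the line that held "amy", disappears completely.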

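Removing amy even if she is last in the list

The solution above assumes that a comma follows the name amy. Suppose, however, that amy might be the last name in the list, as in the following file: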

$ cat file
'names': { "john", "jeff", "stewie", "emily", "amy" }
'names': { "john",
           "jeff",
           "stewie",
           "emily",
           "amy"
         }
'names': { "john", "jeff", "stewie",
               "emily", "amy"}

To handle this situation, we need to add one more substitute command:

$ sed '/names/{:a;/}/!{N;b a}; s/"amy",[[:space:]]*//; s/,[[:space:]]*"amy"//}' file
'names': { "john", "jeff", "stewie", "emily" }
'names': { "john",
           "jeff",
           "stewie",
           "emily"
         }
'names': { "john", "jeff", "stewie",
               "emily"}

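Since the question stores the name in a variable, the two substitutions might be wrapped as in the sketch below. The function name remove_element is hypothetical, and the sketch assumes the name contains no characters that are special in a sed expression (such as /, ., *, [, \ or &).

```shell
# remove_element: hypothetical helper, not part of the original answer.
# $1 = bare name without quotes (e.g. amy); config text is read from stdin.
# Assumption: $1 contains no characters special to sed (/ . * [ \ &).
remove_element() {
    sed -e '/names/{' \
        -e ':a' \
        -e '/}/!{' \
        -e 'N' \
        -e 'b a' \
        -e '}' \
        -e "s/\"$1\",[[:space:]]*//" \
        -e "s/,[[:space:]]*\"$1\"//" \
        -e '}'
}

# Example call: amy is the last element here, so the second substitution applies.
printf '%s\n' "'names': { \"john\", \"emily\", \"amy\" }" | remove_element amy
```

The first substitution covers a name followed by a comma, the second a name preceded by one. To edit a file in place as in the question, the same -e expressions could be passed to sed -i, which is a GNU sed option.
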
Answer 2

Use sed as follows:

sed -r ':loop;$!{N;b loop};s/(.names.: ?\{[^}]*)"amy",? *([^}]*\})/\1\2/g' my-file

Answer 3

Why not just use bash's string-manipulation routines? This is taken directly from:

http://tldp.org/LDP/abs/html/string-manipulation.html

stringZ=abcABC123ABCabc
echo ${stringZ/abc/xyz}

Result = xyzABC123ABCabc

In your case

export stringZ="'names': { \"john\", \"jeff\", \"stewie\", \"amy\", \"emily\" }"

echo ${stringZ/\"amy\",/}

returns 'names': { "john", "jeff", "stewie", "emily" }
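
The same substitution could in principle be applied to a list that spans several lines by reading the whole text into one variable first. The following is only a sketch under stated assumptions: it needs bash with the extglob option enabled, the function name remove_amy_bash is hypothetical, and like the example above it only handles an "amy" that is followed by a comma.

```shell
# Requires bash. extglob enables the *(...) pattern used below.
shopt -s extglob

# remove_amy_bash: hypothetical helper name; reads the config text on stdin.
remove_amy_bash() {
    local content pat
    content=$(cat)                  # whole input, newlines included
    pat='"amy",*([[:space:]])'      # "amy", plus any run of whitespace after it
    printf '%s\n' "${content/$pat/}"
}

# Example call on a list that is spread over three lines.
printf '%s\n' "'names': { \"john\"," '           "amy",' '           "emily" }' | remove_amy_bash
```

Keeping the pattern in a variable avoids escaping the double quotes inside the expansion itself.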
