Atom regexp: discarding multiline text around blocks

Question

Suppose I have this text:

blah blah Bob Loblaw Law blah
keep1 { i want this } blop
blah blob keep2 { and
this too } blaw blat
etc...

I want to end up with:

keep1 { i want this }
keep2 { and
this too }

Or maybe:

keep1 { i want this }
keep2 { and this too }

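I have not figured out how to make Atom's regexp find/replace mechanism discard everything outside specific matched strings when the text spans multiple lines. Any hints?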

Update:

Of the many things I have tried, this gets me closest:

[\S\s]+?(keep\d\s+\{[\S\s]+?\})

This results in:

keep1 { i want this }
keep2 { and
this too }
 blaw blat
etc...

That may be good enough (I can edit out the trailing fragment by hand), but it would be useful to know how to trim that as well.
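The behaviour of the attempt above can be reproduced outside Atom. The sketch below uses Python's `re` module as a stand-in for Atom's regex engine, and it assumes the replacement string was group 1 followed by a newline (`$1\n` in Atom), which the question does not state. It shows why the tail survives: the pattern only matches text that is followed by a keep block, so nothing after the last block is ever matched.

```python
import re

# Sample text from the question.
text = (
    "blah blah Bob Loblaw Law blah\n"
    "keep1 { i want this } blop\n"
    "blah blob keep2 { and\n"
    "this too } blaw blat\n"
    "etc..."
)

# Pattern from the question: lazily skip anything, then capture a keep block.
attempt = re.compile(r"[\S\s]+?(keep\d\s+\{[\S\s]+?\})")

# Assumed replacement: the captured block plus a newline.
attempt_result = attempt.sub(lambda m: m.group(1) + "\n", text)
print(attempt_result)
```

The text after the last `}` is left in place because no later keep block exists to anchor a match.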

Answer 1

Use

[\s\S]*?(keep\d\s+\{[^{}]*\})|(?:(?!keep\d\s+\{[^{}]*\})[\s\S])+$

See proof.

Explanation

--------------------------------------------------------------------------------
  [\s\S]*?                 any character of: whitespace (\n, \r, \t,
                           \f, and " "), non-whitespace (all but \n,
                           \r, \t, \f, and " ") (0 or more times
                           (matching the least amount possible))
--------------------------------------------------------------------------------
  (                        group and capture to \1:
--------------------------------------------------------------------------------
    keep                     'keep'
--------------------------------------------------------------------------------
    \d                       digits (0-9)
--------------------------------------------------------------------------------
    \s+                      whitespace (\n, \r, \t, \f, and " ") (1
                             or more times (matching the most amount
                             possible))
--------------------------------------------------------------------------------
    \{                       '{'
--------------------------------------------------------------------------------
    [^{}]*                   any character except: '{', '}' (0 or
                             more times (matching the most amount
                             possible))
--------------------------------------------------------------------------------
    \}                       '}'
--------------------------------------------------------------------------------
  )                        end of \1
--------------------------------------------------------------------------------
 |                        OR
--------------------------------------------------------------------------------
  (?:                      group, but do not capture (1 or more times
                           (matching the most amount possible)):
--------------------------------------------------------------------------------
    (?!                      look ahead to see if there is not:
--------------------------------------------------------------------------------
      keep                     'keep'
--------------------------------------------------------------------------------
      \d                       digits (0-9)
--------------------------------------------------------------------------------
      \s+                      whitespace (\n, \r, \t, \f, and " ")
                               (1 or more times (matching the most
                               amount possible))
--------------------------------------------------------------------------------
      \{                       '{'
--------------------------------------------------------------------------------
      [^{}]*                   any character except: '{', '}' (0 or
                               more times (matching the most amount
                               possible))
--------------------------------------------------------------------------------
      \}                       '}'
--------------------------------------------------------------------------------
    )                        end of look-ahead
--------------------------------------------------------------------------------
    [\s\S]                   any character of: whitespace (\n, \r,
                             \t, \f, and " "), non-whitespace (all
                             but \n, \r, \t, \f, and " ")
--------------------------------------------------------------------------------
  )+                       end of grouping
--------------------------------------------------------------------------------
  $                        before an optional \n, and the end of the
                           string
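
As a rough check of the pattern explained above, here is a minimal sketch in Python, whose `re` module is assumed to behave like Atom's engine for this pattern. The replacement function is an assumption (the answer does not show a replacement string): it emits the captured block plus a newline, and nothing when the second alternative consumed the trailing text.

```python
import re

# Sample text from the question.
text = (
    "blah blah Bob Loblaw Law blah\n"
    "keep1 { i want this } blop\n"
    "blah blob keep2 { and\n"
    "this too } blaw blat\n"
    "etc..."
)

answer1 = re.compile(
    # First alternative: skip lazily, then capture a keep block.
    r"[\s\S]*?(keep\d\s+\{[^{}]*\})"
    # Second alternative: trailing text that contains no keep block.
    r"|(?:(?!keep\d\s+\{[^{}]*\})[\s\S])+$"
)

def keep_block(match):
    # Group 1 is None when the second alternative matched the tail.
    block = match.group(1)
    return block + "\n" if block else ""

answer1_result = answer1.sub(keep_block, text)
print(answer1_result)
```

The second alternative is what removes the leftover ` blaw blat` and `etc...` that the first alternative alone cannot reach.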

Answer 2

You can use this simple regex replacement in Atom for this task:

\b(keep\d+\s*{[^}]*})|.+?

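Replace with:

RegEx Demo: https://regex101.com/r/oTpV9g/1

RegEx details:

\b: word boundary

(keep\d+\s*{[^}]*}): in capture group #1, match a string that starts with keep, followed by 1+ digits, followed by 0+ whitespace characters, followed by any text inside {...}, including across line breaks. This assumes { and } are balanced and that there are no escaped { or }.

|: OR

.+?: lazily match 1+ of any character (line breaks excluded, since . does not match them by default)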

PS: If you want to remove the leading line break as well, use:

\n?\b(keep\d+\s*{[^}]*})|.+?
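
The difference between the two patterns in this answer can be seen in a small Python sketch. Two things here are assumptions rather than part of the answer: the replacement is taken to be the contents of capture group #1 (the replacement string is not shown above), and the braces are escaped for safety in Python's `re`, whereas the answer leaves them bare.

```python
import re

# Sample text from the question.
text = (
    "blah blah Bob Loblaw Law blah\n"
    "keep1 { i want this } blop\n"
    "blah blob keep2 { and\n"
    "this too } blaw blat\n"
    "etc..."
)

# Braces escaped for Python; otherwise the same patterns as in the answer.
basic = re.compile(r"\b(keep\d+\s*\{[^}]*\})|.+?")
trimmed = re.compile(r"\n?\b(keep\d+\s*\{[^}]*\})|.+?")

def group_one(match):
    # Keep the captured block; drop any character matched by ".+?".
    return match.group(1) or ""

basic_result = basic.sub(group_one, text)
trimmed_result = trimmed.sub(group_one, text)

print(repr(basic_result))
print(repr(trimmed_result))
```

Because `.` never matches a line break, the line breaks themselves are left untouched, which is why the first pattern leaves an empty first line and the `\n?` variant removes it.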

Atom editor demo: [screenshots: before replacement, after replacement]

regex

atom-editor