Atom regexp: discarding multiline text around blocks

Suppose I have this text:

blah blah Bob Loblaw Law blah
keep1 { i want this } blop
blah blob keep2 { and
this too } blaw blat
etc...

I would like to end up with:

keep1 { i want this }
keep2 { and
this too }

or perhaps:

keep1 { i want this }
keep2 { and this too }

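I haven't figured out how to get Atom's regex find/replace mechanism to discard everything outside specific matched strings when the text spans multiple lines. Any hints?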

Update:

Of the many things I have tried, this gets me closest:

[\S\s]+?(keep\d\s+\{[\S\s]+?\})

This results in:

keep1 { i want this }
keep2 { and
this too }
 blaw blat
etc...

That may be good enough (I can edit out the trailing fragment by hand), but it would be useful to know how to trim that as well.
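To make it clearer why the tail survives, here is a minimal Python sketch of the attempt above. It assumes the replacement was `$1` followed by a newline (the exact replacement string is not stated here), and it uses Python's `re` module as a stand-in for Atom's JavaScript regex engine, which should behave the same way for this pattern.

```python
import re

text = (
    "blah blah Bob Loblaw Law blah\n"
    "keep1 { i want this } blop\n"
    "blah blob keep2 { and\n"
    "this too } blaw blat\n"
    "etc..."
)

# The pattern from the attempt: lazily skip anything, then capture a keep block.
attempt = re.compile(r"[\S\s]+?(keep\d\s+\{[\S\s]+?\})")

# Assumed replacement: the captured block plus a newline ($1\n in Atom).
result = attempt.sub(lambda m: m.group(1) + "\n", text)
print(result)

# Every match must end in a keep block, so the text after the last block
# is never part of any match and is left untouched.
last_match = list(attempt.finditer(text))[-1]
leftover = text[last_match.end():]
print(repr(leftover))
```

The pattern has no way to match junk that is not followed by a keep block, which is why the trailing fragment needs its own alternative.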

Use

[\s\S]*?(keep\d\s+\{[^{}]*\})|(?:(?!keep\d\s+\{[^{}]*\})[\s\S])+$

See proof.

Explanation

--------------------------------------------------------------------------------
  [\s\S]*?                 any character of: whitespace (\n, \r, \t,
                           \f, and " "), non-whitespace (all but \n,
                           \r, \t, \f, and " ") (0 or more times
                           (matching the least amount possible))
--------------------------------------------------------------------------------
  (                        group and capture to \1:
--------------------------------------------------------------------------------
    keep                     'keep'
--------------------------------------------------------------------------------
    \d                       digits (0-9)
--------------------------------------------------------------------------------
    \s+                      whitespace (\n, \r, \t, \f, and " ") (1
                             or more times (matching the most amount
                             possible))
--------------------------------------------------------------------------------
    \{                       '{'
--------------------------------------------------------------------------------
    [^{}]*                   any character except: '{', '}' (0 or
                             more times (matching the most amount
                             possible))
--------------------------------------------------------------------------------
    \}                       '}'
--------------------------------------------------------------------------------
  )                        end of \1
--------------------------------------------------------------------------------
 |                        OR
--------------------------------------------------------------------------------
  (?:                      group, but do not capture (1 or more times
                           (matching the most amount possible)):
--------------------------------------------------------------------------------
    (?!                      look ahead to see if there is not:
--------------------------------------------------------------------------------
      keep                     'keep'
--------------------------------------------------------------------------------
      \d                       digits (0-9)
--------------------------------------------------------------------------------
      \s+                      whitespace (\n, \r, \t, \f, and " ")
                               (1 or more times (matching the most
                               amount possible))
--------------------------------------------------------------------------------
      \{                       '{'
--------------------------------------------------------------------------------
      [^{}]*                   any character except: '{', '}' (0 or
                               more times (matching the most amount
                               possible))
--------------------------------------------------------------------------------
      \}                       '}'
--------------------------------------------------------------------------------
    )                        end of look-ahead
--------------------------------------------------------------------------------
    [\s\S]                   any character of: whitespace (\n, \r,
                             \t, \f, and " "), non-whitespace (all
                             but \n, \r, \t, \f, and " ")
--------------------------------------------------------------------------------
  )+                       end of grouping
--------------------------------------------------------------------------------
  $                        before an optional \n, and the end of the
                           string
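As a rough check outside the editor, the expression can be tried in Python, whose `re` module handles this pattern the same way as far as the constructs used here go. This is only a sketch: the sample text is the one from the question, and the replacement (the captured block plus a newline, or nothing when the trailing alternative matched) is an assumption, not necessarily what the linked proof uses.

```python
import re

text = (
    "blah blah Bob Loblaw Law blah\n"
    "keep1 { i want this } blop\n"
    "blah blob keep2 { and\n"
    "this too } blaw blat\n"
    "etc..."
)

pattern = re.compile(
    r"[\s\S]*?(keep\d\s+\{[^{}]*\})"          # junk, then a captured keep block
    r"|(?:(?!keep\d\s+\{[^{}]*\})[\s\S])+$"    # or: trailing junk up to the end
)

def keep_only(match):
    # Assumed replacement: group 1 plus a newline; the second alternative
    # leaves group 1 unset, so the trailing junk is replaced with nothing.
    block = match.group(1)
    return block + "\n" if block else ""

cleaned = pattern.sub(keep_only, text)
print(cleaned)
```

The first alternative handles everything up to and including each block; the second alternative only gets a chance once no further block exists, so it swallows the remainder of the text.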

You can accomplish this in Atom with this simple regex replacement:

\b(keep\d+\s*{[^}]*})|.+?

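Replace with: $1

RegEx Demo: https://regex101.com/r/oTpV9g/1

RegEx details:

  • \b: word boundary
  • (keep\d+\s*{[^}]*}): in capture group #1, match a string that starts with keep, followed by 1+ digits, followed by 0+ whitespace characters, followed by any text inside {...}, which may span line breaks. This assumes the { and } are balanced and there are no escaped braces.
  • |: OR
  • .+?: lazily match 1+ of any character (line breaks excluded, since . does not match them by default)

PS: If you want to remove the leading newline as well, use: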

    \n?\b(keep\d+\s*{[^}]*})|.+?
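The difference between the two variants can be seen in a small Python sketch. It is only an approximation of what Atom does: Python's `re` module stands in for Atom's JavaScript engine, the sample text is the one from the question, and the helper name `keep_group_1` is made up for illustration.

```python
import re

text = (
    "blah blah Bob Loblaw Law blah\n"
    "keep1 { i want this } blop\n"
    "blah blob keep2 { and\n"
    "this too } blaw blat\n"
    "etc..."
)

def keep_group_1(match):
    # Group 1 is None when the `.+?` alternative matched,
    # so that character is replaced with nothing.
    return match.group(1) or ""

# First variant: line breaks outside the kept blocks are left in place,
# because `.` does not match them.
simple = re.sub(r"\b(keep\d+\s*{[^}]*})|.+?", keep_group_1, text)

# PS variant: a newline directly in front of a kept block is consumed too.
trimmed = re.sub(r"\n?\b(keep\d+\s*{[^}]*})|.+?", keep_group_1, text)

print(repr(simple))
print(repr(trimmed))
```

On this input the first variant leaves an empty line at the top where the discarded first line used to be, and the PS variant removes it.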
    

Atom editor demo (before and after the replacement)