How can I match only lines where a numeric substring is surrounded only by alpha or whitespace characters?

Question

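I'm trying to find an easier way to search through my data, and I currently have a grep command that does the job. However, the command isn't perfect, and I'm trying to work out whether it can be improved.

Let's say some files in the directory we're grepping contain the following lines of text, made up of random alphanumeric strings that may or may not contain spaces: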

2001 abc20abcdef
abcd2012 a20abcdef abcdefg
2006 21abcdef 
abc2021 abcde abc18abcd
ab2015ababcd20ababcd

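Let's also assume that digits in these strings only ever appear as a two-digit number, unless the string contains a year. For example, a string could be 100 characters long, but it will contain only two numeric characters, unless there is a year, in which case it will contain six. The year will never be directly next to the target number, so a string will never contain abc201820abc, for example.

For the sake of this example, I want to return the lines that contain 20, unless the 20 is only part of something that looks like a year. If a line contains both a year and a separate 20, I do want that line returned, but if it contains only a year and no separate 20, I don't. For example, I'd want to return: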

2001 abc20abcdef
abcd2012 a20abcdef abcdefg
ab2015ababcd20ababcd

But not return:

2006 21abcdef 
abc2021 abcde abc18abcd

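My current grep is very basic and simply returns every line that contains 20. That technically includes what I want, but it gives me the useless lines along with the useful ones. How can I narrow this down?

Current grep: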

grep -rn 20 .

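This returns all 5 lines, 3 of which I want and 2 of which I don't.

I have some pseudocode logic below that would give me what I want, but I don't know how to turn it into a grep command or script: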

for each line in files {
    if (line contains the number 20 three times) // for example abc2020abcde20abc
        add line to results;
    if (line contains the number 20 twice and both 20s are not immediately next to each other) // This will avoid a false hit of the year 2020
        add line to results;
    else if (line contains the number 20 once) {
        if (an alphabetic character or whitespace follows the 20)
            add line to results;
        else
            do not add line to results;
    }
}

Any ideas? All help and opinions would be greatly appreciated!

Edit: I thought of better pseudocode, but I still don't know how to turn it into a grep:

for each line in files {
    if (line contains an instance where the number 20 has only alphabetic characters or whitespace on either side of it)
        add line to results;
    else
        do not add line to results;
}

Answer 1

line contains an instance where the number 20 has only alphabetic characters or whitespace on either side of it

translates to

grep -E '[[:alpha:][:space:]]20[[:alpha:][:space:]]'
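
One detail worth knowing: inside a bracket expression, grep does not treat `\t` as a tab (it is read as a backslash and a literal t), so the POSIX classes `[[:alpha:]]` and `[[:space:]]` are the safer way to say "letter or whitespace". As a rough check, here is a minimal sketch that runs that form of the pattern against the five sample lines from the question. The scratch file and the `strict` variable are only for illustration.

```shell
# Sample lines from the question, written to a scratch file (path is arbitrary).
sample=$(mktemp)
cat > "$sample" <<'EOF'
2001 abc20abcdef
abcd2012 a20abcdef abcdefg
2006 21abcdef 
abc2021 abcde abc18abcd
ab2015ababcd20ababcd
EOF

# [[:alpha:]] is any letter, [[:space:]] covers space and tab,
# so the 20 must have a letter or whitespace on both sides.
strict=$(grep -E '[[:alpha:][:space:]]20[[:alpha:][:space:]]' "$sample")
printf '%s\n' "$strict"

rm -f "$sample"
```

This prints the three wanted lines and skips the two that only contain 2006 or 2021. To search a directory tree the way the question does, the same pattern should work with the question's options, for example `grep -rnE '[[:alpha:][:space:]]20[[:alpha:][:space:]]' .`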

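But you may want to use the following instead, which also prints lines where the 20 is at the start or end of the line, or next to punctuation: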

grep -E '(^|[^0-9])20([^0-9]|$)'
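
To see where the two patterns differ, here is a small sketch using a few made-up lines. These lines are hypothetical edge cases, not data from the question, and the variable names are only for illustration.

```shell
# Hypothetical extra lines (not from the question) that exercise the edge cases.
edge=$(mktemp)
cat > "$edge" <<'EOF'
20abc at line start
abc ends with 20
item-20, with punctuation
abc2020abc
abc2020abcde20abc
EOF

# Requires a letter or whitespace character on both sides of the 20.
strict=$(grep -E '[[:alpha:][:space:]]20[[:alpha:][:space:]]' "$edge")

# Only requires that no digit touches the 20; line start and end also count.
loose=$(grep -E '(^|[^0-9])20([^0-9]|$)' "$edge")

printf 'strict:\n%s\n\nloose:\n%s\n' "$strict" "$loose"

rm -f "$edge"
```

The stricter pattern keeps only the last line, while the looser one also keeps the first three. Both reject abc2020abc, because each 20 in it has a digit on one side.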


unix

bash

grep