Extracting a specific regex result from a string

Question

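I am trying to extract part numbers from strings. I iterate over the items, and I need to extract a token if it is longer than 4 characters and contains at least 1 digit. It does not have to contain letters, but it can.

For example: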

Line1: 'There is some random information here'
Line2: 'This includes item p23344dd5 as well as other info'
Line3: 'K3455 0.00'
Line4: 'Last part number here 5551234'

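What I need is to extract the 3 item numbers: p23344dd5, K3455 and 5551234.

I am using the code below, but it only tells me whether a line matches, which is not what I need. I need it to return the matched text.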

import re

items = ['There is some random information here',
         'This includes item p23344dd5 as well as other info',
         'K3455 0.00',
         'Last part number here 5551234']

for item in items:
    x = re.search(r'^(?=.*\d).{5,}$', item)
    print(x)

Answer 1

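To match the values in the question, you can assert at least 5 word characters starting from a whitespace boundary, and then match at least one digit.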

(?<!\S)(?=\w{5})[^\W\d]*\d\w*(?!\S)

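Explanation

(?<!\S) Assert a whitespace boundary to the left
(?=\w{5}) Assert at least 5 word characters ahead
[^\W\d]* Match optional word characters that are not digits
\d Match 1 digit
\w* Match optional word characters
(?!\S) Assert a whitespace boundary to the right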
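
As a rough sketch, the same pattern can be written with re.VERBOSE so that each piece carries its explanation inline. The names PART_NUMBER and extract_part_number are made up for this illustration and are not part of the original answer.

```python
import re

# Same pattern as above, spread out with re.VERBOSE so each part is commented.
# PART_NUMBER and extract_part_number are hypothetical names for illustration.
PART_NUMBER = re.compile(r"""
    (?<!\S)      # whitespace boundary on the left
    (?=\w{5})    # at least 5 word characters ahead
    [^\W\d]*     # optional word characters that are not digits
    \d           # one digit
    \w*          # optional word characters
    (?!\S)       # whitespace boundary on the right
""", re.VERBOSE)


def extract_part_number(text):
    """Return the first part-number-like token in text, or None."""
    m = PART_NUMBER.search(text)
    return m.group() if m else None


print(extract_part_number('K3455 0.00'))         # K3455
print(extract_part_number('Line4: a1b2 hello'))  # None
```

In the second call nothing matches: 'Line4:' is followed by a colon rather than whitespace, 'a1b2' has only 4 characters, and 'hello' has no digit.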


import re

items = ['There is some random information here',
         'This includes item p23344dd5 as well as other info',
         'K3455 0.00',
         'Last part number here 5551234']

for item in items:
    x = re.search(r'(?<!\S)(?=\w{5})[^\W\d]*\d\w*(?!\S)', item)
    if x:
        print(x.group())

Output

p23344dd5
K3455
5551234
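
Note that re.search stops at the first match in each line. If a line could contain more than one part number, a variant using re.findall might look like the sketch below. The extra sample line and the helper name find_part_numbers are invented for illustration and do not come from the question.

```python
import re

# Pattern from the answer above; it has no capturing groups, so findall
# returns the full text of every match.
pattern = re.compile(r'(?<!\S)(?=\w{5})[^\W\d]*\d\w*(?!\S)')


def find_part_numbers(line):
    """Return every part-number-like token in line (hypothetical helper)."""
    return pattern.findall(line)


# Invented sample line with two part numbers and a short quantity.
sample = 'Ship K3455 and p23344dd5 today, qty 12'
print(find_part_numbers(sample))  # ['K3455', 'p23344dd5']
```

Here '12' is skipped because it is shorter than 5 characters, and 'today,' is skipped because it contains no digit.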

Answer 2

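Here is how to extract the matched text. As noted in the comments, this does not fix the problem with the regular expression itself, but it does extract the matched value as you asked. The problem is that, the way your regular expression is written, it matches the entire line.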

import re

items = ['There is some random information here',
         'This includes item p23344dd5 as well as other info',
         'K3455 0.00',
         'Last part number here 5551234']

for item in items:
    m = re.search(r'^(?=.*\d).{5,}$', item)
    if m is not None:
        print(m.group(0))
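
To make the whole-line behaviour concrete, the sketch below collects what the original pattern returns and then narrows each matched line down to individual tokens. The token filter uses plain str.split instead of a regex, which is a substitute technique and not what the question used. The helper name part_numbers is made up, and a token with trailing punctuation such as 'K3455,' would be kept with its comma.

```python
import re

items = [
    'There is some random information here',
    'This includes item p23344dd5 as well as other info',
    'K3455 0.00',
    'Last part number here 5551234',
]

# The question's pattern: anchored with ^ and $, so group(0) is the full line.
whole_line = re.compile(r'^(?=.*\d).{5,}$')
matched_lines = [m.group(0) for m in map(whole_line.search, items) if m]


def part_numbers(line):
    """Keep whitespace-separated tokens longer than 4 chars with a digit."""
    return [tok for tok in line.split()
            if len(tok) > 4 and any(ch.isdigit() for ch in tok)]


extracted = [tok for line in matched_lines for tok in part_numbers(line)]
print(matched_lines)  # the three full lines that contain a digit
print(extracted)      # ['p23344dd5', 'K3455', '5551234']
```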


python

regex

python-re