Regex matching to a terminator plus a variable character sequence

Question

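Sorry to bother you, but when it comes to regular expressions I always feel lost...

I need to match strings that occur inside a longer sequence of hex values. My test string is: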

BF1301020302000017BF1301030101010300FF6ABF130201010300FFC0BF1303010303030100FF98

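The pattern is:

Starts with BF13
Followed by an unknown number of repetitions of "01", "02" or "03" (\w\w)
00 marks the end of the sequence between BF13 and 00
After the 00 terminator there are always 4 additional characters

I tried BF13(\w\w)+?00(\w\w){1}, but it is obviously wrong.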

The test string should match and output these values:

BF1301020302000017
BF1301030101010300FF6A
BF130201010300FFC0
BF1303010303030100FF98

Thanks, everyone!

Answer 1

This should do it:

BF13(?:0[123])+00[A-Z0-9]{4}

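Explanation

BF13 matches the literal BF13

(?:...)+ a non-capturing group, repeated at least once (+)

0[123] a 0 followed by 1, 2 or 3

00 followed by the literal 00

[A-Z0-9]{4} followed by exactly 4 uppercase letters or digits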
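
To see why the attempt from the question falls short, here is a minimal sketch that runs both patterns against the test string. The variable names ($attempt, $fixed, $a, $f) are only illustrative. The attempt ends in (\w\w){1}, which asks for a single pair, so only two of the four trailing characters are consumed.

```php
<?php
$str = 'BF1301020302000017BF1301030101010300FF6ABF130201010300FFC0BF1303010303030100FF98';

// The pattern tried in the question: lazy pairs, 00, then only ONE more pair.
$attempt = '/BF13(\w\w)+?00(\w\w){1}/';
// The pattern from this answer: 01/02/03 pairs, 00, then four characters.
$fixed = '/BF13(?:0[123])+00[A-Z0-9]{4}/';

preg_match_all($attempt, $str, $a);
preg_match_all($fixed, $str, $f);

echo $a[0][0], "\n"; // BF13010203020000   (the trailing "17" is missing)
echo $f[0][0], "\n"; // BF1301020302000017
```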

Sample PHP code:

$re = '/BF13(?:0[123])+00[A-Z0-9]{4}/';
$str = 'BF1301020302000017BF1301030101010300FF6ABF130201010300FFC0BF1303010303030100FF98';

preg_match_all($re, $str, $matches, PREG_SET_ORDER, 0);

foreach ($matches as $val) {
    echo "matched: " . $val[0] . "\n";
}
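
If the repeated part and the four trailing characters are needed separately, one possible variation (not part of the original answer) is to add named capture groups. The group names body and tail are made up for this sketch, and the character class is narrowed to [A-F0-9] on the assumption that the data is uppercase hex.

```php
<?php
$str = 'BF1301020302000017BF1301030101010300FF6ABF130201010300FFC0BF1303010303030100FF98';

// "body" holds the 01/02/03 pairs, "tail" the four characters after the 00 terminator.
$re = '/BF13(?<body>(?:0[123])+)00(?<tail>[A-F0-9]{4})/';

preg_match_all($re, $str, $matches, PREG_SET_ORDER);

foreach ($matches as $m) {
    echo "body: " . $m['body'] . "  tail: " . $m['tail'] . "\n";
}
// First line printed: body: 01020302  tail: 0017
```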

Answer 2

You have two options:

Input:

$in = 'BF1301020302000017BF1301030101010300FF6ABF130201010300FFC0BF1303010303030100FF98';

Method #1 - preg_match_all():

var_export(preg_match_all('/BF13(?:0[123])+0{2}[A-F0-9]{4}/', $in, $out) ? $out[0] : []);
// *my pattern is a couple of steps faster than stej4n's
// and doesn't make the mistake of putting commas in the character class

Method #2 - preg_split():

var_export(preg_split('/0{2}[A-F0-9]{4}\K/', $in, 0, PREG_SPLIT_NO_EMPTY));
// \K resets the match starting point, so the split consumes no characters and all of them are preserved
// I prefer this method because it requires a small pattern and
// it returns the array directly, whereas preg_match_all() returns a match count (or false) and fills a separate variable
// Another pattern for preg_split() is just slightly slower, but needs fewer parameters:
// preg_split('/0{2}[A-F0-9]{4}\K(?!$)/', $in)

Output (either method):

array (
  0 => 'BF1301020302000017',
  1 => 'BF1301030101010300FF6A',
  2 => 'BF130201010300FFC0',
  3 => 'BF1303010303030100FF98',
)
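
As a quick sanity check, the sketch below confirms that both methods return the same array for this input. It also shows one difference worth keeping in mind: preg_split() never looks for BF13, so on input with stray characters the two methods diverge. The $dirty string is a hypothetical example, not data from the question.

```php
<?php
$in = 'BF1301020302000017BF1301030101010300FF6ABF130201010300FFC0BF1303010303030100FF98';
$expected = [
    'BF1301020302000017',
    'BF1301030101010300FF6A',
    'BF130201010300FFC0',
    'BF1303010303030100FF98',
];

$matchRe = '/BF13(?:0[123])+0{2}[A-F0-9]{4}/';
$splitRe = '/0{2}[A-F0-9]{4}\K/';

preg_match_all($matchRe, $in, $out);
$split = preg_split($splitRe, $in, 0, PREG_SPLIT_NO_EMPTY);

var_dump($out[0] === $expected); // bool(true)
var_dump($split === $expected);  // bool(true)

// Hypothetical dirty input: two stray characters in front of the first BF13.
$dirty = 'AA' . $in;
preg_match_all($matchRe, $dirty, $outDirty);
$splitDirty = preg_split($splitRe, $dirty, 0, PREG_SPLIT_NO_EMPTY);

echo $outDirty[0][0], "\n"; // BF1301020302000017   (stray characters skipped)
echo $splitDirty[0], "\n";  // AABF1301020302000017 (stray characters kept)
```

For clean input like the test string either method works; when the surrounding data cannot be trusted, the preg_match_all() pattern is the stricter of the two.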
