Regex matching up to a terminator plus a variable character sequence
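Sorry to bother you; when it comes to regular expressions, I feel forever lost...
I have to match strings that occur within a longer sequence of hex values. My test string is this: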
BF1301020302000017BF1301030101010300FF6ABF130201010300FFC0BF1303010303030100FF98
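The pattern is this:
- starts with BF13
- followed by an unknown number of repetitions of "01", "02", or "03" (\w\w)
- 00 marks the end of the sequence between BF13 and 00
- after the 00 terminator, there are always 4 additional characters
I tried BF13(\w\w)+?00(\w\w){1}, but it is obviously wrong.
The test string should match and output these values: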
- BF1301020302000017
- BF1301030101010300FF6A
- BF130201010300FFC0
- BF1303010303030100FF98
Thanks, everyone!
This should do it:
BF13(?:0[123])+00[A-Z0-9]{4}
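Explanation:
- BF13 : the literal text BF13
- (?:...)+ : a non-capturing group repeated at least once (+)
- 0[123] : 0 followed by 1, 2, or 3
- 00 : followed by 00
- [A-Z0-9]{4} : followed by exactly four uppercase letters or digits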
Sample PHP code:
$re = '/BF13(?:0[123])+00[A-Z0-9]{4}/';
$str = 'BF1301020302000017BF1301030101010300FF6ABF130201010300FFC0BF1303010303030100FF98';
preg_match_all($re, $str, $matches, PREG_SET_ORDER, 0);
foreach ($matches as $val) {
    echo "matched: " . $val[0] . "\n";
}
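For comparison, here is a minimal sketch of why the attempt in the question falls short. Its lazy (\w\w)+? group does stop at the first 00 pair, but (\w\w){1} then consumes only two of the four trailing characters. The variable names ($attempt, $fixed, $a, $f) are illustrative and do not come from the original posts.

```php
<?php
$str = 'BF1301020302000017BF1301030101010300FF6ABF130201010300FFC0BF1303010303030100FF98';

// Pattern from the question: (\w\w){1} matches exactly one pair, i.e. 2 characters.
$attempt = '/BF13(\w\w)+?00(\w\w){1}/';
// Pattern from this answer: the tail is four characters long.
$fixed = '/BF13(?:0[123])+00[A-Z0-9]{4}/';

preg_match_all($attempt, $str, $a);
preg_match_all($fixed, $str, $f);

foreach ($a[0] as $i => $short) {
    // Each match of the attempt is the corresponding full match minus its last two characters.
    echo $short . ' vs ' . $f[0][$i] . "\n";
}
```

Changing {1} to {2} in the attempt would fix the length, but \w would still accept any word character in the body, so the stricter 0[123] is the safer choice.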
You have two options:
Input:
$in = 'BF1301020302000017BF1301030101010300FF6ABF130201010300FFC0BF1303010303030100FF98';
Method #1 - preg_match_all():
var_export(preg_match_all('/BF13(?:0[123])+0{2}[A-F0-9]{4}/', $in, $out) ? $out[0] : []);
// *my pattern is a couple of steps faster than stej4n's
// and doesn't make the mistake of putting commas in the character class
Method #2 - preg_split():
var_export(preg_split('/0{2}[A-F0-9]{4}\K/', $in, 0, PREG_SPLIT_NO_EMPTY));
// \K resets the match starting point, so every character is preserved when splitting
// I prefer this method because it requires a small pattern and
// it returns an array, as opposed to true/false with a variable declaration
// Another pattern for preg_split() is just slightly slower, but needs fewer parameters:
// preg_split('/0{2}[A-F0-9]{4}\K(?!$)/', $in)
Output (either method):
array (
  0 => 'BF1301020302000017',
  1 => 'BF1301030101010300FF6A',
  2 => 'BF130201010300FFC0',
  3 => 'BF1303010303030100FF98',
)
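The two methods agree on the sample input, but they are not equivalent in general: preg_split() only looks for the terminator and never checks that a chunk starts with BF13 or contains only 01/02/03 pairs. A small sketch of the difference, using a made-up input whose first chunk is malformed (the input string and the names $matched and $split are assumptions for illustration):

```php
<?php
// Hypothetical input: the first chunk starts with ZZ instead of BF13.
$in = 'ZZ0100FFFFBF130100ABCD';

// Method #1 validates the whole chunk, so the malformed one is skipped.
preg_match_all('/BF13(?:0[123])+0{2}[A-F0-9]{4}/', $in, $out);
$matched = $out[0];

// Method #2 cuts after every "00 plus four hex characters", valid prefix or not.
$split = preg_split('/0{2}[A-F0-9]{4}\K/', $in, 0, PREG_SPLIT_NO_EMPTY);

var_export($matched);
echo "\n";
var_export($split);
```

If the input is known to be well formed, the shorter preg_split() pattern is fine; if it may contain stray data, preg_match_all() is probably the safer of the two.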