Odd results of regular expression matching with the OCaml `Str` module
When I run the following test program:
let re = Str.regexp "{\\(foo\\)\\(bar\\)?}"

let check s =
  try
    let n = Str.search_forward re s 0 in
    let a = Str.matched_group 1 s in
    let b = Str.matched_group 2 s in
    Printf.printf "'%s' => n=%d, a='%s', b='%s'\n" s n a b
  with
    _ -> Printf.printf "'%s' => not found\n" s

let _ =
  check "{foo}";
  check "{foobar}"
I get odd results. Namely:
$ ocaml str.cma test.ml
'{foo}' => not found
'{foobar}' => n=0, a='foo', b='bar'
Is grouping with \( and \) incompatible with the ? operator? The documentation does not mention this.
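For your first example, group 2 does not take part in the match, so the call to Str.matched_group 2 raises Not_found. The catch-all handler in your check function then reports this as "not found", even though the search itself succeeded.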
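To get finer-grained results than your check function gives, you need to handle each group separately, each with its own try block. In principle, any call to Str.matched_group can raise Not_found (depending on the properties of the regular expression and on the string being matched).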
I rewrote your check function as follows:
let check s =
  let check1 n g =
    try
      let m = Str.matched_group g s in
      Printf.printf "'%s' group %d => n = %d, match = '%s'\n"
        s g n m
    with Not_found ->
      Printf.printf "'%s' group %d => not matched\n" s g
  in
  let n = Str.search_forward re s 0 in
  check1 n 1;
  check1 n 2
The modified code produces the following output:
'{foo}' group 1 => n = 0, match = 'foo'
'{foo}' group 2 => not matched
'{foobar}' group 1 => n = 0, match = 'foo'
'{foobar}' group 2 => n = 0, match = 'bar'
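If you would rather not print from inside the exception handlers, one possible variation is to turn the Not_found from an optional group into an option value. This is only a sketch: group_opt and find_groups are hypothetical helper names, not part of Str, and it assumes OCaml 4.02 or later for the `match ... with exception` form. The pattern is the same as in the question, written with doubled backslashes so the compiler does not warn about illegal escapes.

```ocaml
(* Needs the str library, e.g.: ocaml str.cma sketch.ml *)
let re = Str.regexp "{\\(foo\\)\\(bar\\)?}"

(* Hypothetical helper: Some text if group g took part in the most
   recent match against s, None if it did not. *)
let group_opt g s =
  try Some (Str.matched_group g s) with Not_found -> None

(* Hypothetical helper: None when the regex does not match at all,
   otherwise the match position, group 1, and the optional group 2. *)
let find_groups s =
  match Str.search_forward re s 0 with
  | exception Not_found -> None
  | n ->
    (* Group 1 is mandatory in this pattern, so it is always set here. *)
    let a = Str.matched_group 1 s in
    let b = group_opt 2 s in
    Some (n, a, b)

let () =
  List.iter
    (fun s ->
       match find_groups s with
       | None -> Printf.printf "'%s' => not found\n" s
       | Some (n, a, b) ->
         let b_text =
           match b with
           | Some x -> "'" ^ x ^ "'"
           | None -> "<none>"
         in
         Printf.printf "'%s' => n=%d, a='%s', b=%s\n" s n a b_text)
    ["{foo}"; "{foobar}"; "nothing here"]
```

With this shape, "the regex did not match" and "the regex matched but group 2 was absent" stay distinct cases, and a failed search no longer escapes as an uncaught Not_found.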