How do I build a Dart RegExp properly?
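The goal of this expression is to split a mathematical calculation into operators, symbols, numbers, and parenthesized groups.
For example: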
Input string: 1+3-6*(12-3+4/5)
Output list: 1, +, 3, -, 6, *, (12-3+4/5)
So I built this expression.
It works on the web page, but in Dart code this happens:
final calculationExpression = RegExp(
r"/(\(([a-zA-Z0-9-+/*]+)\))|([a-zA-Z0-9]+)|([+/*-]{1})/g",
unicode: true,
multiLine: true,
);
...
List<String> operators = calculationsString.split(calculationExpression); /// Output: ["", "+", "-", ...]
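What am I doing wrong?
You put the JavaScript regex slashes and flags inside the Dart string.
If you remove the leading / and the trailing /g, you get the RegExp you want.
The multiLine and unicode flags are unnecessary (your regex does not use any feature affected by them).
The Dart split function does not emit capture groups, so you probably want to collect the matches rather than remove them, which is what split does.
All in all, try: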
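final calculationExpression = RegExp(
    r"\([a-zA-Z\d\-+/*]+\)|[a-zA-Z\d]+|[+/*\-]");
// allMatches yields RegExpMatch objects; match[0] is the matched text.
List<String> tokens = calculationExpression
    .allMatches(calculationsString)
    .map((match) => match[0]!)
    .toList();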
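The /pattern/g syntax is used to create regular-expression literals in JavaScript (and in sed and some other languages), much as quotes are used to create string literals. Dart has no regular-expression literals; you must call the RegExp constructor directly. Combining the regex-literal syntax with an explicitly constructed RegExp object makes no sense. When you write RegExp(r'/pattern1|pattern2|pattern3/g'), you are actually matching /pattern1 (pattern1 prefixed with a literal / character), or pattern2, or pattern3/g (pattern3 followed by the literal string /g).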
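To see the literal-slash behavior concretely, here is a minimal sketch. The words cat, dog, and bird are placeholder alternatives chosen only for illustration, and the names jsStyle and plain are not from the question.

```dart
// Hypothetical patterns for illustration only.
// Inside a Dart string, the slashes and the trailing "g" are ordinary
// pattern characters, not delimiters or flags.
final jsStyle = RegExp(r'/cat|dog|bird/g');
final plain = RegExp(r'cat|dog|bird');

void main() {
  print(jsStyle.hasMatch('cat')); // false: first alternative is "/cat"
  print(jsStyle.hasMatch('/cat')); // true
  print(jsStyle.hasMatch('dog')); // true: middle alternative is unaffected
  print(jsStyle.hasMatch('bird')); // false: last alternative is "bird/g"
  print(jsStyle.hasMatch('bird/g')); // true

  // Without the slashes, every alternative matches as intended.
  print(plain.allMatches('cat dog bird').map((m) => m[0]).toList());
}
```

Only the middle alternative behaves as intended in the slash-wrapped version, which is why the pattern in the question still matched the numbers but nothing else.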
String.split does not split the input string such that each element of the result matches the pattern. It treats all matches of the pattern as separators. Consequently, the resulting list will not contain any elements that match the pattern, which is the opposite of what you want. You instead want to find all matches of the pattern in the string, which you can do with RegExp.allMatches. Checking that each match starts where the previous one ended also lets you verify that the input string contains only matches of the regular expression.
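The difference between the two calls can be sketched as follows. This assumes the slash-free pattern with the capture groups removed; the helper names separators and tokenize are made up for this example.

```dart
// Slash-free token pattern: a parenthesized group, an alphanumeric run,
// or a single operator.
final token = RegExp(r'\([a-zA-Z\d\-+/*]+\)|[a-zA-Z\d]+|[+/*\-]');

// Hypothetical helper: what split returns (the text BETWEEN matches).
List<String> separators(String input) => input.split(token);

// Hypothetical helper: what allMatches returns (the matches themselves).
List<String> tokenize(String input) =>
    token.allMatches(input).map((m) => m[0]!).toList();

void main() {
  const input = '1+3-6*(12-3+4/5)';

  // Every character belongs to some match, so only empty strings remain.
  print(separators(input));

  // Prints: [1, +, 3, -, 6, *, (12-3+4/5)]
  print(tokenize(input));
}
```

Note that tokenize silently skips any characters the pattern does not match, so it does not validate the input on its own.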
Putting it all together:
void main() {
final calculationExpression = RegExp(
r"(\(([a-zA-Z0-9-+/*]+)\))|([a-zA-Z0-9]+)|([+/*-]{1})",
unicode: true,
multiLine: true,
);
var calculationsString = '1+3-6*(12-3+4/5)';
// Prints: [1, +, 3, -, 6, *, (12-3+4/5)]
print(calculationsString.tokenizeFrom(calculationExpression).toList());
}
extension on String {
Iterable<String> tokenizeFrom(RegExp regExp) sync* {
void failIf(bool condition) {
if (condition) {
throw FormatException(
'$this contains characters that do not match $regExp',
);
}
}
var matches = regExp.allMatches(this);
var lastEnd = 0;
for (var match in matches) {
// Verify that there aren't unmatched characters.
failIf(match.start != lastEnd);
lastEnd = match.end;
yield match.group(0)!;
}
failIf(lastEnd != length);
}
}