How do I build a Dart RegExp properly?

The goal of this expression is to split a math calculation into operators, symbols, numbers, and brackets.

For example:

Input string: 1+3-6*(12-3+4/5)

Output list: 1, +, 3, -, 6, *, (12-3+4/5)

So I built this expression.

It works on the web page, but in Dart code this happens:

final calculationExpression = RegExp(
  r"/(\(([a-zA-Z0-9-+/*]+)\))|([a-zA-Z0-9]+)|([+/*-]{1})/g",
  unicode: true,
  multiLine: true,
);

...

List<String> operators = calculationsString.split(calculationExpression); /// Output: ["", "+", "-", ...]

What am I doing wrong?

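You put the JavaScript regexp slashes and flags inside the Dart string.

If you remove the leading / and the trailing /g, you get the RegExp you want. The multiLine and unicode flags are unnecessary (your regexp doesn't use any features affected by them).

The Dart split function does not emit capture groups, so you probably want to collect the matches rather than remove them, which is what split does.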
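
As a small illustration of that difference (a sketch only; the matchTexts helper and the shortened input are made up for this example and are not part of your code):

```dart
/// Hypothetical helper: the text of every match of [pattern] in [input].
List<String> matchTexts(RegExp pattern, String input) =>
    pattern.allMatches(input).map((m) => m[0]!).toList();

void main() {
  const input = '1+3-6';

  // split treats every match as a separator and drops it,
  // whether or not the pattern has a capture group.
  print(input.split(RegExp(r'([+\-])'))); // [1, 3, 6]

  // allMatches keeps the matched text instead.
  print(matchTexts(RegExp(r'\d+|[+\-]'), input)); // [1, +, 3, -, 6]
}
```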

All in all, try:

final calculationExpression = RegExp(
    r"\([a-zA-Z\d\-+/*]+\)|[a-zA-Z\d]+|[+/*\-]");
// allMatches yields RegExpMatch objects; m[0] is the full matched text.
List<String> tokens = calculationExpression
    .allMatches(calculationsString)
    .map((m) => m[0]!)
    .toList();
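  1. The /pattern/g syntax is used to create regular expression literals in JavaScript (and sed and some other languages), just as quotes are used to create string literals. Dart has no regular expression literals; you must call the RegExp constructor directly. Combining regular expression literal syntax with an explicitly constructed RegExp object makes no sense. When you write RegExp(r'/pattern1|pattern2|pattern3/g'), you are actually matching /pattern1 (pattern1 prefixed with a literal / character), or pattern2, or pattern3/g (pattern3 followed by the literal string /g).

  2. String.split does not split the input string such that each element of the result matches the pattern. It treats all matches of the pattern as separators. Consequently, the resulting list will not have any elements that match the pattern, which is the opposite of what you want. You instead want to find all matches of the pattern in the string, which is what RegExp.allMatches does. It also lets you verify that the input string contains only matches of the regular expression.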
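
To make point 1 concrete, here is a minimal sketch with a deliberately simplified pattern (the matchTexts helper, both patterns, and the inputs are invented for illustration; they are not the expression from the question):

```dart
/// Hypothetical helper: the text of every match of [pattern] in [input].
List<String> matchTexts(RegExp pattern, String input) =>
    pattern.allMatches(input).map((m) => m[0]!).toList();

void main() {
  // Here the slashes and the trailing "g" are ordinary pattern characters,
  // so the two alternatives are `/\d+` and `[+*]/g`.
  final jsStyle = RegExp(r'/\d+|[+*]/g');
  final dartStyle = RegExp(r'\d+|[+*]');

  print(matchTexts(jsStyle, '1+3')); // []
  print(matchTexts(jsStyle, '/12 */g')); // [/12, */g]
  print(matchTexts(dartStyle, '1+3')); // [1, +, 3]
}
```

The JavaScript-style pattern only matches input that really contains the / and /g characters, which is why it appears to ignore most of an ordinary calculation string.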

Putting it all together:

void main() {
  final calculationExpression = RegExp(
    r"(\(([a-zA-Z0-9-+/*]+)\))|([a-zA-Z0-9]+)|([+/*-]{1})",
    unicode: true,
    multiLine: true,
  );

  var calculationsString = '1+3-6*(12-3+4/5)';

  // Prints: [1, +, 3, -, 6, *, (12-3+4/5)]
  print(calculationsString.tokenizeFrom(calculationExpression).toList());
}

extension on String {
  Iterable<String> tokenizeFrom(RegExp regExp) sync* {
    void failIf(bool condition) {
      if (condition) {
        throw FormatException(
          '$this contains characters that do not match $regExp',
        );
      }
    }

    var matches = regExp.allMatches(this);
    var lastEnd = 0;
    for (var match in matches) {
      // Verify that there aren't unmatched characters.
      failIf(match.start != lastEnd);
      lastEnd = match.end;

      yield match.group(0)!;
    }

    failIf(lastEnd != length);
  }
}