How to match start and end of input with std::regex on Visual Studio

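As far as I understand, the C++ regex anchor ^ should match only at the start of the input, and $ should match only at the end. This can be changed so that they match at the start and end of every line by passing the std::regex::multiline flag. Unfortunately, Visual Studio 2017 does not conform to this behavior: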

#include <string>
#include <iostream>
#include <regex>
#include <exception>

int main()
{
    std::string test = "\n \n\t \nThe previous three lines should be removed.\n    \nThe previous line shouldn't be removed, "
        "but the next two should be:\n\t\t\t\n  ";

    std::string out;
    try {
        std::regex re(R"(^\s*\n|\n\s*$)");
        out = std::regex_replace(test, re, "");
    }
    catch (std::exception& e) {
        std::cout << e.what() << std::endl;
    }
    std::cout << out << std::endl;
}

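On GCC this keeps the whitespace-only line between the two lines of text, but under MSVC it is removed. Is there a way to work around this, or better yet, a portable solution? Is this a bug or expected behavior? Does it conform to the standard?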

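I got hung up on this issue too. It turns out that, for historical reasons, this non-standard behavior is expected as of Visual Studio 2017, although the team hopes to change it in the future.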
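To check which behavior a given standard library actually has, a small probe along these lines may help. The function name `caret_matches_after_newline` is made up for illustration. On a library that follows the ECMAScript default it should return false, while an engine that always treats ^ as a line anchor would return true.

```cpp
#include <regex>

// Hypothetical probe (not from the original post): reports whether '^'
// matches immediately after a '\n' when no multiline flag is given.
bool caret_matches_after_newline()
{
    const std::regex re("^b");
    // "b" only appears at the start of the second line, so a match here
    // means the engine treats '^' as a start-of-line anchor.
    return std::regex_search("a\nb", re);
}
```

Calling this once at startup (or in a unit test) is one way to detect the difference before relying on ^ and $ in portable code.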

Here is a link with more information (quoted below for posterity): https://developercommunity.visualstudio.com/t/multiline-c/268592

We marked this as an LWG issue resolution rather than a feature in our C++17 support tables; this is ABI breaking for us to implement so it won't happen until the regex engine is overhauled in an ABI breaking release.

MSVC++'s engine always has "multiline" behavior, following Boost::Regex' design (from which the TR1 regex proposal was derived) which was multiline by default and had a singleline option. For some reason though, the singleline switch from Boost.Regex wasn't standardized.

The other standard libraries assumed ECMAScript/browsers' defaults, and so only had single line mode.

Through discussion in LWG it was decided that because the standard normatively references ECMAScript, and the default there is single line, that std::regex should not be like Boost.Regex here, MSVC++ would need to change its engine to be singleline by default, and all the standard libraries would add a multiline switch.

Unfortunately the representation our regex engine uses doesn't allow easy incorporation of a singleline flag to implement singleline support, and even if it did, changing that default would be likely to introduce subtle changes in behavior to existing programs that just recompile with a compiler update.

As a workaround for now, we continue to recommend Boost.Regex; it not only has more consistent behavior in this area, its performance crushes that of all 3 major standard library implementations at present ( https://www.boost.org/doc/libs/1_67_0/libs/regex/doc/html/boost_regex/background/performance.html )

Can't wait for the ABI break in the sky where we can fix this!

Billy O'Neal

Visual C++ Libraries
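If switching to Boost.Regex is not an option, one portable way around the problem for this particular task is to avoid ^ and $ altogether and trim the blank lines with plain std::string searches. The sketch below is only an illustration: `trim_blank_lines` is a hypothetical helper, and returning an empty string for whitespace-only input is an assumption rather than what the original regex would do.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not part of the original post): removes
// whitespace-only lines at the very start and very end of `text`
// without relying on the '^' and '$' anchors.
std::string trim_blank_lines(const std::string& text)
{
    // Same characters that \s matches in the default "C" locale.
    static const char whitespace[] = " \t\n\r\f\v";

    const std::size_t first = text.find_first_not_of(whitespace);
    if (first == std::string::npos) {
        return std::string(); // assumption: whitespace-only input becomes ""
    }
    const std::size_t last = text.find_last_not_of(whitespace);

    // Start just after the last '\n' before the first visible character,
    // so indentation on the first kept line is preserved.
    const std::size_t newline_before = text.rfind('\n', first);
    const std::size_t begin =
        (newline_before == std::string::npos) ? 0 : newline_before + 1;

    // Stop at the first '\n' after the last visible character,
    // so trailing spaces on the last kept line are preserved.
    const std::size_t newline_after = text.find('\n', last);
    const std::size_t end =
        (newline_after == std::string::npos) ? text.size() : newline_after;

    return text.substr(begin, end - begin);
}
```

With the test string from the question, `trim_blank_lines(test)` should drop the three leading and two trailing blank lines while keeping the whitespace-only line between the two sentences, independent of how the regex engine interprets the anchors.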