How do parsers handle preprocessors and conditional compilation?

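I am trying to figure out how parsers handle preprocessors and conditional compilation. Taking C++ as an example: are preprocessor directives part of the C++ grammar, or is the preprocessor a separate language that is processed before parsing? In either case, how does a parser find errors in all possible branches, and how does it recover information about the original code layout from before preprocessing (such as the line number where an error occurred)?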

From the C Preprocessor docs:

The C preprocessor informs the C compiler of the location in your source code where each token came from.

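So in GCC's case, the parser knows where an error occurred because the preprocessor tells it. I am not sure whether this quote refers to preprocessing tokens or to all C++ tokens.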
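One standard way to carry an original location across preprocessing is the `#line` directive, which resets the line number and file name the compiler reports for the code that follows; GCC's standalone preprocessed output uses similar linemarkers. The sketch below only illustrates that directive and is not a description of GCC's internal mechanism. The file name `original.cpp`, the `Location` type, and the function `reported_location` are made up for the example.

```cpp
#include <string>

// Hypothetical helper type for this illustration.
struct Location {
    std::string file;
    int line;
};

// Pretend the code below originally lived at line 100 of "original.cpp".
#line 100 "original.cpp"
inline Location reported_location() {    // treated as line 100
    return Location{__FILE__, __LINE__}; // treated as line 101
}
```

Any diagnostic issued for these lines would likewise name `original.cpp` and the remapped line numbers rather than the position in the file actually being compiled.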

This page has more details on how the magic happens.

The cpp_token structure contains line and col members. The lexer fills these in with the line and column of the first character of the token. Consequently, but maybe unexpectedly, a token from the replacement list of a macro expansion carries the location of the token within the #define directive, because cpplib expands a macro by returning pointers to the tokens in its replacement list.
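The effect described in that excerpt can be modeled with a toy data structure. This is a minimal sketch under my own assumptions, not cpplib's actual code: `Token`, `Macro`, `expand`, and `make_twice` are hypothetical names, and the only property being shown is that expansion by pointer leaves each token with the location it had inside the `#define`.

```cpp
#include <string>
#include <vector>

// Toy token: text plus the position of its first character.
struct Token {
    std::string text;
    int line;
    int col;
};

// Toy macro: the replacement list is stored once, when #define is lexed.
struct Macro {
    std::string name;
    std::vector<Token> replacement;
};

// Expansion hands back pointers into the stored replacement list,
// so no new tokens (and no new locations) are created at the use site.
inline std::vector<const Token*> expand(const Macro& m) {
    std::vector<const Token*> out;
    for (const Token& t : m.replacement) out.push_back(&t);
    return out;
}

// Models line 3 of a file reading `#define TWICE x + x`.
inline Macro make_twice() {
    return Macro{"TWICE", {{"x", 3, 15}, {"+", 3, 17}, {"x", 3, 19}}};
}
```

Wherever `TWICE` is used, the tokens returned by `expand` still report line 3, which is the "maybe unexpected" behavior the quote mentions.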

[...] This variable therefore uniquely enumerates each line in the translation unit. With some simple infrastructure, it is straight forward to map from this to the original source file and line number pair
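The "simple infrastructure" is not spelled out in the excerpt. One plausible shape for it, offered as an assumption rather than as GCC's real implementation, is a sorted table of runs plus a binary search. All names here (`LineMapEntry`, `resolve`, `example_map`) and the two file names are hypothetical.

```cpp
#include <algorithm>
#include <string>
#include <utility>
#include <vector>

// One entry per contiguous run of lines taken from the same file.
struct LineMapEntry {
    int first_logical;   // first translation-unit-wide line number of the run
    std::string file;    // file the run came from
    int first_physical;  // line number within that file
};

// Maps a translation-unit-wide line number back to (file, line).
// Assumes `map` is non-empty, sorted by first_logical, and that
// `logical` is not smaller than the first entry's first_logical.
inline std::pair<std::string, int>
resolve(const std::vector<LineMapEntry>& map, int logical) {
    auto it = std::upper_bound(
        map.begin(), map.end(), logical,
        [](int value, const LineMapEntry& e) { return value < e.first_logical; });
    const LineMapEntry& e = *(it - 1);  // last run starting at or before `logical`
    return {e.file, e.first_physical + (logical - e.first_logical)};
}

// Example: main.cpp includes util.h (3 lines long) on its line 2.
inline std::vector<LineMapEntry> example_map() {
    return {
        {1, "main.cpp", 1},  // main.cpp lines 1-2
        {3, "util.h", 1},    // util.h lines 1-3
        {6, "main.cpp", 3},  // back in main.cpp at line 3
    };
}
```

With this table, a token only needs to store one integer, and the file and line pair is reconstructed on demand when a diagnostic is printed.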

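Here is a copy of a C++14(?) draft standard. The preprocessing grammar is in Annex A.14. I am not sure it matters whether you call it a separate language. According to [lex.phases] (section 2.2), a C++ compiler behaves as if preprocessing happens before the main translation/parsing.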
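One consequence of that as-if ordering for conditional compilation: the parser proper only sees the branch that was selected, so a syntax error in a skipped branch is normally not diagnosed. A small sketch, where the macro `USE_FAST_PATH` and the function `compute` are hypothetical names chosen for the example:

```cpp
#define USE_FAST_PATH 1

#if USE_FAST_PATH
// Selected branch: this is what the parser actually sees.
inline int compute() { return 42; }
#else
// Skipped branch: discarded during preprocessing, so the
// ill-formed body below never reaches the parser.
inline int compute() { return this is not valid C++ ; }
#endif
```

This compiles as long as `USE_FAST_PATH` is nonzero; checking every branch would require building once per configuration, or a tool that deliberately parses all branches.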