Double backslash in comment inside array

Question

I have an array defined as follows:

extern const char config_reg[] = {  
  0x05, //comment
  0x00, //comment
  0x00, //  \\    <-- double backslash
  0x01, //comment
  0x03
};

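As you can see, there is a double backslash in the comment (the "<-- double backslash" annotation and the spaces before it do not appear in the actual source file). When I compile this code (minus the "<-- double backslash" annotation), it behaves as if the following line were not there, i.e. as if I had written: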

extern const char config_reg[] = {  
  0x05, //comment
  0x00, //comment
  0x00, //  

  0x03
};

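Is this expected C++ behavior? If so, what is its intended purpose?

I am using the Parallax Propeller Simple IDE to compile my code, which by all accounts is not a particularly good compiler. Could the compiler implementation be responsible for this behavior?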

Answer 1

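This is correct behavior, assuming the "<-- double backslash" annotation and the spaces before it are not actually in the code.

A single backslash would have the same effect.

Backslash-newline line splicing happens before comments are analyzed, so the 0x01 line becomes part of the same logical line as the // \\ comment. By the time comments are analyzed, that line is inside the comment and is never seen as code.

The ISO/IEC 14882:2011 (C++11) standard says: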
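As a minimal sketch of that effect, the fragment below reproduces the array from the question. The constexpr qualifier, the name config_reg_size, and the static_assert are additions for illustration and are not part of the asker's code. Some compilers (GCC, for example) warn about a multi-line comment here.

```cpp
#include <cstddef>

// The comment on the third element ends in a backslash directly before the
// new-line, so the 0x01 line is spliced into that comment and disappears.
constexpr char config_reg[] = {
  0x05, // comment
  0x00, // comment
  0x00, // \\
  0x01, // swallowed: after splicing, this text belongs to the comment above
  0x03
};

// Illustrative check: only four elements survive, not five.
constexpr std::size_t config_reg_size = sizeof(config_reg) / sizeof(config_reg[0]);
static_assert(config_reg_size == 4, "the 0x01 line was spliced into the comment");
```

Removing the trailing backslashes, or putting any non-whitespace character after them, restores the fifth element.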

2.2 Phases of translation [lex.phases]

¶1 The precedence among the syntax rules of translation is specified by the following phases.¹¹

1. Physical source file characters are mapped, in an implementation-defined manner, to the basic source character set (introducing new-line characters for end-of-line indicators) if necessary. The set of physical source file characters accepted is implementation-defined. Trigraph sequences (2.4) are replaced by corresponding single-character internal representations. Any source file character not in the basic source character set (2.3) is replaced by the universal-character-name that designates that character. (An implementation may use any internal encoding, so long as an actual extended character encountered in the source file, and the same extended character expressed in the source file as a universal-character-name (i.e., using the \uXXXX notation), are handled equivalently except where this replacement is reverted in a raw string literal.)

2. Each instance of a backslash character (\) immediately followed by a new-line character is deleted, splicing physical source lines to form logical source lines. Only the last backslash on any physical source line shall be eligible for being part of such a splice. If, as a result, a character sequence that matches the syntax of a universal-character-name is produced, the behavior is undefined. A source file that is not empty and that does not end in a new-line character, or that ends in a new-line character immediately preceded by a backslash character before any such splicing takes place, shall be processed as if an additional new-line character were appended to the file.

3. The source file is decomposed into preprocessing tokens (2.5) and sequences of white-space characters (including comments). A source file shall not end in a partial preprocessing token or in a partial comment.¹² Each comment is replaced by one space character. New-line characters are retained. Whether each nonempty sequence of white-space characters other than new-line is retained or replaced by one space character is unspecified. The process of dividing a source file’s characters into preprocessing tokens is context-dependent. [ Example: see the handling of < within a #include preprocessing directive. —end example ]

¹¹⁾ Implementations must behave as if these separate phases occur, although in practice different phases might be folded together.

¹²⁾ A partial preprocessing token would arise from a source file ending in the first portion of a multi-character token that requires a terminating sequence of characters, such as a header-name that is missing the closing " or >. A partial comment would arise from a source file ending with an unclosed /* comment.
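The ordering of the splicing and comment phases can be sketched as two small string transformations. This is only a rough model, not how any real compiler is implemented: the function names splice_lines and strip_line_comments are hypothetical, and the comment pass assumes the input contains no string literals, character literals, or /* */ comments.

```cpp
#include <cstddef>
#include <string>

// Splicing phase (simplified): delete every backslash that is immediately
// followed by a new-line, joining physical lines into logical lines.
std::string splice_lines(const std::string& src) {
    std::string out;
    for (std::size_t i = 0; i < src.size(); ++i) {
        if (src[i] == '\\' && i + 1 < src.size() && src[i + 1] == '\n') {
            ++i;  // skip the new-line as well
            continue;
        }
        out += src[i];
    }
    return out;
}

// Comment phase (simplified): replace each // comment with one space and
// keep the new-line that ends it.
// Assumption: no literals and no /* */ comments appear in the input.
std::string strip_line_comments(const std::string& src) {
    std::string out;
    for (std::size_t i = 0; i < src.size(); ++i) {
        if (src[i] == '/' && i + 1 < src.size() && src[i + 1] == '/') {
            while (i < src.size() && src[i] != '\n') ++i;  // consume the comment
            out += ' ';
            if (i < src.size()) out += '\n';
            continue;
        }
        out += src[i];
    }
    return out;
}
```

Running splice_lines first and strip_line_comments second on the three lines from the question leaves no trace of 0x01, because by the time comments are recognized the 0x01 text already sits on the same logical line as the comment.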

Answer 2

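Yes. The second phase of translation involves "splicing physical source lines to form logical source lines"; if a line ends with a backslash, the next line is treated as a continuation of it. This is standard behavior. It happens before comments are removed in the third phase, so the fact that the backslash appears inside a comment changes nothing.

Line splicing is used most often in C to split macros across multiple lines, because a preprocessor directive extends to the end of the line. It is rarer in C++, which relies on macros far less than C does.

I believe the original purpose in C was to work around line-length limits on some systems that are now obsolete.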
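For illustration, here is a small sketch of that typical use. The macro CLAMP and the helper clamp_to_byte are hypothetical names, not taken from the question. Each backslash must be the very last character on its physical line for the directive to continue.

```cpp
// Hypothetical macro whose replacement list is spread over several physical
// lines; after splicing, the preprocessor sees one long #define line.
#define CLAMP(value, low, high)  \
    ((value) < (low) ? (low)     \
     : (value) > (high) ? (high) \
     : (value))

// Small helper so the macro can be checked at compile time.
constexpr int clamp_to_byte(int x) { return CLAMP(x, 0, 255); }

static_assert(clamp_to_byte(300) == 255, "upper bound applies");
static_assert(clamp_to_byte(-5) == 0, "lower bound applies");
```

In C++ an inline or constexpr function would usually be preferred over such a macro, which is part of why line continuation shows up less often there.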

Answer 3

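A \ at the end of a line escapes the newline.

So in your example, it extends the comment onto the next line. The author of the snippet probably used \\ instead of \ for aesthetic reasons. But this does not apply only to comments. For example, the following is allowed (but superfluous):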

int a; \
int b;

Some compilers allow whitespace between the \ and the newline, but may issue a warning.

c++

arrays

comments
