How does a compiler (C/C++) identify a comment?

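Suppose my program has the strings s1 = "like/*this", s2 = "like /*this is a comment */this" and s3 = "like //this is not a comment". In s1 and s3, the "/*" and "//" are part of the string. In s2, it is a user's comment that is to be displayed on the output screen. What algorithm does a C/C++ compiler use for this? (My guess is that the compiler ignores all text inside "".)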

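No, there are no comments inside a string literal; every character between the quotes is part of the string. From the C standard, chapter 6.4.9 (Comments):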

Except within a character constant, a string literal, or a comment, the characters /* introduce a comment. The contents of such a comment are examined only to identify multibyte characters and to find the characters */ that terminate it.

A similar rule then applies to // comments.
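To make the rule concrete, here is a minimal sketch using the three literals from the question. The helper name `contains_marker` is made up for illustration; the point is only that the comment markers are still present in the strings at run time, so the compiler did not treat them as comments.

```c
#include <string.h>

/* The three literals from the question. Everything between the
   double quotes is string data, including the comment markers. */
const char *s1 = "like/*this";
const char *s2 = "like /*this is a comment */this";
const char *s3 = "like //this is not a comment";

/* Hypothetical helper: returns 1 if marker occurs in text, else 0. */
int contains_marker(const char *text, const char *marker) {
    return strstr(text, marker) != NULL;
}
```

Printing s2 with `puts(s2)` would show the whole text, including the part that looks like a comment.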

Also, there is a nice footnote pointing out that, because /* is not recognized inside a comment, comments do not nest.
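The non-nesting behaviour follows directly from "find the characters */ that terminate it": the first terminator wins. A small sketch of that search, assuming a hypothetical helper `block_comment_length` (not part of any compiler) that is handed text starting at the opening marker:

```c
#include <stddef.h>
#include <string.h>

/* Returns the length of the block comment at the start of text,
   including both delimiters, or 0 if text does not start with the
   opening marker or the comment is never terminated. */
size_t block_comment_length(const char *text) {
    if (strncmp(text, "/*", 2) != 0)
        return 0;
    /* Search only after the opening marker; a second opening marker
       inside the comment is skipped like any other text. */
    const char *end = strstr(text + 2, "*/");
    return end ? (size_t)(end - text) + 2 : 0;
}
```

For the input `/* outer /* inner */ tail */` the function reports 20 characters, so the comment ends after "inner" and ` tail */` is back in ordinary code.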

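Regarding the algorithm used by compilers... well, when tokenizing the input file, the compiler knows whether it is inside a string or not (it has to keep track of its own state), so it is easy to decide whether to switch to comment mode or not.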
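As a rough illustration of that state tracking, here is a hand-written scanner that removes comments while leaving string and character literals alone. This is only a sketch, not how GCC or any real compiler does it: the function name `strip_comments` and the state names are made up, and it ignores trigraphs, backslash-newline splicing and C++ raw string literals.

```c
#include <stddef.h>

enum state { CODE, STRING, CHARLIT, LINE_COMMENT, BLOCK_COMMENT };

/* Copies src to dst with comments removed. A block comment becomes
   one space; a line comment is dropped up to (not including) the
   newline. dst must have room for strlen(src) + 1 bytes. */
void strip_comments(const char *src, char *dst) {
    enum state st = CODE;
    size_t i = 0, o = 0;
    while (src[i] != '\0') {
        char c = src[i], next = src[i + 1];
        switch (st) {
        case CODE:
            if (c == '/' && next == '*') { st = BLOCK_COMMENT; i += 2; continue; }
            if (c == '/' && next == '/') { st = LINE_COMMENT; i += 2; continue; }
            if (c == '"') st = STRING;
            else if (c == '\'') st = CHARLIT;
            dst[o++] = c;
            break;
        case STRING:
        case CHARLIT:
            dst[o++] = c;
            /* An escaped character can never close the literal. */
            if (c == '\\' && next != '\0') { dst[o++] = next; i += 2; continue; }
            if ((st == STRING && c == '"') || (st == CHARLIT && c == '\''))
                st = CODE;
            break;
        case LINE_COMMENT:
            if (c == '\n') { st = CODE; dst[o++] = c; }
            break;
        case BLOCK_COMMENT:
            if (c == '*' && next == '/') { st = CODE; dst[o++] = ' '; i += 2; continue; }
            break;
        }
        i++;
    }
    dst[o] = '\0';
}
```

Comment markers are only looked for in the CODE state, which is exactly why the strings from the question survive unchanged.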

This is the lexical analysis phase of the compiler. For C, it is tied to preprocessing (so look into the libcpp/ directory of the GCC source code). Read more about parsing & abstract syntax trees.

You should read the Dragon Book, which gives an overview of compilation techniques (we can't explain them here in a few sentences).

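Lexical analysis is usually done with finite automata (which correspond to regular expressions). In many cases you can generate the lexer, e.g. using flex. The syntax analyzer can also be generated, e.g. using bison or ANTLR (this is related to stack automata, also called pushdown automata).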
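To show what "finite automaton corresponding to a regular expression" means here, below is a small table-driven DFA that accepts exactly one complete block comment, roughly the language of the regular expression `/\*([^*]|\*+[^*/])*\*+/`. A generator such as flex produces tables in this spirit, but the layout and all names below are invented for this sketch.

```c
/* Input classes: the automaton only cares about these three. */
enum { SLASH, STAR, OTHER, NCLASS };
/* States of the automaton. */
enum { START, SAW_SLASH, IN_COMMENT, SAW_STAR, ACCEPT, DEAD, NSTATE };

/* Transition table indexed by [state][input class]. */
static const unsigned char next_state[NSTATE][NCLASS] = {
    /*                SLASH       STAR        OTHER      */
    [START]      = { SAW_SLASH,  DEAD,       DEAD       },
    [SAW_SLASH]  = { DEAD,       IN_COMMENT, DEAD       },
    [IN_COMMENT] = { IN_COMMENT, SAW_STAR,   IN_COMMENT },
    [SAW_STAR]   = { ACCEPT,     SAW_STAR,   IN_COMMENT },
    [ACCEPT]     = { DEAD,       DEAD,       DEAD       },
    [DEAD]       = { DEAD,       DEAD,       DEAD       },
};

/* Returns 1 if s is exactly one complete block comment, else 0. */
int is_block_comment(const char *s) {
    int st = START;
    for (; *s != '\0'; s++) {
        int cls = (*s == '/') ? SLASH : (*s == '*') ? STAR : OTHER;
        st = next_state[st][cls];
    }
    return st == ACCEPT;
}
```

Note that the automaton has no state for "nested comment": a second opening marker inside the comment just keeps it in IN_COMMENT, which matches the no-nesting rule from the standard.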

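(BTW, current GCC 6 and 7 use hand-written lexers and parsers rather than generating them with e.g. flex & bison: first, to manage a lot of extra information, such as source locations and the way macros were expanded; also for better error messages; and perhaps for efficiency.)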

If you want detailed explanations about GCC, my MELT documentation web page contains a lot of references. Look also at the GCC internals documentation and of course download and study the source code of GCC. See also this.

c

compiler-construction