Handling end of file
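I am trying to write a reentrant C++ parser with bison. I followed the Complete C++ Example, and I am using bison 3.5.1 and flex 2.6.4 (the versions available via apt-get on buster).
Once I had finished, I ran into a problem in the lexer: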
‘make_YYEOF’ is not a member of ‘yy::parser’
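My workaround was to declare %token YYEOF. It then compiles, but the parser reports syntax error, unexpected YYEOF, expecting $end on apparently valid input.
I then reduced the grammar to the single rule unit: IDENTIFIER YYEOF, but the parser still reports the same error.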
/*
The grammar file parser.yy starts by asking for the C++ deterministic
parser skeleton, the creation of the parser header file.
Because the C++ skeleton changed several times, it is safer to require
the version you designed the grammar for.
*/
// %skeleton "lalr1.cc" // -*- C++ -*-
%require "3.0"
%language "c++"
// %header
/* Because our scanner returns only genuine tokens and never simple characters
(e.g., it returns ‘PLUS’, not ‘'+'’), we can avoid conversions. */
%define api.token.raw
/* This example uses genuine C++ objects as semantic values, therefore,
we require the variant-based storage of semantic values. To make sure
we properly use it, we enable assertions. To fully benefit from type-safety
and more natural definition of “symbol”, we enable api.token.constructor. */
%define api.token.constructor
%define api.value.type variant
%define parse.assert
/* Then come the declarations/inclusions needed by the semantic values.
Because the parser uses the parsing driver and reciprocally, both would
like to include the header of the other, which is, of course, insane.
This mutual dependency will be broken using forward declarations.
Because the driver’s header needs detailed knowledge about the parser class
(in particular its inner types), it is the parser’s header which will use
a forward declaration of the driver.
See https://www.gnu.org/software/bison/manual/html_node/_0025code-Summary.html.
Code qualifies for C/C++: requires, provides, top.
Unqualified code will be equivalent to %{ %} for most purposes */
%code requires {
# include <string>
class driver;
}
/* The driver is passed by reference to the parser and to the scanner.
This provides a simple but effective pure interface, not relying on global variables.
*/
%param { driver& drv }
%locations
/* Use the following two directives to enable parser tracing and detailed error messages.
However, detailed error messages can contain incorrect information if lookahead correction
is not enabled (see LAC https://www.gnu.org/software/bison/manual/html_node/LAC.html). */
%define parse.trace
%define parse.error verbose
%define parse.lac full // none or full
/** The code between ‘%code {’ and ‘}’ is output in the *.cc file;
it needs detailed knowledge about the driver. */
%code {
# include "driver.hh"
}
/* User friendly names are provided for each symbol.
To avoid name clashes in the generated files (see Calc++ Scanner),
prefix tokens with TOK_ */
%define api.token.prefix {TOK_}
%token YYEOF;
/* Since we use variant-based semantic values, %union is not used,
and %token, %nterm and %type expect genuine types, not type tags. */
%token <std::string> IDENTIFIER "identifier"
/* No %destructor is needed to enable memory deallocation during error recovery;
the memory, for strings for instance, will be reclaimed by the regular destructors.
All the values are printed using their operator<< (see Printing Semantic Values
https://www.gnu.org/software/bison/manual/html_node/Printer-Decl.html).
*/
%printer { yyo << $$; } <*>;
/* The grammar itself is straightforward (see Location Tracking Calculator: ltcalc
https://www.gnu.org/software/bison/manual/html_node/Location-Tracking-Calc.html). */
%%
%start unit;
unit: IDENTIFIER YYEOF // assignments exp { drv.result = ; };
%%
/* Finally the error member function reports the errors. */
void
yy::parser::error (const location_type& l, const std::string& m)
{
  std::cerr << l << ": " << m << '\n';
}
driver.cc
#include "driver.hh"
#include "parser.hh"
driver::driver ()
  : trace_parsing (false), trace_scanning (false) {}

int driver::parse (const std::string &f)
{
  file = f;
  location.initialize (&file);
  scan_begin ();
  yy::parser parse (*this);
  parse.set_debug_level (trace_parsing);
  int res = parse ();
  scan_end ();
  return res;
}
driver.hh
/*
To support a pure interface with the parser (and the scanner) the technique of the
"parsing context" is convenient: a structure containing all the data to exchange.
Since, in addition to simply launch the parsing, there are several auxiliary tasks
to execute (open the file for scanning, instantiate the parser etc.), we recommend
transforming the simple parsing context structure into a fully blown parsing driver class.
The declaration of this driver class, in driver.hh, is as follows. The first part includes
the CPP guard and imports the required standard library components,
and the declaration of the parser class.
*/
#ifndef DRIVER_HH
# define DRIVER_HH
# include <string>
# include <map>
# include "parser.hh"
/*
Then comes the declaration of the scanning function. Flex expects the signature
of yylex to be defined in the macro YY_DECL, and the C++ parser expects it to be
declared. We can factor both as follows.
*/
// Give Flex the prototype of yylex we want ...
# define YY_DECL \
yy::parser::symbol_type yylex (driver& drv)
// ... and declare it for the parser's sake.
YY_DECL;
/* The driver class is then declared with its most obvious members. */
// Conducting the whole scanning and parsing of Calc++.
class driver
{
public:
  driver ();
  std::map<std::string, int> variables;
  int result;
  // Run the parser on file F. Return 0 on success.
  int parse (const std::string& f);
  // The name of the file being parsed.
  std::string file;
  // Whether to generate parser debug traces.
  bool trace_parsing;
  // Handling the scanner.
  void scan_begin ();
  void scan_end ();
  // Whether to generate scanner debug traces.
  bool trace_scanning;
  // The token's location used by the scanner.
  yy::location location;
};
#endif // ! DRIVER_HH
id.cc
#include <iostream>
#include "driver.hh"
int main (int argc, char *argv[])
{
  int res = 0;
  driver drv;
  for (int i = 1; i < argc; ++i)
    if (argv[i] == std::string ("-p"))
      drv.trace_parsing = true;
    else if (argv[i] == std::string ("-s"))
      drv.trace_scanning = true;
    else if (!drv.parse (argv[i]))
      std::cout << drv.result << '\n';
    else
      res = 1;
  return res;
}
bison -d parser.yy -o parser.cc
flex -o lex.yy.cc lexer.l
g++ -std=c++11 *.cc -o parser
echo b | ./parser -p -s -
This gives me
Starting parse
Entering state 0
Reading a token: --(end of buffer or a NUL)
--accepting rule at line 66 ("b")
Next token is token "identifier" (-:1.1: b)
Shifting token "identifier" (-:1.1: b)
Entering state 1
Reading a token: --(end of buffer or a NUL)
--accepting rule at line 65 ("
")
--(end of buffer or a NUL)
--EOF (start condition 0)
Next token is token YYEOF (-:2.1: )
Shifting token YYEOF (-:2.1: )
Entering state 3
Reducing stack by rule 1 (line 86):
   $1 = token "identifier" (-:1.1: b)
   $2 = token YYEOF (-:2.1: )
-> $$ = nterm unit (-:1.1-2.0: )
Stack now 0
Entering state 2
Reading a token: --(end of buffer or a NUL)
--EOF (start condition 0)
Next token is token YYEOF (-:2.1: )
LAC: initial context established for YYEOF
LAC: checking lookahead YYEOF: Err
LAC: checking lookahead $end: S4
LAC: checking lookahead YYEOF: Err
LAC: checking lookahead "identifier": Err
-:2.1: syntax error, unexpected YYEOF, expecting $end
Error: popping nterm unit (-:1.1-2.0: )
Stack now 0
Cleanup: discarding lookahead token YYEOF (-:2.1: )
What do I have to change to handle end of input gracefully?
Thanks
I solved the problem by declaring
%token YYEOF 0
I then searched for an explanation and found this:
A token kind code of zero is returned if the end-of-input is encountered. (Bison recognizes any nonpositive value as indicating end-of-input.)
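To make the quoted convention concrete, here is a minimal C++ sketch of a hand-written scanner, independent of the Bison-generated code in the question. The names ToyScanner, scan_all, and the TokenKind values are hypothetical and chosen only for illustration. The one thing taken from the quote is that 0 (or any nonpositive value) means end of input.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical token kinds for this sketch. Only the value 0 for
// end-of-input mirrors the convention quoted above.
enum TokenKind { END_OF_INPUT = 0, IDENTIFIER = 1, UNKNOWN = 2 };

class ToyScanner {
public:
  explicit ToyScanner(std::string text) : text_(std::move(text)) {}

  // Return the kind of the next token, or 0 once the input is exhausted.
  // Calling it again after the end keeps returning 0.
  int next() {
    while (pos_ < text_.size() && is_space(text_[pos_])) ++pos_;
    if (pos_ >= text_.size()) return END_OF_INPUT;
    if (is_alpha(text_[pos_])) {
      while (pos_ < text_.size() && is_alpha(text_[pos_])) ++pos_;
      return IDENTIFIER;
    }
    ++pos_;  // consume one unrecognized character
    return UNKNOWN;
  }

private:
  static bool is_space(char c) {
    return std::isspace(static_cast<unsigned char>(c)) != 0;
  }
  static bool is_alpha(char c) {
    return std::isalpha(static_cast<unsigned char>(c)) != 0;
  }

  std::string text_;
  std::size_t pos_ = 0;
};

// Collect token kinds until the scanner reports a nonpositive value,
// which the consumer treats as end of input.
std::vector<int> scan_all(const std::string& text) {
  ToyScanner scanner(text);
  std::vector<int> kinds;
  for (int kind = scanner.next(); kind > 0; kind = scanner.next())
    kinds.push_back(kind);
  return kinds;
}
```

For the input "b\n" from the question, scan_all collects a single IDENTIFIER and then stops, because the next call to the scanner returns 0.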
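If you are using bison 3.5.1, you also need to use the documentation for bison 3.5.1. You are trying to apply the documentation for the latest bison release (currently 3.8.1), and the C++ interface has changed in a few details, one of which is how you signal the end-of-input token.
In bison 3.5.1, you have to give the end-of-input token a name with an explicit declaration that uses token number 0, for example:
%token YYEOF 0
(In the actual calc++ example shipped with bison 3.5.1, the name END is used rather than YYEOF. Other examples use other names.)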
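The effect of the explicit number can be pictured with a small C++ model. This is a simplified sketch and not Bison's implementation: TokenTable, declare, and is_end_of_input are hypothetical names, and the numbers handed out for ordinary tokens are arbitrary. It only illustrates why a name declared without the number 0 denotes an ordinary token that the parser cannot accept as end of input.

```cpp
#include <map>
#include <string>

// Simplified, hypothetical model of token numbering (not Bison's code).
class TokenTable {
public:
  // Declare NAME without an explicit number: a fresh positive number is
  // allocated, so NAME denotes an ordinary token.
  int declare(const std::string& name) {
    return numbers_[name] = next_free_++;
  }

  // Declare NAME with an explicit number, as in "%token YYEOF 0".
  int declare(const std::string& name, int number) {
    return numbers_[name] = number;
  }

  // Look up the number of a previously declared name.
  int number_of(const std::string& name) const { return numbers_.at(name); }

private:
  std::map<std::string, int> numbers_;
  int next_free_ = 1;  // arbitrary starting point for this sketch
};

// In this model the parser treats only nonpositive token numbers as
// end of input.
bool is_end_of_input(int token_number) { return token_number <= 0; }
```

In this model, declare("YYEOF") returns a positive number, so a scanner that returns it at end of file hands the parser an ordinary token while the parser is still waiting for end of input. That matches the "unexpected YYEOF, expecting $end" message in the trace. declare("YYEOF", 0) instead names the end-of-input token itself.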
Once you have declared the token name, Bison will generate the corresponding make_ member function.
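Sometime in early 2020, YYEOF was introduced as a built-in name for the end-of-input token. Since then, an explicit %token declaration is no longer necessary, and make_YYEOF is generated automatically. That is why the current version of the example program does not contain the declaration.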
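You might want to consider updating your bison installation. If you don't, you should install the bison-doc package, which includes some examples (in /usr/share/doc/bison-doc/examples) and the documentation in several formats (pdf, html and info) for the same version of bison that you have installed. Once that is done, you should be able to view the documentation locally with the command info bison or by pointing your browser at file:///usr/share/doc/bison-doc/html/index.html.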