Lark Parser raising error when evaluating print statement after if/else statement
I'm building a programming language in Python, using the lark library for parsing. When I parse the following:
if 5 == 4 {
    print("TRUE");
}
else {
    print("FALSE");
}
print("Done!");
it raises the following error:
PS E:\ParserAndLexer> & C:/Python38/python.exe e:/ParserAndLexer/lite/lite_transformer.py
Traceback (most recent call last):
  File "C:\Python38\lib\site-packages\lark\lexer.py", line 416, in lex
    for x in l.lex(stream, self.root_lexer.newline_types, self.root_lexer.ignore_types):
  File "C:\Python38\lib\site-packages\lark\lexer.py", line 200, in lex
    raise UnexpectedCharacters(stream, line_ctr.char_pos, line_ctr.line, line_ctr.column, allowed=allowed, state=self.state, token_history=last_token and [last_token])
lark.exceptions.UnexpectedCharacters: No terminal defined for 'p' at line 7 col 1

print("HI");
^

Expecting: {'IF'}

Previous tokens: Token('RBRACE', '}')

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "e:/ParserAndLexer/lite/lite_transformer.py", line 80, in <module>
    tree = parser.parse(lite_code)
  File "C:\Python38\lib\site-packages\lark\lark.py", line 464, in parse
    return self.parser.parse(text, start=start)
  File "C:\Python38\lib\site-packages\lark\parser_frontends.py", line 148, in parse
    return self._parse(token_stream, start, set_parser_state)
  File "C:\Python38\lib\site-packages\lark\parser_frontends.py", line 63, in _parse
    return self.parser.parse(input, start, *args)
  File "C:\Python38\lib\site-packages\lark\parsers\lalr_parser.py", line 35, in parse
    return self.parser.parse(*args)
  File "C:\Python38\lib\site-packages\lark\parsers\lalr_parser.py", line 86, in parse
    for token in stream:
  File "C:\Python38\lib\site-packages\lark\indenter.py", line 32, in _process
    for token in stream:
  File "C:\Python38\lib\site-packages\lark\lexer.py", line 431, in lex
    raise UnexpectedToken(t, e.allowed, state=e.state)
lark.exceptions.UnexpectedToken: Unexpected token Token('NAME', 'print') at line 7, column 1.
Expected one of:
    * IF
I don't understand why this happens. My code is roughly as follows:
from lark import Lark, Transformer, v_args
from lark.indenter import Indenter

class MainIndenter(Indenter):
    NL_type = '_NL'
    OPEN_PAREN_types = ['LPAR', 'LBRACE']
    CLOSE_PAREN_types = ['RPAR', 'RBRACE']
    INDENT_TYPE = '_INDENT'
    DEDENT_type = '_DEDENT'
    tab_len = 8

@v_args(inline=True)
class MainTransformer(Transformer):
    def __init__(self):
        ...

    def number(self, value):
        return Integer(value)

    def string(self, value):
        value = str(value).strip('"')
        return String(value)

    def div(self, val1, val2):
        return Div(val1, val2)

    def print_statement(self, value):
        return Print(value)

    def if_statement(self, expr1, expr2, eval_expr):
        return If(expr1, expr2, eval_expr)

    def if_else_statement(self, expr1, expr2, eval_expr, else_statement):
        return If(expr1, expr2, eval_expr, else_statement)

    def if_statements(self, *values):
        for value in values:
            value.eval()

    def statement(self, *values):
        for value in values:
            value.eval()

grammar = '''
    ?start: expr*
          | statement* -> statement
          | if* -> if_statements

    ?if : "if" expr "==" expr "{" statement+ "}" -> if_statement
        | "if" expr "==" expr "{" expr+ "}" -> if_statement
        | "if" expr "==" expr "{" statement+ "}" "else" "{" statement+ "}" -> if_else_statement

    ?statement: "print" "(" expr ")" ";" -> print_statement
              | "input" "(" expr ")" ";" -> input_statement
              | NAME "=" expr ";" -> assign_var
              | NAME "=" "input" "(" expr ")" ";" -> var_input_statement

    ?expr: STRING -> string
         | NUMBER -> number
         | NAME -> get_var

    %import common.ESCAPED_STRING -> STRING
    %import common.NUMBER
    %import common.CNAME -> NAME
    %declare _INDENT _DEDENT
    %import common.WS_INLINE
    %ignore WS_INLINE
    %import common.NEWLINE -> _NL
    %ignore _NL
'''

class Print():
    def __init__(self, value):
        self.value = value

    def eval(self):
        return print(self.value.eval())

class Input():
    def __init__(self, value):
        self.value = value

    def eval(self):
        return input(self.value.eval())

class String():
    def __init__(self, value):
        self.value = str(value).strip('"')

    def eval(self):
        return self.value

class Integer():
    def __init__(self, value):
        self.value = int(value)

    def eval(self):
        return self.value

class If():
    def __init__(self, expr1, expr2, eval_expr, else_statement=None):
        self.expr1 = expr1
        self.expr2 = expr2
        self.eval_expr = eval_expr
        self.else_statement = else_statement

    def eval(self):
        if self.expr1.eval() == self.expr2.eval():
            return self.eval_expr.eval()
        else:
            if self.else_statement == None:
                return
            else:
                return self.else_statement.eval()

parser = Lark(grammar, parser='lalr', postlex=MainIndenter())

test_input = '''
if 5 == 5 {
    print("True");
}
else {
    print("False");
}
print("Done");
'''

if __name__ == '__main__':
    tree = parser.parse(test_input)
    print(MainTransformer().transform(tree))
I'm not familiar with lark, but this looks wrong:
?start: expr*
      | statement* -> statement
      | if* -> if_statements
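This says "expand the start rule to zero or more expr, zero or more statement, or zero or more if". In other words, your grammar doesn't support mixing the three productions together, which is what the source string you're trying to parse does. If the program starts with an if, the rest of it must consist only of ifs, so a statement such as print("DONE"); is not allowed after it (the error message says as much: it expects another if).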
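The same distinction shows up in regular expressions, so it can be checked without lark at all. The following is only an analogy, not the parser itself: it assumes a made-up one-letter encoding of top-level items (E = expr, S = statement, I = if), and the names ONE_KIND_ONLY, ANY_MIX and accepts are invented for the illustration.

```python
import re

# Hypothetical one-letter encoding of top-level items, for illustration only:
# E = expr, S = statement, I = if
ONE_KIND_ONLY = re.compile(r"E*|S*|I*")  # same shape as: expr* | statement* | if*
ANY_MIX = re.compile(r"(?:E|S|I)*")      # same shape as: (expr | statement | if)*

def accepts(pattern, program):
    """Return True if the whole encoded program matches the pattern."""
    return pattern.fullmatch(program) is not None

# An if/else block followed by a print statement, as in the question.
program = "IS"

print(accepts(ONE_KIND_ONLY, program))  # False: kinds cannot be mixed
print(accepts(ANY_MIX, program))        # True: any sequence of kinds is fine
```

A program made of a single kind, such as "II", is accepted by both patterns; only the mixed sequence separates them.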
You can fix this as follows:
?start: stmt*
?stmt: expr
     | statement -> statement
     | if -> if_statements
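This grammar says "expand the start rule to zero or more stmt, where a stmt is defined as an expr, a statement, or an if". This way you can mix and match the three kinds of productions.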
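Poor naming choices aside, the grammar still has other obvious shortcomings after this short-term fix, such as not supporting nested if blocks. Since your end goal isn't clear, I'll avoid making assumptions and focus only on your immediate problem.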