Generating a Java Parser in Python 3 using ANTLR 4
I am using the Lexer and Parser grammars from here:
https://raw.githubusercontent.com/antlr/grammars-v4/master/java/JavaLexer.g4
https://raw.githubusercontent.com/antlr/grammars-v4/master/java/JavaParser.g4
I generated the Python3 targets with antlr-4.6:
java -jar ./antlr-4.6-complete.jar -Dlanguage=Python3 ./JavaLexer.g4
java -jar ./antlr-4.6-complete.jar -Dlanguage=Python3 ./JavaParser.g4
However, I am unable to run the compilationUnit() method on the generated parser. It fails with:
ipdb> parser.compilationUnit()
File "/home/sviyer/onmt-fresh/java/JavaParser.py", line 1063, in compilationUnit
localctx = JavaParser.CompilationUnitContext(self, self._ctx, self.state)
File "/home/sviyer/.conda/envs/allennlp/lib/python3.6/site-packages/antlr4/error/ErrorStrategy.py", line 223, in sync
raise InputMismatchException(recognizer)
antlr4.error.Errors.InputMismatchException: None
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "TestAntlr.py", line 13, in <module>
parser.compilationUnit()
File "/home/sviyer/onmt-fresh/java/JavaParser.py", line 1063, in compilationUnit
localctx = JavaParser.CompilationUnitContext(self, self._ctx, self.state)
File "/home/sviyer/.conda/envs/allennlp/lib/python3.6/site-packages/antlr4/error/ErrorStrategy.py", line 126, in reportError
self.reportInputMismatch(recognizer, e)
File "/home/sviyer/.conda/envs/allennlp/lib/python3.6/site-packages/antlr4/error/ErrorStrategy.py", line 266, in reportInputMismatch
+ " expecting " + e.getExpectedTokens().toString(recognizer.literalNames, recognizer.symbolicNames)
File "/home/sviyer/.conda/envs/allennlp/lib/python3.6/site-packages/antlr4/error/ErrorStrategy.py", line 522, in getTokenErrorDisplay
s = t.text
AttributeError: 'int' object has no attribute 'text'
The Lexer itself works fine, though. My code is:
stream = antlr4.InputStream(code)
lexer = JavaLexer(stream)
toks = antlr4.CommonTokenStream(lexer)
parser = JavaParser(stream)
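Your code is incorrect: the parser is constructed from the raw input stream (stream) instead of the token stream (toks). Try this: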
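from antlr4 import InputStream, CommonTokenStream
from JavaLexer import JavaLexer    # generated from JavaLexer.g4
from JavaParser import JavaParser  # generated from JavaParser.g4

with open('sample.java', 'r') as f:
    code = f.read()
codeStream = InputStream(code)
lexer = JavaLexer(codeStream)
# First lexing way: the token stream pulls tokens from the lexer on demand
tokensStream = CommonTokenStream(lexer)
parser = JavaParser(tokensStream)
# Second lexing way: lex everything up front, then wrap the token list
'''tokens = lexer.getAllTokens()
tokensSource = ListTokenSource(tokens)
tokensStream = CommonTokenStream(tokensSource)
parser = JavaParser(tokensStream)'''
tree = parser.compilationUnit()
print("Tree " + tree.toStringTree(recog=parser))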
Also, use the latest stable ANTLR version (4.7).
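To see why the original code ends in AttributeError: 'int' object has no attribute 'text', here is a toy model of the two kinds of stream. It assumes, as the traceback suggests, that look-ahead on a character stream yields integer code points while look-ahead on a token stream yields token objects. All class and function names below (ToyToken, ToyCharStream, ToyTokenStream, describe_lookahead, try_describe) are hypothetical stand-ins for illustration, not the ANTLR runtime's classes.

```python
# Toy stand-ins only: these are NOT the ANTLR runtime classes.

class ToyToken:
    """A token carries the matched text (real tokens carry more)."""
    def __init__(self, text):
        self.text = text


class ToyCharStream:
    """Mimics a character stream: look-ahead yields an int code point."""
    def __init__(self, data):
        self.data = data
        self.index = 0

    def LT(self, offset):
        return ord(self.data[self.index + offset - 1])


class ToyTokenStream:
    """Mimics a token stream: look-ahead yields a token object."""
    def __init__(self, tokens):
        self.tokens = tokens
        self.index = 0

    def LT(self, offset):
        return self.tokens[self.index + offset - 1]


def describe_lookahead(stream):
    """Roughly what an error reporter does: read the next symbol's text."""
    return stream.LT(1).text


def try_describe(stream):
    """Run describe_lookahead and turn an AttributeError into a string."""
    try:
        return describe_lookahead(stream)
    except AttributeError as exc:
        return "AttributeError: " + str(exc)


# Parser fed a character stream (the mistake in the question).
char_result = try_describe(ToyCharStream("class A {}"))
# Parser fed a token stream (the fix in the answer).
token_result = try_describe(ToyTokenStream([ToyToken("class"), ToyToken("A")]))

print(char_result)
print(token_result)
```

In this sketch the character-stream case fails in the same way as the traceback in the question, while the token-stream case returns the token text. This is why the parser needs to be given the CommonTokenStream rather than the InputStream.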