sre_constants.error: nothing to repeat in Jython
I have some HTML content,
and I want to extract the comments from it:
content = """<html>
<body>
<!--<h1>test</h1>-->
<!--<div>
<img src='x'>
</div>-->
Blockquote
<!--
<div>
<img src='xe'>
</div>
-->
</body>
</html>"""
I am using this regular expression:
regex_str = "<!--((\n|\r)+)?((.*?)+((\n|\r)+)?)+-->"
When I run this line in Python:
re.findall(regex_str,content)
it runs successfully.
But when I run it in Jython,
I get this error:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/share/java/jython-2.7.2.jar/Lib/re$py.class", line 177, in findall
File "/usr/share/java/jython-2.7.2.jar/Lib/re$py.class", line 242, in _compile
sre_constants.error: nothing to repeat
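To check quickly whether a given interpreter accepts a pattern, the compile step can be wrapped in a try/except that reports the message instead of raising. This is only a minimal sketch: `try_compile` is a hypothetical helper name, not part of the `re` module, and whether the pattern from the question compiles depends on the interpreter running it.

```python
import re

def try_compile(pattern):
    """Return (compiled_pattern, None) on success, or (None, message) on failure."""
    try:
        return re.compile(pattern), None
    except re.error as exc:  # re.error is the same class as sre_constants.error
        return None, str(exc)

# The pattern from the question; the result differs between interpreters.
question_pattern = "<!--((\n|\r)+)?((.*?)+((\n|\r)+)?)+-->"
compiled, message = try_compile(question_pattern)
if compiled is None:
    print("rejected: " + message)
else:
    print("accepted")
```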
Use:
<!--[\n\r]*([\w\W]*?)[\n\r]*-->
See the regex proof.
Explanation
--------------------------------------------------------------------------------
  <!--                     '<!--'
--------------------------------------------------------------------------------
  [\n\r]*                  any character of: '\n' (newline), '\r'
                           (carriage return) (0 or more times
                           (matching the most amount possible))
--------------------------------------------------------------------------------
  (                        group and capture to \1:
--------------------------------------------------------------------------------
    [\w\W]*?               any character of: word characters (a-z,
                           A-Z, 0-9, _), non-word characters (all
                           but a-z, A-Z, 0-9, _) (0 or more times
                           (matching the least amount possible))
--------------------------------------------------------------------------------
  )                        end of \1
--------------------------------------------------------------------------------
  [\n\r]*                  any character of: '\n' (newline), '\r'
                           (carriage return) (0 or more times
                           (matching the most amount possible))
--------------------------------------------------------------------------------
  -->                      '-->'
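As a rough illustration, the suggested pattern can be applied to the HTML from the question like this. The names `COMMENT_RE` and `comments` are only illustrative, and the pattern is written as a raw string so the backslashes reach the regex engine unchanged. Because the pattern has a single capture group, `findall` returns just the comment bodies.

```python
import re

content = """<html>
<body>
<!--<h1>test</h1>-->
<!--<div>
<img src='x'>
</div>-->
Blockquote
<!--
<div>
<img src='xe'>
</div>
-->
</body>
</html>"""

# One lazy capture group between the comment delimiters; line breaks
# directly after "<!--" and directly before "-->" are left out of the group.
COMMENT_RE = re.compile(r"<!--[\n\r]*([\w\W]*?)[\n\r]*-->")

comments = COMMENT_RE.findall(content)
for body in comments:
    print(repr(body))
# '<h1>test</h1>'
# "<div>\n<img src='x'>\n</div>"
# "<div>\n<img src='xe'>\n</div>"
```

The pattern avoids putting a quantifier on a group that can itself match the empty string, which the original expression does with `(.*?)+`.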