Refine String using Python/Regular Expression

Question

Please help me refine this string using Python/regex. It also contains large runs of whitespace.

/**
         * this is comment                this is comment
         * this is comment
         * <blank line>
         *      this is comment
         * this is comment
         * <blank line>
         * this is comment
         */

How can I get the plain text by removing the /** and * markers?

I want the output string to be:

this is comment
this is comment
this is comment
this is comment
this is comment

Answer 1

Now that it is clear the OP expects the comment "this is comment" six times, I would suggest using this regex,

^[ /*]+\n?| {2,}(.*(\n))

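and replacing each match with \2\1 (demo: https://regex101.com/r/biB7y2/4). Also, you really don't need three separate regexes (as in the other, accepted answer) to achieve this; a single regex is enough. Here is a Python demo:

import re

s = '''/**
         * this is comment                this is comment
         * this is comment
         * 
         *      this is comment
         * this is comment
         * 
         * this is comment
         */'''

# \2\1 puts the captured newline before the captured text, so a second
# comment on the same line moves onto its own line
print(re.sub(r'(?m)^[ /*]+\n?| {2,}(.*(\n))', r'\2\1', s))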

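This prints the following. Note that I have enabled multiline mode by putting (?m) at the start of the regex, as FailSafe suggested. Many thanks to him for that, since it was not obvious otherwise.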

this is comment
this is comment
this is comment
this is comment
this is comment
this is comment
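
To illustrate why (?m) matters for this pattern, here is a minimal sketch. The two-line string sample is a made-up, shortened input, not the one from the question. Without multiline mode, ^ anchors only at the very start of the string, so only the first line loses its prefix.

```python
import re

# Hypothetical two-line input, shortened from the comment block in the question.
sample = "   * first\n   * second"

prefix = r'^[ /*]+'

# Without (?m), ^ matches only at the start of the whole string.
without_m = re.sub(prefix, '', sample)

# With (?m), ^ also matches right after every newline.
with_m = re.sub('(?m)' + prefix, '', sample)

print(repr(without_m))  # 'first\n   * second'
print(repr(with_m))     # 'first\nsecond'
```

Passing flags=re.MULTILINE to re.sub has the same effect as the inline (?m).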

Let me know if you need an explanation of any part of my answer.
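
For readers who want the pattern spelled out, here is a sketch of the same regex in verbose mode. It assumes the replacement string is \2\1 (the captured newline followed by the captured text), and the short string s is an illustrative input rather than the one from the question.

```python
import re

# Same pattern in verbose mode. A bare space is ignored in verbose mode,
# so the literal space before {2,} is written as [ ].
pattern = re.compile(r'''
    ^[ /*]+ \n?      # line prefix: spaces, slashes, stars, plus the newline if nothing else is on the line
  |                  # or
    [ ]{2,}          # a gap of two or more spaces inside a line
    ( .* (\n) )      # group 1: rest of the line with its newline, group 2: that newline
''', re.MULTILINE | re.VERBOSE)

# Illustrative input: the second line holds two comments separated by a wide gap.
s = "/**\n   * a   b\n   * \n   * c\n   */"

# Assumed replacement: newline first, then the captured text. For the first
# alternative neither group takes part, so both expand to '' (Python 3.5+).
result = pattern.sub(r'\2\1', s)
print(repr(result))  # 'a\nb\nc\n'
```

The nested group is what lets one substitution both delete the prefixes and move the second comment on a line onto a line of its own.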

Answer 2

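You can use the sub() function from Python's re module to match the unwanted characters and format the input string. Here is a proof of concept that gives the output you want. You can test it here: https://repl.it/@glhr/regex-fun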

import re

inputStr = """/**
         * this is comment                this is comment
         * this is comment
         * 
         *      this is comment
         * this is comment
         * 
         * this is comment
         */"""

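formattedStr = re.sub(r"[*/]", "", inputStr)  # remove the comment markers * and /
formattedStr = re.sub(r"\n\s{2,}|\s{2,}", "\n", formattedStr)  # turn each run of 2+ whitespace characters into one newline
formattedStr = re.sub(r"^\n+|\n+$|\n{2,}", "", formattedStr)  # strip leading and trailing newlines and any run of 2+ newlines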
print(formattedStr)

You can experiment with regular expressions on sites such as https://regexr.com/.
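
The three substitutions in this answer can also be wrapped in a small reusable function. This is only a sketch: the name strip_block_comment and the sample string are hypothetical, and the regexes are the same three as above, written as raw strings.

```python
import re

def strip_block_comment(text):
    """Hypothetical helper: reduce a /** ... */ block to its text lines."""
    text = re.sub(r"[*/]", "", text)                 # remove the comment markers
    text = re.sub(r"\n\s{2,}|\s{2,}", "\n", text)    # whitespace runs become newlines
    text = re.sub(r"^\n+|\n+$|\n{2,}", "", text)     # strip leading/trailing newlines
    return text

# Made-up input with two comments on one line and an empty comment line.
sample = "/**\n   * one      two\n   * \n   * three\n   */"
print(strip_block_comment(sample))
# one
# two
# three
```

Note that the first substitution removes every * and / in the text, so this approach only fits comments whose body does not itself contain those characters.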
