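I want to capture comments from a Java file using a Python script
For documentation purposes, I want to capture the comment that sits above each function's code.
I am able to iterate through the file to the function names. Once I reach a function-name line, I want to capture the comment above it.
The comments are in '/** xxx */' blocks: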
/**
* this is the comment
* this is the comment
* this is the comment
*/
@Attribute(type = Attribute.STRING.class)
String RESPONSE_TEXT = "responseText";
/**
* this is the comment
* this is the comment
*/
@Attribute(type = Attribute.LONG.class)
String TIME = "clTimestamp";
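This should work:
# file_name is the path to the Java source file
with open(file_name) as f:
    data = f.read()
# Split on the comment opener, then split every piece on the closer
old = data.split('/**')
data = list()
for i in old:
    data.extend(i.split('*/'))
# After both splits the comment bodies sit at the odd indices
comments = []
for i in range(1, len(data), 2):
    comments.append(data[i])
for k in comments:
    print(k)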
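The odd-index trick above relies on every '/**' being followed by exactly one '*/' and on no stray '*/' appearing elsewhere. A slightly more defensive sketch of the same split idea is shown here; the function name `split_javadoc_comments` and the sample text are illustrative assumptions, not part of the original answer.

```python
def split_javadoc_comments(source):
    """Return the text between each '/**' and the next '*/'."""
    comments = []
    # Everything before the first '/**' is not a comment, so skip it.
    for chunk in source.split('/**')[1:]:
        # The comment body is whatever precedes the first '*/'.
        body, sep, _rest = chunk.partition('*/')
        if sep:  # ignore an unterminated comment
            comments.append(body)
    return comments


# Hypothetical sample input, modelled on the snippet in the question.
sample = '''/**
 * first comment
 */
@Attribute(type = Attribute.STRING.class)
String RESPONSE_TEXT = "responseText";
/**
 * second comment
 */
@Attribute(type = Attribute.LONG.class)
String TIME = "clTimestamp";
'''

for comment in split_javadoc_comments(sample):
    print(comment.strip())
```

Using `str.partition` keeps only the text up to the first closer in each piece, so a stray '*/' before the first comment does not shift the indices.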
Now, since I know that the function-name line starts with @Attribute, this can be done easily with regular expressions (the re module), as follows:
import re
content = '''
/**
* this is the comment
* this is the comment
* this is the comment
*/
@Attribute(type = Attribute.STRING.class)
String RESPONSE_TEXT = "responseText";
/**
* this is the comment
* this is the comment
*/
@Attribute(type = Attribute.LONG.class)
String TIME = "clTimestamp";
'''
comments = re.findall(r'(/\*\*.*?\*/)\n(@Attribute[^\n]*)',content,re.DOTALL)
print('Function comments:')
for i in comments:
    print(i[1])  # the @Attribute line
    print(i[0])  # the comment block above it
    print('\n')
Output:
Function comments:
@Attribute(type = Attribute.STRING.class)
/**
* this is the comment
* this is the comment
* this is the comment
*/
@Attribute(type = Attribute.LONG.class)
/**
* this is the comment
* this is the comment
*/
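For clarity, I hard-coded content. I use re.findall with a pattern that has two groups, one for the comment and one for the name, so it returns a list of 2-tuples, each consisting of a comment and a function name. Note that re.DOTALL means .*? can match across multiple lines, and that characters with a special meaning in a pattern must be escaped, i.e. * is written as \*.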
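In a real source file the comment is usually indented, and other comments may appear that are not followed by @Attribute. The sketch below is an assumed variation on the pattern above, not the original answer's code: `\s*` tolerates whitespace between the comment and the annotation, and the `(?!\*/)` lookahead stops a match from running across the end of one comment into the next. The helper names `clean_comment` and `attribute_comments` and the sample text are hypothetical.

```python
import re

# Group 1: a /** ... */ block that cannot contain '*/' itself.
# Group 2: the @Attribute line that follows, after optional whitespace.
PATTERN = re.compile(r'(/\*\*(?:(?!\*/).)*?\*/)\s*(@Attribute[^\n]*)', re.DOTALL)


def clean_comment(raw):
    """Strip the /** */ delimiters and the leading '*' on each line."""
    body = raw[3:-2]  # drop '/**' and '*/'
    lines = [re.sub(r'^\s*\*?\s?', '', line) for line in body.splitlines()]
    return '\n'.join(line for line in lines if line.strip())


def attribute_comments(source):
    """Return (attribute_line, cleaned_comment) pairs."""
    return [(m.group(2).strip(), clean_comment(m.group(1)))
            for m in PATTERN.finditer(source)]


# Hypothetical sample: the first comment is not attached to an @Attribute.
sample = '''
    /** Not attached to an attribute. */
    int unrelated = 0;

    /**
     * Text of the response.
     */
    @Attribute(type = Attribute.STRING.class)
    String RESPONSE_TEXT = "responseText";
'''

for name, comment in attribute_comments(sample):
    print(name)
    print(comment)
```

Only the second comment is reported here, because the first one is followed by a plain declaration rather than an @Attribute line.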
def find_comment(n_array, start_string, end_string, add_index):
    # Locate the first comment opener; nothing to do if there is none.
    comment_index = n_array.find(start_string)
    if comment_index == -1:
        return n_array
    comment_end_index = n_array.find(end_string, comment_index + len(start_string))
    if comment_end_index == -1:
        return n_array  # unterminated comment: leave the text unchanged
    print(n_array[comment_index:comment_end_index + add_index])
    # Cut the comment out, then recurse to handle any remaining comments.
    n_array = n_array[:comment_index] + n_array[comment_end_index + add_index:]
    return find_comment(n_array, start_string, end_string, add_index)

# x holds the Java source text; block comments are handled first, then line comments.
x = find_comment(x, "/*", "*/", 2)
x = find_comment(x, "//", "\n", 0)
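Since the question starts from an already-located function-name line, a line-based alternative is to walk upward from that line until the comment block is found. This is only a sketch under stated assumptions: the helper name `comment_above` is hypothetical, the comment is assumed to end on its own line, and only blank lines or annotation lines (starting with '@') are allowed between the comment and the declaration.

```python
def comment_above(lines, index):
    """Return the /** ... */ block directly above lines[index], or None."""
    i = index - 1
    # Skip blank lines and annotations between the comment and the declaration.
    while i >= 0 and (not lines[i].strip() or lines[i].strip().startswith('@')):
        i -= 1
    if i < 0 or not lines[i].strip().endswith('*/'):
        return None  # no block comment ends directly above this line
    end = i
    # Walk upward until the line that opens the block.
    while i >= 0 and '/*' not in lines[i]:
        i -= 1
    if i < 0 or '/**' not in lines[i]:
        return None  # a plain /* ... */ block, or no opener at all
    return '\n'.join(lines[i:end + 1])


# Hypothetical sample, modelled on the snippet in the question.
source = '''/**
 * this is the comment
 */
@Attribute(type = Attribute.STRING.class)
String RESPONSE_TEXT = "responseText";
'''

lines = source.splitlines()
# Index 4 is the declaration line in this sample.
print(comment_above(lines, 4))
```

The same call works when the index points at the @Attribute line itself, because annotation lines above the starting line are skipped rather than required.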