Replace template tags with Velocity using regex in Java
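I need to replace some custom template tags in an HTML document with Velocity code, so that the output can be used in another application.
The code I receive looks like this: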
<BG SWITCH>param
<BG CASE>value1:
text 1
<BG /CASE>
<BG CASE>value2:
text 2
<BG /CASE>
<BG CASE>value3:
text 3
<BG /CASE>
<BG CASE>value4:
text 4
<BG /CASE>
<BG /SWITCH>
The result should look like this:
#set ($param = $!user.data.param)
#if($!param == "value1")
text 1
#end
#if($!param == "value2")
text 2
#end
#if($!param == "value3")
text 3
#end
#if($!param == "value4")
text 4
#end
My current Java code looks like this:
Pattern pattern = Pattern.compile("<BG SWITCH>(.+?)<BG CASE>(.+?):(.+?)<BG /CASE>", Pattern.DOTALL);
Matcher matcher = pattern.matcher(newHtml);
newHtml = matcher.replaceAll("#set \\(\\$$1 = \\$!user.data.$1) #if(\\$$1 == \"$2\") $3 #end");
The result I get is:
#set ($param = $!user.data.param)
#if($param == "value1")
text 1
#end
<BG CASE>value2:
text 2
<BG /CASE>
<BG CASE>value3:
text 3
<BG /CASE>
<BG CASE>value4:
text 4
<BG /CASE>
<BG /SWITCH>
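So the "switch" and the first "case" are replaced correctly, but I don't know how to replace the remaining cases.
Any ideas?
Edit after svasa's answer:
The suggested solution works well for the sample code I provided, but I should have mentioned that it was only a fragment of the actual file to be processed. The actual file may contain several of these switch-case blocks, and the suggested code does not fully cover that situation.
When the source looks like this: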
<div>
Some other content before
</div>
<div>
<BG SWITCH>param1
<BG CASE>value1:
text 1
<BG /CASE>
<BG CASE>value2:
text 2
<BG /CASE>
<BG CASE>value3:
text 3
<BG /CASE>
<BG CASE>value4:
text 4
<BG /CASE>
<BG /SWITCH>
</div>
<div>
Some other content between
</div>
<div>
<BG SWITCH>param2
<BG CASE>value1:
text 1
<BG /CASE>
<BG CASE>value2:
text 2
<BG /CASE>
<BG CASE>value3:
text 3
<BG /CASE>
<BG CASE>value4:
text 4
<BG /CASE>
<BG /SWITCH>
</div>
<div>
Some other content after
</div>
then I get the following result:
#set ($param1 = $!user.data.param1)
#if($!param1== "value1")
text 1
#end
#if($!param1== "value2")
text 2
#end
#if($!param1== "value3")
text 3
#end
#if($!param1== "value4")
text 4
#end
#if($!param1== "value1")
text 1
#end
#if($!param1== "value2")
text 2
#end
#if($!param1== "value3")
text 3
#end
#if($!param1== "value4")
text 4
#end
So any other content before, between, and after the switch-case blocks is lost, and the first "param" is reused for the second switch-case block. How would I solve this?
Before you look at the solution below: your input looks like XML, but it is not, because tags such as <BG /SWITCH> are not valid XML. Are you sure this is the correct input? Every other element, and the overall structure of the string, looks like XML. Check with whoever produces the file whether it is meant to be XML. If it is, extracting whatever you want with Java's built-in XPath support is a breeze. Otherwise, extracting the desired strings with regex, as I did below, is excruciatingly painful.
Steps:
- First, use a regex to find each complete switch block inside the <div> elements:
(<BG SWITCH>(.*)\n\s*(\s*<BG CASE>(.*):\n\s*(.*)\n\s*<BG \/CASE>)+\n\s*<BG \/SWITCH>)
- Then pass each string found to the regexes I used in my earlier answer: first get the top line, i.e. param, and then the remaining strings value1:, text 1, and so on.
For param, the regex is:
<BG SWITCH>(.*)
For the value and text strings, you can use the regex:
<BG CASE>(.*):\n?(\s+(.*))\n?\s+<BG /CASE>
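The two inner patterns can be tried on a single block before wiring everything together. Below is a minimal sketch: the class name CaseRegexDemo and the helper extract are hypothetical, and the sample block is assumed to have no indentation. It uses the per-case pattern with the colon placed after the first group, so the captured value does not include it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CaseRegexDemo {
    // Captures the parameter name on the <BG SWITCH> line.
    static final Pattern PARAM = Pattern.compile("<BG SWITCH>(.*)");
    // Captures the case value (group 1) and its text (group 3).
    // Backslashes are doubled because this is a Java string literal.
    static final Pattern CASE =
            Pattern.compile("<BG CASE>(.*):\\n?(\\s+(.*))\\n?\\s+<BG /CASE>");

    // Returns "value -> text" for every case found in one switch block.
    static List<String> extract(String block) {
        List<String> pairs = new ArrayList<>();
        Matcher m = CASE.matcher(block);
        while (m.find()) {
            pairs.add(m.group(1) + " -> " + m.group(3));
        }
        return pairs;
    }

    public static void main(String[] args) {
        String block = "<BG SWITCH>param\n"
                + "<BG CASE>value1:\ntext 1\n<BG /CASE>\n"
                + "<BG CASE>value2:\ntext 2\n<BG /CASE>\n"
                + "<BG /SWITCH>";
        Matcher p = PARAM.matcher(block);
        if (p.find()) {
            System.out.println(p.group(1)); // param
        }
        System.out.println(extract(block)); // [value1 -> text 1, value2 -> text 2]
    }
}
```

Because `.` does not match line terminators by default, each `(.*)` stays on its own line, which is what keeps one case from swallowing the next.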
The code looks like this:
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// sb holds the source document. Backslashes are doubled for Java string literals.
String mainRegex = "(<BG SWITCH>(.*)\\n\\s*(\\s*<BG CASE>(.*):\\n\\s*(.*)\\n\\s*<BG /CASE>)+\\n\\s*<BG /SWITCH>)";
String result = "";
Pattern p = Pattern.compile( mainRegex );
Matcher m = p.matcher( sb.toString() );
while ( m.find() )
{
    // Each match is one complete switch block.
    String toSearch = m.group();
    result += updateResult( toSearch );
}
System.out.println(result);

// Converts one switch block to Velocity; returns "" if no <BG SWITCH> line is found.
private static String updateResult( String searchString )
{
    String replaced = "";
    Pattern firstPattern = Pattern.compile("<BG SWITCH>(.*)");
    Matcher matcher = firstPattern.matcher(searchString);
    if ( matcher.find() )
    {
        String paramtext = matcher.group(1);
        replaced = "#set ($" + paramtext + " = $!user.data." + paramtext + ")";
        replaced = replaced + "\n";
        Pattern pattern = Pattern.compile("<BG CASE>(.*):\\n?(\\s+(.*))\\n?\\s+<BG /CASE>");
        matcher = pattern.matcher(searchString);
        while ( matcher.find() )
        {
            String valueString = matcher.group(1);
            // Group 3 is the text without the leading whitespace that group 2 includes.
            String textString = matcher.group(3);
            replaced = replaced + "#if($!" + paramtext + "== \"" + valueString + "\")" + "\n" + textString + "\n#end" + "\n";
        }
    }
    return replaced;
}
Output of the last println statement:
#set ($param1 = $!user.data.param1)
#if($!param1== "value1")
text 1
#end
#if($!param1== "value2")
text 2
#end
#if($!param1== "value3")
text 3
#end
#if($!param1== "value4")
text 4
#end
#set ($param2 = $!user.data.param2)
#if($!param2== "value1")
text 1
#end
#if($!param2== "value2")
text 2
#end
#if($!param2== "value3")
text 3
#end
#if($!param2== "value4")
text 4
#end
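The code above prints only the converted blocks, so the HTML before, between, and after them is not carried over. One way to keep it is Matcher.appendReplacement together with appendTail, which copy the unmatched text through unchanged. This is a minimal sketch rather than the answer's own method: the class name SwitchToVelocity, the method convert, and the simplified patterns are assumptions, and it further assumes that parameter names consist of word characters and that case values contain no colon.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SwitchToVelocity {
    // One whole switch block: group 1 = parameter name, group 2 = everything up to <BG /SWITCH>.
    private static final Pattern SWITCH =
            Pattern.compile("<BG SWITCH>\\s*(\\w+)(.*?)<BG /SWITCH>", Pattern.DOTALL);
    // One case entry: group 1 = value (up to the first colon), group 2 = text.
    private static final Pattern CASE =
            Pattern.compile("<BG CASE>\\s*(.*?):(.*?)<BG /CASE>", Pattern.DOTALL);

    static String convert(String html) {
        Matcher sw = SWITCH.matcher(html);
        StringBuffer out = new StringBuffer();
        while (sw.find()) {
            String param = sw.group(1);
            StringBuilder block = new StringBuilder();
            block.append("#set ($").append(param)
                 .append(" = $!user.data.").append(param).append(")\n");
            Matcher c = CASE.matcher(sw.group(2));
            while (c.find()) {
                block.append("#if($!").append(param).append(" == \"")
                     .append(c.group(1).trim()).append("\")\n")
                     .append(c.group(2).trim()).append("\n#end\n");
            }
            // Copies the text since the previous match, then the generated block.
            // quoteReplacement keeps '$' from being read as a group reference.
            sw.appendReplacement(out, Matcher.quoteReplacement(block.toString().trim()));
        }
        // Copies whatever follows the last switch block.
        sw.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        String html = "<div>\nSome other content before\n</div>\n"
                + "<BG SWITCH>param1\n<BG CASE>value1:\ntext 1\n<BG /CASE>\n<BG /SWITCH>\n"
                + "<div>\nSome other content between\n</div>\n"
                + "<BG SWITCH>param2\n<BG CASE>value1:\ntext 1\n<BG /CASE>\n<BG /SWITCH>\n"
                + "<div>\nSome other content after\n</div>";
        System.out.println(convert(html));
    }
}
```

Each block is converted with its own parameter name because the name is read from the current match, and the surrounding <div> content survives because nothing outside a match is rewritten.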