Jsoup.clean() leaves tags open and unclosed
The following code replaces this text: <br /> with <br>:
    String removeDisallowedTags(String textToEscape) {
        Whitelist whitelist = Whitelist.none();
        whitelist.addTags("b", "br", "font");
        return Jsoup.clean(textToEscape, whitelist);
    }
Why?
By default, Jsoup.clean() treats the input as HTML, and in HTML <br> is allowed to have no closing tag. The same is true of <img>.
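To make the HTML/XML difference concrete, here is a minimal sketch that uses only the JDK's XML parser (not jsoup). The class name WellFormedCheck, the helper isWellFormedXml, and the <root> wrapper element are assumptions made for illustration; none of them come from the original post. It shows that <br /> is accepted by an XML parser while a bare <br> is not.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

class WellFormedCheck {

    // Hypothetical helper: wraps the fragment in a <root> element and
    // reports whether the JDK's XML parser accepts it as well-formed.
    static boolean isWellFormedXml(String fragment) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors without printing to stderr.
            builder.setErrorHandler(new DefaultHandler());
            String wrapped = "<root>" + fragment + "</root>";
            builder.parse(new InputSource(new StringReader(wrapped)));
            return true;
        } catch (SAXException e) {
            // Thrown when the input is not well-formed XML.
            return false;
        } catch (Exception e) {
            // Parser configuration or I/O problems are not expected here.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Self-closed tag: fine in XML.
        System.out.println(isWellFormedXml("line one<br />line two")); // true
        // Bare <br>: fine in HTML, but an unclosed element in XML.
        System.out.println(isWellFormedXml("line one<br>line two"));   // false
    }
}
```

A check like this can also be pointed at the cleaned string to confirm that it is safe to embed in an XML document.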
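You have to have the result written out as XML. With XML output syntax the tags stay closed, and jsoup will even close them for you. A fixed method with a few extra settings: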
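    import java.nio.charset.StandardCharsets;
    import org.jsoup.Jsoup;
    import org.jsoup.nodes.Document.OutputSettings;
    import org.jsoup.safety.Whitelist;

    String cleanXmlAndRemoveUnwantedTags(String textToEscape) {
        Whitelist whitelist = Whitelist.none();
        // allowedTags is not defined in this snippet; it is assumed to be a
        // String[] of permitted tag names declared elsewhere.
        whitelist.addTags(allowedTags);
        // XML syntax makes jsoup emit <br /> instead of <br>.
        OutputSettings outputSettings = new OutputSettings()
                .syntax(OutputSettings.Syntax.xml)
                .charset(StandardCharsets.UTF_8)
                .prettyPrint(false);
        // The empty string is the base URI used to resolve relative links.
        return Jsoup.clean(textToEscape, "", whitelist, outputSettings);
    }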