How to make XPath return an empty string if a node or child node is not present in XML in Java
I have an XML file, "sample.xml", containing 4 records.
<?xml version='1.0' encoding='UTF-8'?>
<hello xmlns:show="http://www.example.com" xmlns:css="http://www.example.com" xml_version="2.0">
<entry id="2008-0001">
<show:id>2008-0001</show:id>
<show:published-datetime>2008-01-15T15:00:00.000-05:00</show:published-datetime>
<show:last-modified-datetime>2012-03-19T00:00:00.000-04:00</show:last-modified-datetime>
<show:css>
<css:metrics>
<css:score>3.6</css:score>
<css:access-vector>LOCAL</css:access-vector>
<css:authentication>NONE</css:authentication>
<css:generated-on-datetime>2008-01-15T15:22:00.000-05:00</css:generated-on-datetime>
</css:metrics>
</show:css>
<show:summary>This is first entry.</show:summary>
</entry>
<entry id="2008-0002">
<show:id>2008-0002</show:id>
<show:published-datetime>2008-02-11T20:00:00.000-05:00</show:published-datetime>
<show:last-modified-datetime>2014-03-15T23:22:37.303-04:00</show:last-modified-datetime>
<show:css>
<css:metrics>
<css:score>5.8</css:score>
<css:access-vector>NETWORK</css:access-vector>
<css:authentication>NONE</css:authentication>
<css:generated-on-datetime>2008-02-12T10:12:00.000-05:00</css:generated-on-datetime>
</css:metrics>
</show:css>
<show:summary>This is second entry.</show:summary>
</entry>
<entry id="2008-0003">
<show:id>2008-0003</show:id>
<show:published-datetime>2009-03-26T06:12:08.780-04:00</show:published-datetime>
<show:last-modified-datetime>2009-03-26T06:12:09.313-04:00</show:last-modified-datetime>
<show:summary>This is 3rd entry with missing "css" tag and their metrics.</show:summary>
</entry>
<entry id="2008-0004">
<show:id>CVE-2008-0004</show:id>
<show:published-datetime>2008-01-11T19:46:00.000-05:00</show:published-datetime>
<show:last-modified-datetime>2011-09-06T22:41:45.753-04:00</show:last-modified-datetime>
<show:css>
<css:metrics>
<css:score>4.3</css:score>
<css:access-vector>NETWORK</css:access-vector>
<css:authentication>NONE</css:authentication>
<css:generated-on-datetime>2008-01-14T09:37:00.000-05:00</css:generated-on-datetime>
</css:metrics>
</show:css>
<show:summary>This is 4th entry.</show:summary>
</entry>
</hello>
and one Java file, "Test.java":
import java.io.File;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpression;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
public class Test {
    public static void main(String[] args) {
        List<String> list = new ArrayList<String>();
        File fXmlFile = new File("/home/ankit/sample.xml");
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        try
        {
            DocumentBuilder dBuilder = factory.newDocumentBuilder();
            Document doc = dBuilder.parse(fXmlFile);
            doc.getDocumentElement().normalize();
            NodeList nList = doc.getElementsByTagName("entry");
            XPathFactory xPathfactory = XPathFactory.newInstance();
            XPath xpath = xPathfactory.newXPath();
            for (int i = 0; i < nList.getLength(); i++)
            {
                XPathExpression expr1 = xpath.compile("//hello/entry/css/metrics/score");
                NodeList nodeList1 = (NodeList) expr1.evaluate(doc, XPathConstants.NODESET);
                if(nodeList1.item(i)!=null)
                {
                    Node currentItem = nodeList1.item(i);
                    if(!currentItem.getTextContent().isEmpty())
                    {
                        list.add(currentItem.getTextContent());
                    }
                }
            }
        }
        catch(Exception e)
        {
            e.printStackTrace();
        }
        System.out.println("size----"+list.size());
        for(int i=0;i<list.size();i++)
        {
            System.out.println("list----"+list.get(i));
        }
    }
}
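I need to read the <entry> tags from the XML, and I am using XPath for that. The XML file has 4 <entry> tags, and each of them contains a <show:css> tag, except the 3rd <entry>, where <show:css> is missing. I want to put the score values from those css tags into a list. When I run this Java code, it stores the first 2 values in the list and then puts the 4th entry's css score in the 3rd position.
I want a list as output whose first, second and fourth elements are "3.6", "5.8" and "4.3", and whose third element is an empty string or null. Instead I get only 3 elements in the list: the values from entries 1, 2 and 4.
I need the empty string "" in the third position and the original value in the fourth. In other words, if the tag is not present, a blank value should be put in the list.
Current output - ["3.6", "5.8", "4.3"]
Expected output - ["3.6", "5.8", "", "4.3"]
Can anyone help me solve this?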
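There are probably several ways to achieve this...
My basic approach is to find all entry nodes, both those that have a css/metrics/score child and those that do not (you could simply select all entry nodes, but this demonstrates the power of the query language).
Something like...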
XPathExpression expr1 = xPath.compile("//hello/entry[css/metrics/score or not(css/metrics/score)]");
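I know the condition is redundant. My hope was that the OP would see that the predicate can be extended with additional conditions. Thanks to everyone for pointing it out, even though I had already mentioned it... hopefully we can all move on.
Then loop over the resulting NodeList and query each entry Node for its css/metrics/score node. If the result is null, add a null value (or any other value you want) to the list, for example...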
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
Document doc = dbf.newDocumentBuilder().parse(JavaApplication908.class.getResourceAsStream("/Hello.xml"));
XPathFactory xf = XPathFactory.newInstance();
XPath xPath = xf.newXPath();
XPathExpression expr1 = xPath.compile("//hello/entry[css/metrics/score or not(css/metrics/score)]");
XPathExpression expr2 = xPath.compile("css/metrics/score");
List<String> values = new ArrayList<>(25);
NodeList nodeList1 = (NodeList) expr1.evaluate(doc, XPathConstants.NODESET);
for (int index = 0; index < nodeList1.getLength(); index++) {
    Node node = nodeList1.item(index);
    System.out.println(node.getAttributes().getNamedItem("id"));
    Node css = (Node) expr2.evaluate(node, XPathConstants.NODE);
    if (css != null) {
        values.add(css.getTextContent());
    } else {
        values.add(null);
    }
}
for (String value : values) {
    System.out.println(value);
}
This outputs...
id="2008-0001"
id="2008-0002"
id="2008-0003"
id="2008-0004"
3.6
5.8
null
4.3
(The first four lines are the id attributes of the entry nodes; the last four lines are the resulting css/metrics/score values.)
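Since the question asks for an empty string rather than null, a variation on the loop above is to evaluate the relative expression with XPathConstants.STRING, which yields "" when no node matches. The following is only a sketch under a few assumptions: the class name ScoreStrings, the SAMPLE document and the prefix x are made up for illustration, and the parser is switched to namespace-aware mode so the prefixed elements can be addressed through a NamespaceContext.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpression;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

class ScoreStrings {

    // Shortened stand-in for sample.xml (hypothetical): the second entry has no css block.
    static final String SAMPLE =
          "<hello xmlns:show='http://www.example.com' xmlns:css='http://www.example.com'>"
        + "<entry id='1'><show:css><css:metrics><css:score>3.6</css:score></css:metrics></show:css></entry>"
        + "<entry id='2'/>"
        + "<entry id='3'><show:css><css:metrics><css:score>4.3</css:score></css:metrics></show:css></entry>"
        + "</hello>";

    // Returns one element per <entry>; "" where the score node is missing.
    static List<String> scores(String xml) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true);
            Document doc = dbf.newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));

            XPath xPath = XPathFactory.newInstance().newXPath();
            // Bind the illustrative prefix "x" to the namespace URI used in the document.
            xPath.setNamespaceContext(new NamespaceContext() {
                @Override
                public String getNamespaceURI(String prefix) {
                    return "x".equals(prefix) ? "http://www.example.com" : XMLConstants.NULL_NS_URI;
                }
                @Override
                public String getPrefix(String uri) {
                    return null;
                }
                @Override
                public Iterator<String> getPrefixes(String uri) {
                    return Collections.emptyIterator();
                }
            });

            XPathExpression entries = xPath.compile("/hello/entry");
            XPathExpression score = xPath.compile("x:css/x:metrics/x:score");

            List<String> values = new ArrayList<>();
            NodeList nodes = (NodeList) entries.evaluate(doc, XPathConstants.NODESET);
            for (int i = 0; i < nodes.getLength(); i++) {
                // STRING conversion of an empty node-set is the empty string.
                values.add((String) score.evaluate(nodes.item(i), XPathConstants.STRING));
            }
            return values;
        } catch (Exception e) {
            throw new IllegalStateException("could not evaluate XPath", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(scores(SAMPLE)); // prints [3.6, , 4.3]
    }
}
```

Evaluating per entry keeps the list positions aligned with the entries, and the STRING return type removes the need for a null check.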
I'm not an XPath expert, but looking at your code, I think you are just missing a few lines:
if(nodeList1.item(i)!=null)
{
    Node currentItem = nodeList1.item(i);
    if(!currentItem.getTextContent().isEmpty())
    {
        list.add(currentItem.getTextContent());
    }
    else
        list.add("");
}
else
    list.add("");
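One thing to watch with the snippet above: nodeList1 is a flat list of every score in the document, so index i does not correspond to entry i once an entry has no score. With the sample data the padding would be expected to land at the end of the list rather than in the third position. The sketch below contrasts that with a lookup scoped to each entry, using plain DOM calls instead of XPath. The class name PerEntryLookup, its method names and the shortened SAMPLE document are hypothetical, and namespace-aware parsing is assumed so that getElementsByTagNameNS can be used.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

class PerEntryLookup {

    // Shortened stand-in for sample.xml (hypothetical): the third entry has no css block.
    static final String SAMPLE =
          "<hello xmlns:show='http://www.example.com' xmlns:css='http://www.example.com'>"
        + "<entry id='1'><show:css><css:metrics><css:score>3.6</css:score></css:metrics></show:css></entry>"
        + "<entry id='2'><show:css><css:metrics><css:score>5.8</css:score></css:metrics></show:css></entry>"
        + "<entry id='3'/>"
        + "<entry id='4'><show:css><css:metrics><css:score>4.3</css:score></css:metrics></show:css></entry>"
        + "</hello>";

    static Document parse(String xml) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true); // needed for getElementsByTagNameNS
            return dbf.newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    // Indexes a document-wide list of scores by entry position and pads with "".
    static List<String> flatWithPadding(Document doc) {
        NodeList entries = doc.getElementsByTagName("entry");
        NodeList scores = doc.getElementsByTagNameNS("*", "score");
        List<String> out = new ArrayList<>();
        for (int i = 0; i < entries.getLength(); i++) {
            out.add(scores.item(i) != null ? scores.item(i).getTextContent() : "");
        }
        return out;
    }

    // Looks for a score inside each entry, so positions stay aligned with entries.
    static List<String> perEntry(Document doc) {
        NodeList entries = doc.getElementsByTagName("entry");
        List<String> out = new ArrayList<>();
        for (int i = 0; i < entries.getLength(); i++) {
            NodeList scores = ((Element) entries.item(i)).getElementsByTagNameNS("*", "score");
            out.add(scores.getLength() > 0 ? scores.item(0).getTextContent() : "");
        }
        return out;
    }

    public static void main(String[] args) {
        Document doc = parse(SAMPLE);
        System.out.println(flatWithPadding(doc)); // prints [3.6, 5.8, 4.3, ]
        System.out.println(perEntry(doc));        // prints [3.6, 5.8, , 4.3]
    }
}
```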
@MathiasMüller could you please let me know how it can be done in 1 expression in XPath 2.0. – ankit
The equivalent XPath 2.0 expression is
for $x in //entry return (if ($x//*:score) then $x//*:score else '')
It relies heavily on constructs newly introduced in XPath 2.0. The output will be
3.6
5.8
[Empty string]
4.3
But note that most XPath implementations currently support only 1.0. You can try this XPath 2.0 expression in an XSLT stylesheet online here, on a site that uses Saxon 9.5 EE.
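For readers limited to XPath 1.0, for example through the JDK's built-in javax.xml.xpath, the for/if constructs are not available, but the loop can be moved into Java. The namespace wildcard *:score can be approximated with a local-name() test. This is a rough sketch rather than an exact equivalent of the 2.0 expression: the class name LocalNameScores and the SAMPLE document are invented for illustration, and namespace-aware parsing is assumed.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpression;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

class LocalNameScores {

    // Shortened stand-in for sample.xml (hypothetical): the second entry has no score.
    static final String SAMPLE =
          "<hello xmlns:show='http://www.example.com' xmlns:css='http://www.example.com'>"
        + "<entry id='1'><show:css><css:metrics><css:score>3.6</css:score></css:metrics></show:css></entry>"
        + "<entry id='2'><show:summary>no score here</show:summary></entry>"
        + "<entry id='3'><show:css><css:metrics><css:score>4.3</css:score></css:metrics></show:css></entry>"
        + "</hello>";

    static List<String> scores(String xml) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true);
            Document doc = dbf.newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));

            XPath xPath = XPathFactory.newInstance().newXPath();
            // Plays the role of "for $x in //entry" in the XPath 2.0 expression.
            XPathExpression entries = xPath.compile("//entry");
            // string() of an empty node-set is "", which stands in for the else '' branch.
            XPathExpression score = xPath.compile("string(.//*[local-name()='score'])");

            List<String> values = new ArrayList<>();
            NodeList nodes = (NodeList) entries.evaluate(doc, XPathConstants.NODESET);
            for (int i = 0; i < nodes.getLength(); i++) {
                // evaluate(Object) returns the result converted to a String.
                values.add(score.evaluate(nodes.item(i)));
            }
            return values;
        } catch (Exception e) {
            throw new IllegalStateException("could not evaluate XPath", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(scores(SAMPLE)); // prints [3.6, , 4.3]
    }
}
```

Unlike the 2.0 expression, string() returns only the first matching score in document order, which is enough here because each entry has at most one.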