Write XML with JSoup
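I have parsed an XML file with JSoup, and now I want to write the (modified) object to a new XML file.
The problem is that JSoup adds a bunch of HTML wrapper markup (html, head and body elements) to the output.
It should start like this: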
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE score-partwise PUBLIC "-//Recordare//DTD MusicXML 2.0 Partwise//EN" "http://www.musicxml.org/dtds/partwise.dtd">
<score-partwise>
<identification>
<encoding>
But it actually starts like this:
<!--?xml version="1.0" encoding="UTF-8"?--><!DOCTYPE score-partwise PUBLIC "-//Recordare//DTD MusicXML 2.0 Partwise//EN" "http://www.musicxml.org/dtds/partwise.dtd">
<html>
<head></head>
<body>
<score-partwise>
<identification>
<encoding>
<software>
MuseScore 1.3
</software>
<encoding-date>
2015-01-31
</encoding-date>
</encoding>
<source>http://musescore.com/score/161981
</identification>
<defaults>
<scaling>
<millimeters>
7.056
</millimeters>
<tenths>
40
</tenths>
</scaling>
<page-layout>
<page-height>
1683.67
</page-height>
<page-width>
1190.48
</page-width>
I load the file like this:
if (doc.getElementsByTag("note").isEmpty()) {
    doc = Jsoup.parse(input, "UTF-16", filename);
    if (doc.getElementsByTag("note").isEmpty()) {
        System.out.println("Please check that your file is encoded in UTF-8 or UTF-16 and contains notes.");
    }
}
And I tried writing it like this:
BufferedWriter htmlWriter = new BufferedWriter(new OutputStreamWriter(new FileOutputStream("output.xml"), "UTF-8"));
htmlWriter.write(doc.outerHtml());
-> I also tried doc.html() and doc.toString(). Still the same output.
Any ideas? I just want it written out the same way it was read in.
This solved it: parse with Parser.xmlParser(), so JSoup treats the input as XML instead of HTML:
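// Needs org.jsoup.Jsoup, org.jsoup.parser.Parser, java.io.FileInputStream and java.io.InputStream.
// Parser.xmlParser() keeps the XML tree as-is instead of wrapping it in <html>, <head> and <body>.
try (InputStream is = new FileInputStream(filename)) {
    doc = Jsoup.parse(is, "UTF-8", "", Parser.xmlParser());
}
if (doc.getElementsByTag("note").isEmpty()) {
    // Retry as UTF-16 on a fresh stream; the first stream has already been consumed.
    try (InputStream is = new FileInputStream(filename)) {
        doc = Jsoup.parse(is, "UTF-16", "", Parser.xmlParser());
    }
    if (doc.getElementsByTag("note").isEmpty()) {
        System.out.println("Please check that your file is encoded in UTF-8 or UTF-16 and contains notes.");
    }
}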
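For anyone who wants the whole round trip in one place, here is a minimal sketch of parsing with the XML parser and writing the result back out. The class name XmlRoundTrip, the roundTrip helper and the shortened sample document are made up for illustration, and turning off pretty-printing is my own choice to keep JSoup from adding line breaks, not something the code above does.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.parser.Parser;

import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical helper class, for illustration only.
class XmlRoundTrip {

    // Parses XML text with JSoup's XML parser and serializes it back to a string.
    static String roundTrip(String xml) {
        // Parser.xmlParser() stops JSoup from building an html/head/body skeleton.
        Document doc = Jsoup.parse(xml, "", Parser.xmlParser());
        // Assumption: disable pretty-printing so no extra whitespace is added.
        doc.outputSettings().prettyPrint(false);
        return doc.outerHtml();
    }

    public static void main(String[] args) throws IOException {
        // Shortened sample input, not the full MusicXML file from the question.
        String xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
                + "<score-partwise><identification><encoding>"
                + "<software>MuseScore 1.3</software>"
                + "</encoding></identification></score-partwise>";

        // Files.write opens, writes and closes the file, so nothing is left unflushed.
        Files.write(Paths.get("output.xml"),
                roundTrip(xml).getBytes(StandardCharsets.UTF_8));
    }
}
```

Any edits to the document (for example changing a note element) would go between the parse call and outerHtml(). Writing through Files.write also sidesteps the unclosed BufferedWriter in the question, which can leave the output file empty or truncated.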