Spring Jaxb2: How to append batch data to an XML file without reading it into memory?

I need to write data to an XML file in batches.

I have the following domain objects:

@XmlRootElement(name = "country")
public class Country {
    @XmlElements({@XmlElement(name = "town", type = Town.class)})
    private Collection<Town> towns = new ArrayList<>();
    ....
}

and:

@XmlRootElement(name = "town")
public class Town {
    @XmlElement
    private String townName;
    // etc
}

I am marshalling the objects with Jaxb2. The configuration is as follows:

marshaller = new Jaxb2Marshaller();
marshaller.setClassesToBeBound(Country.class, Town.class);

Simple marshalling does not work here: calling marshaller.marshal(fileName, country) for each batch produces malformed XML.
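To illustrate why naive appending fails: if every batch is marshalled as a complete document and written to the end of the same file, the file ends up with more than one root element, and XML parsers reject that. Below is a minimal sketch that checks this with the JDK's StAX parser only. The class name `WellFormedCheck` and the sample strings are made up for the illustration and are not part of the original code.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

public class WellFormedCheck {

    // Returns true if the text parses as one well-formed XML document.
    public static boolean isWellFormed(String xml) {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        try {
            XMLStreamReader reader = factory.createXMLStreamReader(new StringReader(xml));
            // Pull every event; the parser throws as soon as it hits invalid structure.
            while (reader.hasNext()) {
                reader.next();
            }
            reader.close();
            return true;
        } catch (XMLStreamException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Hypothetical output of two separate marshalling calls.
        String batch1 = "<country><town><townName>A</townName></town></country>";
        String batch2 = "<country><town><townName>B</townName></town></country>";

        System.out.println("single batch:  " + isWellFormed(batch1));
        System.out.println("two appended:  " + isWellFormed(batch1 + batch2));
    }
}
```

The second check fails because the concatenated text contains two `country` roots, which is what happens when whole documents are appended one after another.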

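Is there a way to tune the marshaller so that it creates the file with all the marshalled data if the file does not exist, or appends the data to the end of the XML file if it does?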

Also, since this file may be huge, I do not want to read the whole file into memory, append the data, and then write it back to disk.

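I used StAX for the XML processing because it is stream-based, consumes less memory than DOM, and can both read and write XML, whereas SAX can only parse XML and cannot write it.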
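As a small illustration of the writing side of StAX, here is a sketch that serializes one town as an XML fragment with `XMLStreamWriter`. The class name `TownFragmentWriter` and the method `townFragment` are hypothetical, and the element names simply follow the `Town` class shown above.

```java
import java.io.StringWriter;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

public class TownFragmentWriter {

    // Serializes one town as a fragment: no XML declaration and no <country> wrapper.
    public static String townFragment(String townName) {
        StringWriter out = new StringWriter();
        try {
            XMLStreamWriter writer = XMLOutputFactory.newInstance().createXMLStreamWriter(out);
            writer.writeStartElement("town");
            writer.writeStartElement("townName");
            // writeCharacters escapes markup characters such as '&' and '<'.
            writer.writeCharacters(townName);
            writer.writeEndElement(); // </townName>
            writer.writeEndElement(); // </town>
            writer.flush();
            writer.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException("Could not write town fragment", e);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(townFragment("Foo & Bar"));
    }
}
```

Nothing is buffered beyond the current element, so memory use stays flat no matter how many towns are written.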

Here is the approach I came up with:

public enum StAXBatchWriter {
    INSTANCE;
    private static final Logger LOGGER = LoggerFactory.getLogger(StAXBatchWriter.class);

    public void writeUrls(File original, Collection<Town> towns) {
        XMLEventReader eventReader = null;
        XMLEventWriter eventWriter = null;
        try {
            String originalPath = original.getPath();
            // Build the backup path from the parent File so a file without a parent directory also works.
            File from = new File(original.getParentFile(), "old-" + original.getName());
            boolean isRenamed = original.renameTo(from);
            if (!isRenamed)
                throw new IllegalStateException("Failed to rename file: " + original.getPath() + " to " + from.getPath());
            File to = new File(originalPath);

            XMLInputFactory inFactory = XMLInputFactory.newInstance();
            eventReader = inFactory.createXMLEventReader(new FileInputStream(from));

            XMLOutputFactory outFactory = XMLOutputFactory.newInstance();
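            // Write in an explicit encoding (UTF-8 assumed here) instead of FileWriter's platform default.
            eventWriter = outFactory.createXMLEventWriter(new FileOutputStream(to), "UTF-8");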

            XMLEventFactory eventFactory = XMLEventFactory.newInstance();

            while (eventReader.hasNext()) {
                XMLEvent event = eventReader.nextEvent();
                eventWriter.add(event);
                // Match the root element by its exact local name; contains("country") would also match names such as "countryCode".
                if (event.isStartElement() && "country".equals(event.asStartElement().getName().getLocalPart())) {
                    for (Town town : towns) {
                        writeTown(eventWriter, eventFactory, town);
                    }
                }
            }
            boolean isDeleted = from.delete();
            if (!isDeleted)
                throw new IllegalStateException("Failed to delete old file: " + from.getPath());
        } catch (IOException | XMLStreamException e) {
            LOGGER.error(e.getMessage(), e);
            throw new RuntimeException(e);
        } finally {
            try {
                if (eventReader != null)
                    eventReader.close();
            } catch (XMLStreamException e) {
                LOGGER.error(e.getMessage(), e);
            }
            try {
                if (eventWriter != null)
                    eventWriter.close();
            } catch (XMLStreamException e) {
                LOGGER.error(e.getMessage(), e);
            }
        }
    }

    private void writeTown(XMLEventWriter eventWriter, XMLEventFactory eventFactory, Town town) throws XMLStreamException {
        eventWriter.add(eventFactory.createStartElement("", null, "town"));

        // write town id
        eventWriter.add(eventFactory.createStartElement("", null, "id"));
        eventWriter.add(eventFactory.createCharacters(town.getId()));
        eventWriter.add(eventFactory.createEndElement("", null, "id"));

        //write town name
        if (StringUtils.isNotEmpty(town.getName())) {
            eventWriter.add(eventFactory.createStartElement("", null, "name"));
            eventWriter.add(eventFactory.createCharacters(town.getName()));
            eventWriter.add(eventFactory.createEndElement("", null, "name"));
        }

        // write other fields

        eventWriter.add(eventFactory.createEndElement("", null, "town"));
    }
}

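This is not the best approach. Although it is stream-based and works, it has some overhead: every time a batch is added, the old file has to be read again in full.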

It would be nice if there were an option to append data at a given position in the file (something like "append data to that file after 4 line"), but it seems this cannot be done.
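One possible way to append at a position without re-reading the file is `java.io.RandomAccessFile`: read only a short tail of the file, find the closing root tag, and overwrite it with the new elements followed by a fresh closing tag. The sketch below is not a JAXB or Spring feature. The class `TailAppender` and its method are hypothetical, and it rests on several assumptions: the file is UTF-8, the closing tag is written exactly as `</country>` and sits within the last 1024 bytes, only whitespace follows it, and the fragment passed in is already valid, escaped XML.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

public class TailAppender {

    // Assumption: the closing root tag lies within this many bytes of the end of the file.
    private static final int TAIL_BYTES = 1024;

    // Overwrites the closing root tag with `fragment` followed by a new closing tag.
    public static void appendBeforeClosingTag(File file, String rootName, String fragment) {
        byte[] closing = ("</" + rootName + ">").getBytes(StandardCharsets.UTF_8);
        try (RandomAccessFile raf = new RandomAccessFile(file, "rw")) {
            // Read only the tail, never the whole file.
            int tailSize = (int) Math.min(raf.length(), TAIL_BYTES);
            long tailStart = raf.length() - tailSize;
            byte[] tail = new byte[tailSize];
            raf.seek(tailStart);
            raf.readFully(tail);

            int index = lastIndexOf(tail, closing);
            if (index < 0) {
                throw new IllegalStateException("Closing tag not found near the end of " + file);
            }
            // Position on the closing tag and write the new content over it.
            raf.seek(tailStart + index);
            raf.write(fragment.getBytes(StandardCharsets.UTF_8));
            raf.write(closing);
            // Drop any leftover bytes in case the new content is shorter than what it replaced.
            raf.setLength(raf.getFilePointer());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Returns the last position of `needle` in `haystack`, or -1 if absent.
    private static int lastIndexOf(byte[] haystack, byte[] needle) {
        for (int i = haystack.length - needle.length; i >= 0; i--) {
            boolean match = true;
            for (int j = 0; j < needle.length && match; j++) {
                match = haystack[i + j] == needle[j];
            }
            if (match) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // Usage: java TailAppender <path-to-xml-file>
        appendBeforeClosingTag(new File(args[0]), "country",
                "<town><townName>Example</townName></town>");
    }
}
```

The cost of each append depends on the size of the batch rather than the size of the file. The trade-off is that the file is edited in place as bytes, so a crash in the middle of a write can leave it without a closing tag, whereas the rename-and-copy approach keeps the old file until the new one is complete.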