How can I get the XML declaration string with lxml?

I am parsing an XML document with lxml. How can I get the declaration string?

 <?xml version="1.0" encoding="utf-8" ?> 

I want to check whether it is present, which encoding it declares, and which XML version it specifies.

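When you parse the document, the resulting ElementTree object should have a DocInfo object that holds information about the parsed XML or HTML document.

For XML, you are probably interested in the xml_version and encoding attributes of this DocInfo: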

>>> from lxml import etree
>>> tree = etree.parse('input.xml')
>>> tree.docinfo
<lxml.etree.DocInfo object at 0x7f8111f9ecc0>
>>> tree.docinfo.xml_version
'1.0'
>>> tree.docinfo.encoding
'UTF-8'

Maybe you should check whether a line containing the declaration text can be found in your XML file:

    def matchLine(path, line_number, text):
        """
        path = path of the file to be checked
        line_number = 1-based number of the line that will be checked
        text = string containing the text to match
        Returns True if that line (trailing whitespace stripped) equals text.
        """
        # "with" closes the file even when we return early
        with open(path) as file:
            for line_no, line_file in enumerate(file, start=1):
                if line_no == line_number:
                    return line_file.rstrip() == text
        # The file has fewer than line_number lines
        return False
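
An exact line comparison only tells you whether one particular spelling of the declaration is present. If you also want the version and encoding it declares, a regular expression over the start of the file is one option. The following is only a rough sketch using the standard library, not lxml: the names XML_DECL and read_xml_declaration are made up for this example, and it assumes the file uses an ASCII-compatible encoding such as UTF-8.

```python
import re

# Pattern for an XML declaration at the very start of a document.
# Assumption: the file uses an ASCII-compatible encoding (e.g. UTF-8),
# so the declaration can be matched directly on the raw bytes.
XML_DECL = re.compile(
    rb'<\?xml\s+version\s*=\s*(?P<q1>["\'])(?P<version>[^"\']*)(?P=q1)'
    rb'(?:\s+encoding\s*=\s*(?P<q2>["\'])(?P<encoding>[^"\']*)(?P=q2))?'
    rb'[^>]*?\?>'
)


def read_xml_declaration(data):
    """Return (declaration, version, encoding), or None if data has no declaration.

    data is the beginning of the document as bytes. encoding is None when
    the declaration does not name one.
    """
    # Skip a UTF-8 byte order mark if there is one
    if data.startswith(b'\xef\xbb\xbf'):
        data = data[3:]
    match = XML_DECL.match(data)
    if match is None:
        return None
    encoding = match.group('encoding')
    return (
        match.group(0).decode('ascii'),
        match.group('version').decode('ascii'),
        encoding.decode('ascii') if encoding is not None else None,
    )
```

To use it on a file, open the file in binary mode and pass in the first couple of hundred bytes, for example read_xml_declaration(f.read(200)). A None result means no declaration was found at the start of the file.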