How can I get the XML declaration string with lxml?
I use lxml to parse an XML document. How can I get the declaration string?
<?xml version="1.0" encoding="utf-8" ?>
I want to check whether it is present, which encoding it declares, and which XML version.
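When you parse a document, the resulting ElementTree object should have a DocInfo object that contains information about the parsed XML or HTML document.
For XML, you are probably interested in the xml_version and encoding attributes of this DocInfo: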
>>> from lxml import etree
>>> tree = etree.parse('input.xml')
>>> tree.docinfo
<lxml.etree.DocInfo object at 0x7f8111f9ecc0>
>>> tree.docinfo.xml_version
'1.0'
>>> tree.docinfo.encoding
'UTF-8'
Alternatively, you could check whether a line with that declaration's value can be found in your XML file:
def matchLine(path, line_number, text):
    """
    path = used for defining the file to be checked
    line_number = used to identify the line that will be checked
    text = string containing the text to match
    """
    # A context manager makes sure the file is closed again.
    with open(path) as file:
        # Count lines from 1; blank lines no longer end the scan early.
        for line_no, line_file in enumerate(file, start=1):
            if line_no == line_number:
                return line_file.rstrip() == text
    # The file has fewer than line_number lines.
    return False
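If what you need is to know whether the declaration is literally present in the file, comparing a whole line is fairly brittle (spacing and quote style can vary). One option is to look at the raw bytes before parsing. The following is only a minimal sketch using the standard library, not an lxml feature: read_xml_declaration and _XML_DECL are hypothetical names, and it assumes an ASCII-compatible encoding such as UTF-8, so it will not recognise UTF-16 files.

```python
import re

# Hypothetical pattern (not part of lxml): an XML declaration at the very
# start of the data, with optional encoding and standalone pseudo-attributes.
_XML_DECL = re.compile(
    rb'^<\?xml\s+version\s*=\s*(["\'])(?P<version>[^"\']+)\1'
    rb'(?:\s+encoding\s*=\s*(["\'])(?P<encoding>[A-Za-z][A-Za-z0-9._-]*)\3)?'
    rb'(?:\s+standalone\s*=\s*(["\'])(?P<standalone>yes|no)\5)?'
    rb'\s*\?>'
)


def read_xml_declaration(data):
    """Return (version, encoding, standalone) taken from the declaration in
    the bytes `data`, or None if the data does not start with a declaration.

    Assumption: the data is in an ASCII-compatible encoding.
    """
    # Skip a UTF-8 byte order mark if there is one.
    if data.startswith(b'\xef\xbb\xbf'):
        data = data[3:]
    match = _XML_DECL.match(data)
    if match is None:
        return None
    # Pseudo-attributes that are absent stay None.
    return tuple(
        value.decode('ascii') if value is not None else None
        for value in match.group('version', 'encoding', 'standalone')
    )


sample = b'<?xml version="1.0" encoding="utf-8" ?>\n<root/>'
print(read_xml_declaration(sample))      # ('1.0', 'utf-8', None)
print(read_xml_declaration(b'<root/>'))  # None
```

For a file on disk you could pass the first few hundred bytes, read in binary mode, for example open(path, 'rb').read(512), and then parse the document with lxml as usual.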