How to retrieve HTML from an XML file?
I'm trying to get the HTML code out of an XML file, but all I get are individual elements.
XML example:
<?xml version="1.0" encoding="ISO-8859-1"?>
<websites>
<website name="1">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title/>
</head><body>Sample Content.....</body>
</html>
</website>
</websites>
I need a string that contains only the HTML, like this:
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title/>
</head><body>Sample Content.....</body>
</html>
You can use BeautifulSoup:
from bs4 import BeautifulSoup
example = """
<?xml version="1.0" encoding="ISO-8859-1"?>
<websites>
<website name="1">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title/>
</head><body>Sample Content.....</body>
</html>
</website>
</websites>
"""
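# Name the parser explicitly; Python's built-in "html.parser" needs no extra install.
soup = BeautifulSoup(example, "html.parser")
# find() returns the first <html> tag; printing it serializes the tag with all its children.
html = soup.find("html")
print(html)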
Output:
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title></title>
</head><body>Sample Content.....</body>
</html>
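If installing BeautifulSoup is not an option, the same extraction can probably be done with the standard library alone. The following is only a sketch of that alternative, not the approach from the answer above: `extract_html` and `XHTML_NS` are names made up for this example, and it assumes the `<html>` element is in the XHTML namespace, as in the sample.

```python
import xml.etree.ElementTree as ET

# Namespace declared on the <html> element in the sample document.
XHTML_NS = "http://www.w3.org/1999/xhtml"


def extract_html(xml_text: str) -> str:
    """Return the first XHTML <html> element in xml_text as a string."""
    # Serialize XHTML as the default namespace instead of an "ns0:" prefix.
    # Note that register_namespace changes module-wide state.
    ET.register_namespace("", XHTML_NS)
    # The XML declaration must come first in the document, so strip the
    # leading newline left by the triple-quoted literal.
    root = ET.fromstring(xml_text.strip())
    # Element names are namespace-qualified: "{uri}html", not plain "html".
    html = root.find(f".//{{{XHTML_NS}}}html")
    if html is None:
        raise ValueError("no XHTML <html> element found")
    # tostring() also emits the whitespace that follows </html>; strip it.
    return ET.tostring(html, encoding="unicode").strip()


example = """
<?xml version="1.0" encoding="ISO-8859-1"?>
<websites>
<website name="1">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title/>
</head><body>Sample Content.....</body>
</html>
</website>
</websites>
"""

print(extract_html(example))
```

Unlike the BeautifulSoup version, this keeps the document as XML, so the empty title is written in self-closing form (`<title />`) rather than as `<title></title>`.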