Convert XML to a hash using Nokogiri but keep the anchor tags
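I have an XML file like the one below and want to parse it into a Ruby hash.
I tried the code shown under "Code" below, but it strips the anchor tags,
and I end up with a description like this:
"Today is a "
How can I convert the XML to a hash but keep the anchor tags?
Code: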
@doc = File.open(xml_file) { |f| Nokogiri::XML(f) }
data = Hash.from_xml(@doc.to_s)
XML file:
<blah>
<tag>
<name>My Name</name>
<url>www.url.com</url>
<file>myfile.zip</file>
<description>Today is a <a href="www.sunny.com">sunny</a></description>
</tag>
<tag>
<name>Someones Name</name>
<url>www.url2.com</url>
<file>myfile2.zip</file>
<description>Today is a <a href="www.rainy.com">rainy</a></description>
</tag>
</blah>
The only way I can see right now is to escape the HTML inside every <description>
element in the document and then call Hash.from_xml:
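require "cgi"
require "nokogiri"
require "active_support"
require "active_support/core_ext/hash/conversions" # provides Hash.from_xml

doc = File.open(xml_file) { |f| Nokogiri::XML(f) }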
# escape HTML inside <description>
doc.css("description").each do |node|
node.inner_html = CGI.escapeHTML(node.inner_html)
end
data = Hash.from_xml(doc.to_s) # =>
# {"blah"=>
# {
# "tag"=>[
# {
# "name"=>"My Name",
# "url"=>"www.url.com",
# "file"=>"myfile.zip",
# "description"=>"Today is a <a href=\"www.sunny.com\">sunny</a>"
# },
# {
# "name"=>"Someones Name",
# "url"=>"www.url2.com",
# "file"=>"myfile2.zip",
# "description"=>"Today is a <a href=\"www.rainy.com\">rainy</a>"
# }
# ]
# }
# }
Nokogiri is used here only for the HTML escaping. If you find another way to escape the markup, you don't really need it. For example:
xml = File.read(xml_file)
# escape the inner HTML (maybe not the best way, just an example)
# the block form is needed so that $1 refers to the current match
xml.gsub!(/<description>(.*?)<\/description>/m) { "<description>#{CGI.escapeHTML($1)}</description>" }
data = Hash.from_xml(xml)
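To see what the escaping step produces before Hash.from_xml runs, here is a minimal sketch that uses only Ruby's standard cgi library. The helper name escape_description_markup and the inline sample string are assumptions made for illustration; they are not part of the original answer.

```ruby
require "cgi"

# Hypothetical helper (name chosen for this sketch): escapes whatever markup
# sits between <description> and </description> so that an XML parser sees
# plain text there instead of a nested <a> element.
def escape_description_markup(xml)
  # Non-greedy match with /m so each <description> is handled separately,
  # even when its content spans several lines.
  xml.gsub(%r{<description>(.*?)</description>}m) do
    "<description>#{CGI.escapeHTML(Regexp.last_match(1))}</description>"
  end
end

# Assumed sample input, reduced from the XML file in the question.
sample = '<tag><description>Today is a <a href="www.sunny.com">sunny</a></description></tag>'

escaped = escape_description_markup(sample)
puts escaped
# prints:
# <tag><description>Today is a &lt;a href=&quot;www.sunny.com&quot;&gt;sunny&lt;/a&gt;</description></tag>
```

Because the anchor is now escaped text rather than a child element, the parser has no nested element to drop, so the description should come back as a single string like the one in the hash output shown earlier.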