Round-trip XML through Unmarshal and MarshalIndent
I want to quickly create a utility that formats any XML data using Go's xml.MarshalIndent().
package main

import (
    "encoding/xml"
    "fmt"
)

func main() {
    type node struct {
        XMLName  xml.Name
        Attrs    []xml.Attr `xml:",attr"`
        Text     string     `xml:",chardata"`
        Children []node     `xml:",any"`
    }
    x := node{}
    _ = xml.Unmarshal([]byte(doc), &x)
    buf, _ := xml.MarshalIndent(x, "", " ") // prefix, indent
    fmt.Println(string(buf))
}

const doc string = `<book lang="en">
    <title>The old man and the sea</title>
    <author>Hemingway</author>
</book>`
This produces:
<book>&#xA; &#xA; &#xA;
 <title>The old man and the sea</title>
 <author>Hemingway</author>
</book>
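Note the extraneous content after the opening <book> element.
- I lost my attribute. Why?
- I want to avoid collecting the spurious inter-element character data. How?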
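First, you are not using the attribute struct tag correctly: the Attrs field needs the tag `xml:",any,attr"` rather than `xml:",attr"`, so that part is an easy fix.
From https://godoc.org/encoding/xml#Unmarshal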
- If the XML element has an attribute not handled by the previous
rule and the struct has a field with an associated tag containing
",any,attr", Unmarshal records the attribute value in the first
such field.
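A minimal sketch of the corrected tag, assuming the same node struct as in the question. The helper name roundTrip is hypothetical and only serves to show that the lang attribute now survives the round trip.

```go
package main

import (
	"encoding/xml"
	"fmt"
)

// node mirrors the struct from the question; only the Attrs tag differs.
type node struct {
	XMLName  xml.Name
	Attrs    []xml.Attr `xml:",any,attr"` // collects every attribute not matched elsewhere
	Text     string     `xml:",chardata"`
	Children []node     `xml:",any"`
}

// roundTrip (hypothetical helper) decodes doc into a node and encodes it again.
func roundTrip(doc string) (string, error) {
	var n node
	if err := xml.Unmarshal([]byte(doc), &n); err != nil {
		return "", err
	}
	buf, err := xml.Marshal(n)
	if err != nil {
		return "", err
	}
	return string(buf), nil
}

func main() {
	out, err := roundTrip(`<book lang="en"><title>The old man and the sea</title></book>`)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(out)
}
```

With the original `xml:",attr"` tag, Unmarshal looks for a single attribute literally named Attrs, finds none, and the slice stays empty.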
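Second, because a field tagged `xml:",chardata"` is never passed through UnmarshalXML of the xml.Unmarshaler interface, you cannot simply create a new type for that field and implement the interface on it, as described in the same documentation. (Note that any type other than []byte or string will cause an error.)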
- If the XML element contains character data, that data is
accumulated in the first struct field that has tag ",chardata".
The struct field may have type []byte or string.
If there is no such field, the character data is discarded.
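To see why the stray content shows up, it may help to print what the chardata field actually collects. This is a small sketch; the helper name bookText and the tab-indented sample document are assumptions made for illustration.

```go
package main

import (
	"encoding/xml"
	"fmt"
)

// node keeps only the fields needed to observe the accumulated character data.
type node struct {
	XMLName  xml.Name
	Text     string `xml:",chardata"`
	Children []node `xml:",any"`
}

// bookText (hypothetical helper) returns the character data collected
// directly under the root element of doc.
func bookText(doc string) (string, error) {
	var n node
	if err := xml.Unmarshal([]byte(doc), &n); err != nil {
		return "", err
	}
	return n.Text, nil
}

func main() {
	text, err := bookText("<book>\n\t<title>T</title>\n\t<author>A</author>\n</book>")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	// The whitespace between child elements is concatenated into one string.
	fmt.Printf("%q\n", text)
}
```

The indentation between the child elements ends up in Text, and the encoder later escapes those newlines and tabs as character references when marshaling.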
So the simplest way to deal with the unwanted characters is to replace them after the fact.
Full code example here: https://play.golang.org/p/VSDskgfcLng
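// Requires the "strings" package.
// Replacer strips escaped and literal newlines and tabs from character data.
var Replacer = strings.NewReplacer("&#xA;", "", "&#x9;", "", "\n", "", "\t", "")

// recursiveReplace cleans the Text of n and of every descendant.
func recursiveReplace(n *Node) {
    n.Text = Replacer.Replace(n.Text)
    for i := range n.Children {
        recursiveReplace(&n.Children[i])
    }
}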
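Put together, the whole utility might look like the sketch below. It assumes an exported Node type with the corrected attribute tag; the format helper and the two-space indent are illustrative choices, not part of the linked playground code.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"strings"
)

// Node is a generic element: name, attributes, text, and child elements.
type Node struct {
	XMLName  xml.Name
	Attrs    []xml.Attr `xml:",any,attr"`
	Text     string     `xml:",chardata"`
	Children []Node     `xml:",any"`
}

// Replacer strips escaped and literal newlines and tabs from character data.
var Replacer = strings.NewReplacer("&#xA;", "", "&#x9;", "", "\n", "", "\t", "")

// recursiveReplace cleans the Text of n and of every descendant.
func recursiveReplace(n *Node) {
	n.Text = Replacer.Replace(n.Text)
	for i := range n.Children {
		recursiveReplace(&n.Children[i])
	}
}

// format (hypothetical helper) decodes doc, strips the stray whitespace,
// and re-encodes the tree with indentation.
func format(doc string) (string, error) {
	var root Node
	if err := xml.Unmarshal([]byte(doc), &root); err != nil {
		return "", err
	}
	recursiveReplace(&root)
	buf, err := xml.MarshalIndent(root, "", "  ")
	if err != nil {
		return "", err
	}
	return string(buf), nil
}

func main() {
	doc := "<book lang=\"en\">\n\t<title>The old man and the sea</title>\n\t<author>Hemingway</author>\n</book>"
	out, err := format(doc)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(out)
}
```

Note that this replacer removes only newlines and tabs, so a document indented with spaces would still leave spaces in Text, and newlines inside real text content are removed as well.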
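In theory you could implement the xml.Unmarshaler interface for Node, but then you would have to handle not only manual XML parsing but also the fact that Node is a recursive structure. Removing the unwanted characters after the fact is the simplest approach.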
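For comparison, here is a rough sketch of what the xml.Unmarshaler route could look like. It is not the approach recommended above. It assumes that trimming each chunk of character data with strings.TrimSpace is acceptable, which would alter mixed content, and the format helper is a hypothetical name.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"strings"
)

// Node is a generic element: name, attributes, text, and child elements.
type Node struct {
	XMLName  xml.Name
	Attrs    []xml.Attr `xml:",any,attr"`
	Text     string     `xml:",chardata"`
	Children []Node     `xml:",any"`
}

// UnmarshalXML walks the token stream by hand and recurses for each child.
// Assumption: whitespace around every character data chunk can be dropped.
func (n *Node) UnmarshalXML(d *xml.Decoder, start xml.StartElement) error {
	start = start.Copy() // keep our own copy of the attribute slice
	n.XMLName = start.Name
	n.Attrs = start.Attr
	for {
		tok, err := d.Token()
		if err != nil {
			return err
		}
		switch t := tok.(type) {
		case xml.StartElement:
			var child Node
			if err := child.UnmarshalXML(d, t); err != nil {
				return err
			}
			n.Children = append(n.Children, child)
		case xml.CharData:
			// string(t) copies the bytes, which are only valid until the next Token call.
			n.Text += strings.TrimSpace(string(t))
		case xml.EndElement:
			return nil // this element is fully consumed
		}
	}
}

// format (hypothetical helper) decodes doc via UnmarshalXML and re-indents it.
func format(doc string) (string, error) {
	var root Node
	if err := xml.Unmarshal([]byte(doc), &root); err != nil {
		return "", err
	}
	buf, err := xml.MarshalIndent(root, "", "  ")
	if err != nil {
		return "", err
	}
	return string(buf), nil
}

func main() {
	out, err := format("<book lang=\"en\">\n    <title>The old man and the sea</title>\n</book>")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(out)
}
```

Comments, processing instructions, and directives are silently dropped in this sketch, which is one more reason the post-processing approach is less work.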