Difficulties deserializing XML using Rust and Serde where document has optional subelements
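I'm new to Rust and still trying to get the hang of it. It's cool, but I've clearly missed something in the exercise I set for myself. For reference, I'm using rustc 1.39.0.
I wanted to try writing a simple program to read the XML produced by MSBuild's code analysis, which outputs some fairly simple XML. The problem, I think, is that there is one element (PATH) that is usually empty but can sometimes contain elements underneath it. The bigger problem is that I'm not great with Rust (and I don't usually deal with XML), and I'm not sure how to properly set up the structs needed for deserialization. I'm using Serde and quick_xml. When I had PATH set to a String and used XML that had no SFA elements under PATH, my tests succeeded. But once I figured out how that tag is supposed to be used and updated my structs accordingly, I keep getting this error: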
thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: Custom("missing field `FILEPATH`")', src\libcore\result.rs:1165:5
...even when all of the defects in the test XML file have an SFA element under PATH.
The XML files I'm working with all look like this:
<?xml version="1.0" encoding="utf-8"?>
<DEFECTS>
  <DEFECT>
    <SFA>
      <FILEPATH>c:\projects\source\repos\defecttest\defecttest</FILEPATH>
      <FILENAME>source.cpp</FILENAME>
      <LINE>8</LINE>
      <COLUMN>5</COLUMN>
    </SFA>
    <DEFECTCODE>26496</DEFECTCODE>
    <DESCRIPTION>The variable 'y' is assigned only once, mark it as const (con.4).</DESCRIPTION>
    <FUNCTION>main</FUNCTION>
    <DECORATED>main</DECORATED>
    <FUNCLINE>6</FUNCLINE>
    <PATH></PATH>
  </DEFECT>
  <DEFECT>
    <SFA>
      <FILEPATH>c:\projects\source\repos\defecttest\defecttest</FILEPATH>
      <FILENAME>source.cpp</FILENAME>
      <LINE>9</LINE>
      <COLUMN>5</COLUMN>
    </SFA>
    <DEFECTCODE>26496</DEFECTCODE>
    <DESCRIPTION>The variable 'z' is assigned only once, mark it as const (con.4).</DESCRIPTION>
    <FUNCTION>main</FUNCTION>
    <DECORATED>main</DECORATED>
    <FUNCLINE>6</FUNCLINE>
    <PATH></PATH>
  </DEFECT>
</DEFECTS>
In many cases PATH is empty, but in some cases it contains its own SFA element:
<DEFECT>
  <SFA>
    <FILEPATH>c:\projects\source\repos\defecttest\defecttest</FILEPATH>
    <FILENAME>source.cpp</FILENAME>
    <LINE>9</LINE>
    <COLUMN>5</COLUMN>
  </SFA>
  <DEFECTCODE>26496</DEFECTCODE>
  <DESCRIPTION>The variable 'z' is assigned only once, mark it as const (con.4).</DESCRIPTION>
  <FUNCTION>main</FUNCTION>
  <DECORATED>main</DECORATED>
  <FUNCLINE>6</FUNCLINE>
  <PATH>
    <SFA>
      <FILEPATH>c:\projects\source\repos\defecttest\defecttest</FILEPATH>
      <FILENAME>source.cpp</FILENAME>
      <LINE>12</LINE>
      <COLUMN>3</COLUMN>
    </SFA>
  </PATH>
</DEFECT>
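Before I realized this, all of the fields in the DEFECT struct were set to String. That worked fine as long as none of the defects in the XML file had child elements under PATH. When I changed PATH to use SFA instead of String, I started getting the missing field error mentioned above. Here is a sample of the code I'm testing with: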
main.rs
extern crate quick_xml;
extern crate serde;

use std::default::Default;
use std::env;
use std::vec::Vec;

use quick_xml::de::from_str;
use serde::{Serialize, Deserialize};

/*
 * Structs for the defect XML
 */
#[derive(Serialize, Deserialize, Debug)]
#[allow(non_snake_case)]
pub struct DEFECTS {
    #[serde(rename = "DEFECT", default)]
    pub defects: Vec<DEFECT>,
}

#[derive(Default, Serialize, Deserialize, Debug)]
#[allow(non_snake_case)]
pub struct DEFECT {
    #[serde(default)]
    pub SFA: SFA,
    pub DEFECTCODE: String,
    pub DESCRIPTION: String,
    pub FUNCTION: String,
    pub DECORATED: String,
    pub FUNCLINE: String,
    #[serde(default)]
    pub PATH: Vec<SFA>,
}

#[derive(Default, Serialize, Deserialize, Debug)]
#[allow(non_snake_case)]
pub struct SFA {
    pub FILEPATH: String,
    pub FILENAME: String,
    pub LINE: String,
    pub COLUMN: String,
}

/*
 * Main app code
 */
fn main() {
    // Expect the path to the XML file to be passed as the first and only argument
    let args: Vec<String> = env::args().collect();
    if args.len() != 2 {
        panic!("Invalid argument count. Specify a single file to process.");
    }
    let processing_file = &args[1];
    println!("Will attempt to process file: '{}'", &processing_file);

    // Try to load the contents of the file
    let file_content: String = match std::fs::read_to_string(&processing_file) {
        Ok(file_content) => file_content,
        Err(e) => {
            panic!("Failed to read file: '{}' -- {}", &processing_file, e);
        }
    };

    // Now, try to deserialize the XML we have in file_content
    let defect_list: DEFECTS = from_str(&file_content).unwrap();

    // Assuming the unwrap above didn't blow up, we should get a count here
    println!("Retrieved {} defects from file '{}'", defect_list.defects.len(), &processing_file);
}
Cargo.toml
[package]
name = "rust_xml_test"
version = "0.1.0"
authors = ["fny82"]
edition = "2018"
[dependencies]
quick-xml = { version = "0.17", features = [ "serialize" ] }
serde = { version = "1.0", features = [ "derive" ] }
Sample output
C:\Development\RustXmlTest>cargo run -- "c:\development\rustxmltest\test3.xml"
Compiling rust_xml_test v0.1.0 (C:\Development\RustXmlTest)
Finished dev [unoptimized + debuginfo] target(s) in 1.56s
Running `target\debug\rust_xml_test.exe c:\development\rustxmltest\test3.xml`
Will attempt to process file: 'c:\development\rustxmltest\test3.xml'
thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: Custom("missing field `FILEPATH`")', src\libcore\result.rs:1165:5
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace.
error: process didn't exit successfully: `target\debug\rust_xml_test.exe c:\development\rustxmltest\test3.xml` (exit code: 101)
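I'm sure I'm doing something silly, partly because I've probably gotten ahead of myself: the scope of this challenge is beyond my current understanding of Rust. Can anyone help me figure out what I'm missing and doing wrong?
Somewhat related: I've since learned that I can use the rename attribute to make my structs conform to Rust's naming conventions, but I don't want to start messing with that until I have the underlying functionality working.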
----EDIT----
For reference, here is the code that now works, with @edwardw's correction applied:
extern crate quick_xml;
extern crate serde;

use std::default::Default;
use std::env;
use std::vec::Vec;

use quick_xml::de::from_str;
use serde::{Serialize, Deserialize};

/*
 * Structs for the defect XML
 */
#[derive(Serialize, Deserialize, Debug)]
#[allow(non_snake_case)]
pub struct DEFECTS {
    #[serde(rename = "DEFECT", default)]
    pub defects: Vec<DEFECT>,
}

#[derive(Default, Serialize, Deserialize, Debug)]
#[allow(non_snake_case)]
pub struct DEFECT {
    #[serde(default)]
    pub SFA: SFA,
    pub DEFECTCODE: String,
    pub DESCRIPTION: String,
    pub FUNCTION: String,
    pub DECORATED: String,
    pub FUNCLINE: String,
    pub PATH: PATH,
}

#[derive(Default, Serialize, Deserialize, Debug)]
#[allow(non_snake_case)]
pub struct SFA {
    pub FILEPATH: String,
    pub FILENAME: String,
    pub LINE: String,
    pub COLUMN: String,
}

#[derive(Default, Serialize, Deserialize, Debug)]
#[allow(non_snake_case)]
pub struct PATH {
    pub SFA: Option<SFA>,
}

/*
 * Main app code
 */
fn main() {
    // Expect the path to the XML file to be passed as the first and only argument
    let args: Vec<String> = env::args().collect();
    if args.len() != 2 {
        panic!("Invalid argument count. Specify a single file to process.");
    }
    let processing_file = &args[1];
    println!("Will attempt to process file: '{}'", &processing_file);

    // Try to load the contents of the file
    let file_content: String = match std::fs::read_to_string(&processing_file) {
        Ok(file_content) => file_content,
        Err(e) => {
            panic!("Failed to read file: '{}' -- {}", &processing_file, e);
        }
    };

    // Now, try to deserialize the XML we have in file_content
    let defect_list: DEFECTS = from_str(&file_content).unwrap();

    // Assuming the unwrap above didn't blow up, we should get a count here
    println!("Retrieved {} defects from file '{}'", defect_list.defects.len(), &processing_file);
}
Example:
C:\Development\RustXmlTest>cargo run -- "c:\development\rustxmltest\test1.xml"
Compiling rust_xml_test v0.1.0 (C:\Development\RustXmlTest)
Finished dev [unoptimized + debuginfo] target(s) in 1.66s
Running `target\debug\rust_xml_test.exe c:\development\rustxmltest\test1.xml`
Will attempt to process file: 'c:\development\rustxmltest\test1.xml'
Retrieved 2 defects from file 'c:\development\rustxmltest\test1.xml'
Where test1.xml contains:
<?xml version="1.0" encoding="utf-8"?>
<DEFECTS>
  <DEFECT>
    <SFA>
      <FILEPATH>c:\projects\source\repos\defecttest\defecttest</FILEPATH>
      <FILENAME>source.cpp</FILENAME>
      <LINE>8</LINE>
      <COLUMN>5</COLUMN>
    </SFA>
    <DEFECTCODE>26496</DEFECTCODE>
    <DESCRIPTION>The variable 'y' is assigned only once, mark it as const (con.4).</DESCRIPTION>
    <FUNCTION>main</FUNCTION>
    <DECORATED>main</DECORATED>
    <FUNCLINE>6</FUNCLINE>
    <PATH></PATH>
  </DEFECT>
  <DEFECT>
    <SFA>
      <FILEPATH>c:\projects\source\repos\defecttest\defecttest</FILEPATH>
      <FILENAME>source.cpp</FILENAME>
      <LINE>9</LINE>
      <COLUMN>5</COLUMN>
    </SFA>
    <DEFECTCODE>26496</DEFECTCODE>
    <DESCRIPTION>The variable 'z' is assigned only once, mark it as const (con.4).</DESCRIPTION>
    <FUNCTION>main</FUNCTION>
    <DECORATED>main</DECORATED>
    <FUNCLINE>6</FUNCLINE>
    <PATH>
      <SFA>
        <FILEPATH>c:\projects\source\repos\defecttest\defecttest</FILEPATH>
        <FILENAME>source.cpp</FILENAME>
        <LINE>12</LINE>
        <COLUMN>3</COLUMN>
      </SFA>
    </PATH>
  </DEFECT>
</DEFECTS>
PATH itself should be modeled as a struct with one optional field. This works:
#[derive(Default, Serialize, Deserialize, Debug)]
#[allow(non_snake_case)]
pub struct DEFECT {
    #[serde(default)]
    pub SFA: SFA,
    pub DEFECTCODE: String,
    pub DESCRIPTION: String,
    pub FUNCTION: String,
    pub DECORATED: String,
    pub FUNCLINE: String,
    pub PATH: PATH,
}

#[derive(Default, Serialize, Deserialize, Debug)]
#[allow(non_snake_case)]
pub struct PATH {
    SFA: Option<SFA>,
}
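Once PATH is modeled this way, any code that consumes the result has to handle both the empty and the populated case. Below is a minimal sketch of that, using only the standard library: the Serde derives are left out so it runs without dependencies, and the values are built by hand rather than deserialized. `describe_path` is a hypothetical helper and its output format is an assumption chosen for illustration.

```rust
// Same shapes as the structs above, minus the Serde derives.
#[derive(Default, Debug)]
#[allow(non_snake_case)]
pub struct SFA {
    pub FILEPATH: String,
    pub FILENAME: String,
    pub LINE: String,
    pub COLUMN: String,
}

#[derive(Default, Debug)]
#[allow(non_snake_case)]
pub struct PATH {
    pub SFA: Option<SFA>,
}

// Hypothetical helper: summarize a PATH whether or not it carries an SFA.
pub fn describe_path(path: &PATH) -> String {
    match &path.SFA {
        Some(sfa) => format!("{}:{}:{}", sfa.FILENAME, sfa.LINE, sfa.COLUMN),
        None => String::from("(no path)"),
    }
}

fn main() {
    // Corresponds to <PATH></PATH>: the optional field is None.
    let empty = PATH::default();

    // Corresponds to a PATH that contains one SFA element.
    let filled = PATH {
        SFA: Some(SFA {
            FILEPATH: String::from(r"c:\projects\source\repos\defecttest\defecttest"),
            FILENAME: String::from("source.cpp"),
            LINE: String::from("12"),
            COLUMN: String::from("3"),
        }),
    };

    println!("{}", describe_path(&empty));
    println!("{}", describe_path(&filled));
}
```

Matching on `&path.SFA` borrows the inner value, so the struct can still be used afterwards.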