Implementing a generic length-delimited hex deserializer in Serde

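I'd like Serde to take on as much responsibility as possible for ensuring that user-submitted input is well-formed. I have a number of fields that each require a hexadecimal value of a specific, different length in the input.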

How can I use Serde to enforce the allowed character set and the length of each individual field without duplicating code?

I've tried a few different approaches so far. All of them involve implementing custom deserializers. If there is a simpler solution, please let me know.

A macro

A macro, HexString!($name:ident, $length:expr), that generates two structs: Name, which holds the resulting string, and NameVisitor, which implements the Serde deserialization Visitor.

extern crate serde;
extern crate serde_json;

#[macro_use]
extern crate serde_derive;

#[macro_use]
extern crate error_chain;

error_chain!{}

macro_rules! HexString {
    ($name:ident, $length:expr) => {
        #[derive(Debug, Serialize)]
        pub struct $name(String);

        impl<'de> serde::de::Deserialize<'de> for $name {
            fn deserialize<D>(deserializer: D) -> std::result::Result<Self, D::Error>
            where
                D: serde::de::Deserializer<'de>,
            {
                deserializer.deserialize_str($nameVisitor)
            }
        }

        struct $nameVisitor;

        impl<'de> serde::de::Visitor<'de> for $nameVisitor {
            type Value = $name;

            fn expecting(&self, formatter: &mut std::fmt::Formatter) -> std::fmt::Result {
                write!(
                    formatter,
                    "a string of exactly {} hexadecimal characters",
                    $length
                )
            }

            fn visit_str<E>(self, s: &str) -> std::result::Result<Self::Value, E>
            where
                E: serde::de::Error,
            {
                use serde::de;
                if s.len() != $length {
                    return Err(de::Error::invalid_value(
                        de::Unexpected::Other(&format!(
                            "String is not {} characters long",
                            $length
                        )),
                        &self,
                    ));
                }
                for c in s.chars() {
                    if !c.is_ascii_hexdigit() {
                        return Err(de::Error::invalid_value(de::Unexpected::Char(c), &self));
                    }
                }

                let mut s = s.to_owned();
                s.make_ascii_uppercase();
                Ok($name(s))
            }
        }
    };
}

HexString!(Sha256, 32);

fn main() {
    let h: Sha256 = serde_json::from_str("a412").unwrap(); // should fail
}


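This fails because I can't concatenate $name and Visitor in the macro body: $nameVisitor is not treated as $name followed by Visitor.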
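One way around the concatenation problem is to pass the helper type's name as a separate macro argument, since macro_rules! can accept a second identifier even though it cannot build one. The sketch below is dependency-free, so it is not the Serde code itself: the macro name `hex_string!`, the `String` error type, and the use of `std::str::FromStr` in place of a Serde `Visitor` are all assumptions made for illustration. The same two-identifier shape should carry over to a real `Visitor` impl.

```rust
use std::str::FromStr;

// Hypothetical macro: the helper type's name is passed explicitly because
// macro_rules! cannot glue `$name` and `Visitor` into a new identifier.
macro_rules! hex_string {
    ($name:ident, $visitor:ident, $length:expr) => {
        #[derive(Debug, PartialEq)]
        pub struct $name(String);

        // Stand-in for the Serde visitor: it owns the validation logic.
        struct $visitor;

        impl $visitor {
            fn visit_str(&self, s: &str) -> Result<$name, String> {
                if s.len() != $length {
                    return Err(format!(
                        "expected {} hexadecimal characters, found {}",
                        $length,
                        s.len()
                    ));
                }
                if let Some(c) = s.chars().find(|c| !c.is_ascii_hexdigit()) {
                    return Err(format!("invalid hexadecimal character {:?}", c));
                }
                // Normalise to upper case, as the original attempt does.
                Ok($name(s.to_ascii_uppercase()))
            }
        }

        impl FromStr for $name {
            type Err = String;

            fn from_str(s: &str) -> Result<Self, Self::Err> {
                $visitor.visit_str(s)
            }
        }
    };
}

// Both names are spelled out at the call site.
hex_string!(Sha16, Sha16Visitor, 4);
```

With this in place, `"a412".parse::<Sha16>()` succeeds and `"a41".parse::<Sha16>()` returns an error. Another option that avoids the second name entirely is to declare the helper struct inside the function body that uses it, where its name does not need to be unique.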

A trait

A HexString trait together with a HexStringVisitor trait, perhaps eventually combined with a macro to simplify usage:

extern crate serde;
extern crate serde_json;

#[macro_use]
extern crate serde_derive;

#[macro_use]
extern crate error_chain;

error_chain!{}

trait HexString {
    type T: HexString;
    fn init(s: String) -> Self::T;
    fn len() -> usize;
    fn visitor() -> HexStringVisitor<T=Self::T>;
}

impl<'de, T: HexString> serde::de::Deserialize<'de> for T {
    fn deserialize<D>(deserializer: D) -> std::result::Result<Self, D::Error>
    where D: serde::de::Deserializer<'de>
    {
        deserializer.deserialize_str(T::visitor())
    }
}

trait HexStringVisitor {
    type T: HexString;
}

impl<'de, T: HexStringVisitor> serde::de::Visitor<'de> for T {
    type Value = T::T;

    fn expecting(&self, formatter: &mut std::fmt::Formatter) -> std::fmt::Result {
        write!(formatter, "a string of exactly {} hexadecimal characters", Self::Value::len())
    }

    fn visit_str<E>(self, s: &str) -> std::result::Result<Self::Value, E>
        where E: serde::de::Error
    {
        use serde::de;
        if s.len() != Self::Value::len() {
            return Err(de::Error::invalid_value(
                de::Unexpected::Other(&format!(
                    "String is not {} characters long",
                    Self::Value::len()
                )),
                &self,
            ));
        }
        for c in s.chars() {
            if !c.is_ascii_hexdigit() {
                return Err(de::Error::invalid_value(de::Unexpected::Char(c), &self));
            }
        }

        let mut s = s.to_owned();
        s.make_ascii_uppercase();
        Ok(T::init(s))
    }
}

struct Sha256(String);
struct Sha256Visitor;

impl HexString for Sha256 {
    type T=Sha256;
    fn init(s: String) -> Sha256 {
        Sha256(s)
    }
    fn len() -> usize {
        32
    }
    fn visitor() -> Sha256Visitor {
        Sha256Visitor()
    }
}

impl HexStringVisitor for Sha256Visitor {
}

fn main() {
    let h: Sha256 = serde_json::from_str("a412").unwrap(); // should fail
}


This fails because I am not allowed to implement the Deserialize trait for every implementer of HexString.
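As far as I can tell, the blanket impl is rejected by Rust's orphan rules: it implements a foreign trait for a bare type parameter T. A common way to keep the trait-based design is to move the parameter inside a local wrapper type and implement the foreign trait for that wrapper instead. The sketch below shows the shape with `std::str::FromStr` standing in for `Deserialize` so that it compiles without dependencies; `HexLen`, `Checked`, `ShortHashLen`, and `ShortHash` are hypothetical names.

```rust
use std::marker::PhantomData;
use std::str::FromStr;

// Hypothetical trait: each marker type states how many hex characters it expects.
trait HexLen {
    const LEN: usize;
}

// Local wrapper type. Implementing a foreign trait for `Checked<T>` is allowed,
// whereas a blanket `impl<T: HexLen> FromStr for T` is not.
#[derive(Debug)]
struct Checked<T> {
    value: String,
    _marker: PhantomData<T>,
}

impl<T: HexLen> FromStr for Checked<T> {
    type Err = String;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        if s.len() != T::LEN {
            return Err(format!(
                "expected {} hexadecimal characters, found {}",
                T::LEN,
                s.len()
            ));
        }
        if let Some(c) = s.chars().find(|c| !c.is_ascii_hexdigit()) {
            return Err(format!("invalid hexadecimal character {:?}", c));
        }
        Ok(Checked {
            value: s.to_ascii_uppercase(),
            _marker: PhantomData,
        })
    }
}

// Marker type for a 4-character field; it carries no data of its own.
#[derive(Debug)]
struct ShortHashLen;

impl HexLen for ShortHashLen {
    const LEN: usize = 4;
}

type ShortHash = Checked<ShortHashLen>;
```

Each new field length then costs one marker type and one type alias, while the validation logic lives in a single impl.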

This would be more obvious with const generics.

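Since those don't exist yet at the time of writing, there are two main alternatives. One is to …, the other is to use arrays. In this case, using arrays makes sense because your data is a fixed-length run of bytes anyway.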

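I would then implement Deserialize for a newtype that wraps any type that can be default-constructed and then accessed as a mutable collection of bytes: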

extern crate hex;
extern crate serde;
extern crate serde_json;

use serde::de::Error;

#[derive(Debug)]
struct Hex<B>(B);

impl<'de, B> serde::de::Deserialize<'de> for Hex<B>
where
    B: AsMut<[u8]> + Default,
{
    fn deserialize<D>(deserializer: D) -> std::result::Result<Self, D::Error>
    where
        D: serde::de::Deserializer<'de>,
    {
        let s = String::deserialize(deserializer)?;
        let mut b = Hex(B::default());
        match hex::decode(s) {
            Ok(v) => {
                let expected_len = b.0.as_mut().len();
                if v.len() != expected_len {
                    Err(D::Error::custom(format_args!(
                        "Expected input of {} bytes, found {}",
                        expected_len,
                        v.len()
                    )))
                } else {
                    b.0.as_mut().copy_from_slice(&v);
                    Ok(b)
                }
            }
            Err(e) => Err(D::Error::custom(format_args!(
                "Unable to deserialize: {}",
                e
            ))),
        }
    }
}

type Sha16 = Hex<[u8; 2]>;
type Sha256 = Hex<[u8; 32]>;

const TWO_BYTES: &str = r#""a412""#;
const THIRTY_TWO_BYTES: &str =
    r#""2CF24DBA5FB0A30E26E83B2AC5B9E29E1B161E5C1FA7425E73043362938B9824""#;

fn main() {
    let h: Result<Sha256, _> = serde_json::from_str(TWO_BYTES);
    println!("{:?}", h);
    let h: Result<Sha16, _> = serde_json::from_str(TWO_BYTES);
    println!("{:?}", h);

    let h: Result<Sha256, _> = serde_json::from_str(THIRTY_TWO_BYTES);
    println!("{:?}", h);
    let h: Result<Sha16, _> = serde_json::from_str(THIRTY_TWO_BYTES);
    println!("{:?}", h);
}

This has two potential sources of inefficiency:

  1. We create a default-initialized array and then overwrite its bytes
  2. We allocate a Vec and then copy the bytes out of it

There are ways to work around these, but for the purposes of user input, this is probably reasonable enough.
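As one illustration of working around the second point, the bytes can be decoded straight into the destination array so that no intermediate Vec is needed. This sketch assumes a toolchain with stable const generics (Rust 1.51 or later, which is newer than this answer) and hand-rolls the hex decoding to stay dependency-free. `HexBytes` and `hex_val` are hypothetical names, and `std::str::FromStr` stands in for the `Deserialize` impl above.

```rust
use std::str::FromStr;

// Hypothetical fixed-length wrapper; N is the number of decoded bytes.
#[derive(Debug, PartialEq)]
struct HexBytes<const N: usize>([u8; N]);

// Convert one ASCII hex digit to its value, or None for anything else.
fn hex_val(b: u8) -> Option<u8> {
    match b {
        b'0'..=b'9' => Some(b - b'0'),
        b'a'..=b'f' => Some(b - b'a' + 10),
        b'A'..=b'F' => Some(b - b'A' + 10),
        _ => None,
    }
}

impl<const N: usize> FromStr for HexBytes<N> {
    type Err = String;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        let input = s.as_bytes();
        if input.len() != N * 2 {
            return Err(format!(
                "expected {} hexadecimal characters, found {}",
                N * 2,
                input.len()
            ));
        }
        // The array is still zero-initialised first (point 1 remains),
        // but each byte is written in place with no temporary Vec.
        let mut out = [0u8; N];
        for (dst, pair) in out.iter_mut().zip(input.chunks_exact(2)) {
            let hi = hex_val(pair[0])
                .ok_or_else(|| format!("invalid hex digit {:?}", pair[0] as char))?;
            let lo = hex_val(pair[1])
                .ok_or_else(|| format!("invalid hex digit {:?}", pair[1] as char))?;
            *dst = (hi << 4) | lo;
        }
        Ok(HexBytes(out))
    }
}

type Sha16 = HexBytes<2>;
type Sha256 = HexBytes<32>;
```

Here `"a412".parse::<Sha16>()` yields the two bytes 0xa4 and 0x12, while parsing the same input as `Sha256` reports a length error. If you would rather keep the hex crate, its `decode_to_slice` function looks like it offers the same in-place decoding.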

See also:

  • How to transform fields during deserialization using Serde?