Protocol buffers: what is in the serialized data?

Question

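I'm new to protocol buffers and really want to learn more, so sorry for the newbie question.

What is in the serialized data: only values, or keys and values? I think it is only values, and anyone who wants to deserialize it must have the schema.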

Answer 1

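This depends to some extent on whether you're using the binary form (which is usually the default when working with protobuf) or the JSON form (yes, protobuf includes a JSON option, at least in some libraries, though not all).

In the binary form, the data consists of field numbers and values, not field names. For example, if we take the following: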

optional string name = 1; // remove the "optional" if using proto3 syntax

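and assign it the value "Nika" (and serialize it), then the binary data will include the 1 (in a slightly tweaked form: it is packed together with the wire type) and the UTF-8 encoded form of Nika, but it will not contain "name".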
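To make this concrete, here is a minimal hand-rolled sketch of how that one field ends up on the wire. It is not the protobuf library; the helper names encode_varint and encode_string_field are made up for illustration, and it only assumes the documented wire format (tag = field number shifted left by 3, OR-ed with the wire type; wire type 2 means length-delimited).

```python
def encode_varint(value):
    """Encode a non-negative integer as a base-128 varint."""
    out = bytearray()
    while True:
        low7 = value & 0x7F
        value >>= 7
        if value:
            out.append(low7 | 0x80)  # high bit set: more bytes follow
        else:
            out.append(low7)
            return bytes(out)


def encode_string_field(field_number, text):
    """Encode one string field: tag, then byte length, then UTF-8 payload."""
    payload = text.encode("utf-8")
    tag = (field_number << 3) | 2  # wire type 2 = length-delimited
    return encode_varint(tag) + encode_varint(len(payload)) + payload


encoded = encode_string_field(1, "Nika")
print(encoded.hex())  # 0a044e696b61
```

The first byte 0a is the field number 1 combined with wire type 2, 04 is the length, and the remaining four bytes are "Nika". The string "name" appears nowhere.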

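You don't strictly need the schema to decode it, but having one makes things much easier, because many parts of the specification are otherwise ambiguous: the same "wire type" (i.e. encoding format) is used for multiple data types, or has multiple meanings for the same data type. For example, without a schema (or a good guess) you can't tell whether an integer is signed, unsigned, or "zig-zag encoded", and the actual value you get can differ greatly depending on which it is.

To see what you can make of raw protobuf data without a schema, try: https://protogen.marcgravell.com/decode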
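As a rough illustration of that ambiguity, the sketch below takes the same raw varint value and reads it three ways, as a uint64, an int64 (two's complement) and a sint64 (zig-zag) field would. The function names are hypothetical; only the three interpretations themselves come from the wire format.

```python
def as_uint64(raw):
    """uint64: the varint value is used as-is."""
    return raw


def as_int64(raw):
    """int64: the varint holds a 64-bit two's complement value."""
    return raw - (1 << 64) if raw >= (1 << 63) else raw


def as_sint64(raw):
    """sint64: zig-zag encoding, where 0, 1, 2, 3 mean 0, -1, 1, -2."""
    return (raw >> 1) ^ -(raw & 1)


# The same wire value gives different answers depending on the declared type.
for raw in (928375, (1 << 64) - 1):
    print(raw, as_uint64(raw), as_int64(raw), as_sint64(raw))
```

Without the schema, nothing in the bytes says which of the three readings is the intended one.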

Answer 2

It is both keys and values:

As you know, a protocol buffer message is a series of key-value pairs. The binary version of a message just uses the field's number as the key – the name and declared type for each field can only be determined on the decoding end by referencing the message type's definition (i.e. the .proto file). https://developers.google.com/protocol-buffers/docs/encoding

For example, say you have a proto file:

$  cat my.proto 
message header {
  required uint32 u1 = 1;
  required uint32 u2 = 2;
  optional uint32 u3 = 3 [default=0];
  optional bool   b1 = 4 [default=true];
  optional string s1 = 5;
  optional uint32 u4 = 6;
  optional uint32 u5 = 7;
  optional string s2 = 9;
  optional string s3 = 10;
  optional uint32 u6 = 8;
}

Dumping the encoded data from memory:

(gdb) x/10xb 0x7fd70db7e964
0x7fd70db7e964: 0x08    0xff    0xff    0x01    0x10    0x08    0x40    0xf7
0x7fd70db7e96c: 0xd4    0x38

Decoding it:

$ echo 08ffff01100840f7d438 | xxd -r -p | protoc --decode_raw
1: 32767
2: 8
8: 928375

1, 2 and 8 are the keys.

From the proto file above:
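For readers who want to see where those keys come from, here is a small sketch of a schema-less decoder, roughly in the spirit of protoc --decode_raw but not its actual implementation. The names read_varint and decode_raw are made up for this example, and it only handles wire types 0 (varint) and 2 (length-delimited).

```python
def read_varint(data, pos):
    """Read one base-128 varint starting at pos; return (value, next_pos)."""
    result = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:  # high bit clear: this was the last byte
            return result, pos
        shift += 7


def decode_raw(data):
    """Return a list of (field_number, value) pairs, without any schema."""
    fields = []
    pos = 0
    while pos < len(data):
        key, pos = read_varint(data, pos)
        field_number, wire_type = key >> 3, key & 0x07
        if wire_type == 0:  # varint
            value, pos = read_varint(data, pos)
        elif wire_type == 2:  # length-delimited (strings, bytes, sub-messages)
            length, pos = read_varint(data, pos)
            value = data[pos:pos + length]
            pos += length
        else:
            raise ValueError(f"wire type {wire_type} not handled in this sketch")
        fields.append((field_number, value))
    return fields


fields = decode_raw(bytes.fromhex("08ffff01100840f7d438"))
print(fields)  # [(1, 32767), (2, 8), (8, 928375)]
```

Each key byte (08, 10, 40) carries the field number in its upper bits and the wire type in its lowest three bits.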

1 => u1, 
2 => u2,
8 => u6

So it becomes:

u1: 32767
u2: 8
u6: 928375
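That last step, attaching names to numbers, is the part that needs the schema. A minimal sketch, assuming a hand-written lookup table (FIELD_NAMES is hypothetical and simply copied from the proto file above; a real program would use the generated classes instead):

```python
# Field number -> field name, transcribed by hand from my.proto above.
FIELD_NAMES = {
    1: "u1", 2: "u2", 3: "u3", 4: "b1", 5: "s1",
    6: "u4", 7: "u5", 8: "u6", 9: "s2", 10: "s3",
}

# (field_number, value) pairs as reported by the raw decode above.
raw_fields = [(1, 32767), (2, 8), (8, 928375)]

# Names exist only on the decoding side; unknown numbers keep a placeholder.
named = {FIELD_NAMES.get(number, f"unknown_{number}"): value
         for number, value in raw_fields}
print(named)  # {'u1': 32767, 'u2': 8, 'u6': 928375}
```

Fields that were never set (u3, b1, s1 and so on) do not appear in the encoded bytes at all.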

I used the data from the question.
