How to declare an entity of object type in an Avro schema

I have a JSON object, part of which looks like this:

{
  "bounding_box": {
    "coordinates": [
      [
        [
          -74.026675,
          40.683935
        ],
        [
          -74.026675,
          40.877483
        ]
      ]
    ],
    "type": "Polygon"
  }
}

Here, the coordinates are sent as an array of objects. For this JSON object I want to create an Avro schema (.avsc file). So far it looks like this:

{
    "name": "bounding_box",
    "type": {
        "namespace": "my.tweet.stream",
        "type": "record",
        "name": "BoundingBox",
        "fields": [{
                "name": "coordinates",
                "type": {
                    "type": "array",
                    "items": "object"
                }
            },
            {
                "name": "type",
                "type": ["string", "null"]
            }
        ]
    }
}

However, with the current schema I get the following error:

Execution generate-id of goal org.apache.avro:avro-maven-plugin:1.8.1:schema failed: Undefined name: "object"

Can someone help: how do I specify a java.lang.Object type in an Avro schema?

Thanks.

Avro is cross-language, so there is no mapping to java.lang.Object. The closest equivalent is the record type, and records can be nested.

You can nest arrays (I only did two levels here; the sample JSON nests three, and you should be able to add more).

In IDL (payload.avdl):

@namespace("com.example.mycode.avro")
protocol ExampleProtocol {
  record BoundingBox {
    array<array<double>> coordinates;
  }

  record Payload {
    BoundingBox bounding_box;
    union{null, string} type = null;
  }
}

Or in AVSC, generated from the IDL file with avro-tools:

java -jar ~/Applications/avro-tools-1.8.1.jar idl2schemata payload.avdl

This produces:
{
  "type" : "record",
  "name" : "Payload",
  "namespace" : "com.example.mycode.avro",
  "fields" : [ {
    "name" : "bounding_box",
    "type" : {
      "type" : "record",
      "name" : "BoundingBox",
      "fields" : [ {
        "name" : "coordinates",
        "type" : {
          "type" : "array",
          "items" : {
            "type" : "array",
            "items" : "double"
          }
        }
      } ]
    }
  }, {
    "name" : "type",
    "type" : [ "null", "string" ],
    "default" : null
  } ]
}
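
For comparison, here is a sketch of a schema that follows the sample JSON from the question more closely. It rests on two assumptions that are not part of the generated output above: the coordinates are nested three arrays deep (as in the sample), and the "type" field sits inside bounding_box rather than next to it. The record names and namespace are simply carried over from the example above.

```json
{
  "type": "record",
  "name": "Payload",
  "namespace": "com.example.mycode.avro",
  "fields": [
    {
      "name": "bounding_box",
      "type": {
        "type": "record",
        "name": "BoundingBox",
        "fields": [
          {
            "name": "coordinates",
            "type": {
              "type": "array",
              "items": {
                "type": "array",
                "items": {
                  "type": "array",
                  "items": "double"
                }
              }
            }
          },
          {
            "name": "type",
            "type": ["null", "string"],
            "default": null
          }
        ]
      }
    }
  ]
}
```

Each extra level of nesting in the data needs one more "array" wrapper in the schema, with the innermost "items" naming the primitive type (here "double"). In a union with a null default, "null" has to come first, which is why the order differs from the ["string", "null"] in the question.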