NullPointerException when attempting to serialize Avro GenericRecord containing array
I'm trying to publish Avro (to Kafka) and I get a NullPointerException when writing the Avro object with a BinaryEncoder.
Here is the abbreviated stack trace:
java.lang.NullPointerException: null of array of com.mycode.DeeplyNestedObject of array of com.mycode.NestedObject of union of com.mycode.ParentObject
at org.apache.avro.generic.GenericDatumWriter.npe(GenericDatumWriter.java:132) ~[avro-1.8.1.jar:1.8.1]
at org.apache.avro.generic.GenericDatumWriter.writeWithoutConversion(GenericDatumWriter.java:126) ~[avro-1.8.1.jar:1.8.1]
at org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:73) ~[avro-1.8.1.jar:1.8.1]
at org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:60) ~[avro-1.8.1.jar:1.8.1]
at com.mycode.KafkaAvroPublisher.send(KafkaAvroPublisher.java:61) ~[classes/:na]
....
at org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:73) ~[avro-1.8.1.jar:1.8.1]
at org.apache.avro.generic.GenericDatumWriter.writeWithoutConversion(GenericDatumWriter.java:112) ~[avro-1.8.1.jar:1.8.1]
at org.apache.avro.specific.SpecificDatumWriter.writeField(SpecificDatumWriter.java:87) ~[avro-1.8.1.jar:1.8.1]
at org.apache.avro.generic.GenericDatumWriter.writeRecord(GenericDatumWriter.java:143) ~[avro-1.8.1.jar:1.8.1]
at org.apache.avro.generic.GenericDatumWriter.writeWithoutConversion(GenericDatumWriter.java:105) ~[avro-1.8.1.jar:1.8.1]
... 55 common frames omitted
Here is the send method in my code where the exception occurs:
private static final EncoderFactory ENCODER_FACTORY = EncoderFactory.get();
private static final SpecificDatumWriter<ParentObject> PARENT_OBJECT_WRITER = new SpecificDatumWriter<>(ParentObject.SCHEMA$);

public void send(ParentObject parentObject) {
    try {
        ByteArrayOutputStream stream = new ByteArrayOutputStream();
        binaryEncoder = ENCODER_FACTORY.binaryEncoder(stream, binaryEncoder);
        PARENT_OBJECT_WRITER.write(parentObject, binaryEncoder); // Exception HERE
        binaryEncoder.flush();
        producer.send(new ProducerRecord<>(topic, stream.toByteArray()));
    } catch (IOException ioe) {
        logger.debug("Problem publishing message to Kafka.", ioe);
    }
}
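In the schema, NestedObject contains an array of DeeplyNestedObject. I've done enough debugging to see that NestedObject does in fact contain an array of DeeplyNestedObject, or an empty array if there are none. Here is the relevant part of the schema: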
[ { "namespace": "com.mycode.avro"
, "type": "record"
, "name": "NestedObject"
, "fields":
[ { "name": "timestamp", "type": "long", "doc": "Instant in time (milliseconds since epoch)." }
, { "name": "objs", "type": { "type": "array", "items": "DeeplyNestedObject" }, "doc": "Elided." }
]
}
]
I don't know enough about your objects, but from what I see in your example, your Avro schema is incorrect.
DeeplyNestedObject is a record in Avro, so your schema has to look like this:
{
  "type": "record",
  "name": "NestedObject",
  "namespace": "com.mycode.avro",
  "fields": [
    {
      "name": "timestamp",
      "type": "long"
    },
    {
      "name": "objs",
      "type": {
        "type": "record",
        "name": "DeeplyNestedObject",
        "fields": []
      }
    }
  ]
}
Of course, all of DeeplyNestedObject's fields need to be declared in the "fields": [] that belongs to the DeeplyNestedObject record.
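Avro's stack trace is misleading. The problem is probably one level deeper than the class the Exception message points at.
When it says "null of array of com.mycode.DeeplyNestedObject of array of com.mycode.NestedObject of union of com.mycode.ParentObject", it means that one of the fields inside DeeplyNestedObject was expected to be an array but was found to be null. (It makes complete sense to misread this as DeeplyNestedObject being null inside NestedObject.)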
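One way to read the message is from right to left: the rightmost segment is the outermost schema, and the leftmost is the value that failed. The sketch below is not part of Avro. It splits the message from the question on " of " and prints the segments outermost-first. It assumes the message is built by appending one " of ..." segment per enclosing schema, as the trace above suggests. The class and method names (AvroNpePath, outermostFirst) are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Illustrative helper only; not an Avro API.
final class AvroNpePath {

    /** Splits the NPE message on " of " and returns the segments outermost-first. */
    static List<String> outermostFirst(String message) {
        List<String> parts = new ArrayList<>(Arrays.asList(message.split(" of ")));
        Collections.reverse(parts);
        return parts;
    }

    public static void main(String[] args) {
        String msg = "null of array of com.mycode.DeeplyNestedObject"
                + " of array of com.mycode.NestedObject"
                + " of union of com.mycode.ParentObject";
        List<String> path = outermostFirst(msg);
        StringBuilder indent = new StringBuilder();
        for (String segment : path) {
            // Each level of indentation is one level of nesting in the data.
            System.out.println(indent + segment);
            indent.append("  ");
        }
    }
}
```

Read this way, the last three segments are DeeplyNestedObject, array, null: a null sitting where an array was expected, inside a DeeplyNestedObject.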
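You need to check the fields of DeeplyNestedObject and find out which array is not being serialized correctly. The problem is most likely where the DeeplyNestedObject is created. It will have a field of type array that is not being populated in all cases before the send method is called.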
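To narrow down which array is null, a quick diagnostic can walk the object before it is handed to the writer. The sketch below is only one possible approach: it uses plain Java reflection instead of the Avro schema API so that it stays self-contained. The classes Nested and DeeplyNested, and the field name values, are hypothetical stand-ins for the generated classes, whose real fields are not shown in the question. It also assumes the object graph has no cycles.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.List;

// Diagnostic sketch; the nested classes are invented stand-ins, not the real generated classes.
final class NullListFinder {

    static final class DeeplyNested {
        List<String> values; // hypothetical array field, left null on purpose
    }

    static final class Nested {
        long timestamp;
        List<DeeplyNested> objs = new ArrayList<>();
    }

    /** Returns paths such as "objs[0].values" for every List-typed field that is null. */
    static List<String> findNullLists(Object root) {
        List<String> found = new ArrayList<>();
        try {
            walk(root, "", found);
        } catch (IllegalAccessException e) {
            throw new IllegalStateException("Could not inspect " + root.getClass(), e);
        }
        return found;
    }

    private static void walk(Object obj, String path, List<String> found)
            throws IllegalAccessException {
        for (Field f : obj.getClass().getDeclaredFields()) {
            if (Modifier.isStatic(f.getModifiers())) {
                continue; // skip constants such as a generated schema field
            }
            f.setAccessible(true);
            String here = path.isEmpty() ? f.getName() : path + "." + f.getName();
            Object value = f.get(obj);
            if (List.class.isAssignableFrom(f.getType())) {
                if (value == null) {
                    found.add(here); // a null where a list (Avro array) is expected
                    continue;
                }
                List<?> list = (List<?>) value;
                for (int i = 0; i < list.size(); i++) {
                    Object element = list.get(i);
                    if (element != null && isUserType(element)) {
                        walk(element, here + "[" + i + "]", found);
                    }
                }
            } else if (value != null && isUserType(value)) {
                walk(value, here, found);
            }
        }
    }

    /** Only descend into application classes, not JDK types, arrays, or enums. */
    private static boolean isUserType(Object o) {
        Class<?> c = o.getClass();
        return !c.isArray() && !c.isEnum() && !c.getName().startsWith("java.");
    }
}
```

Calling findNullLists(parentObject) just before PARENT_OBJECT_WRITER.write(...) and logging the result should point at the field to fix where the DeeplyNestedObject is built, for example by defaulting it to an empty list.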