Why is my RabbitMQ message impossible to serialize using Apache Beam?

I'm trying to read a RabbitMQ queue with Apache Beam. I've written some transformation code to write the messages to Kafka, and I even tested the whole scenario with simple text messages.

Now I'm trying to deploy it and run it against the real messages this transformer is meant to process. These are JSON messages of fairly moderate size.
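
Roughly, the pipeline has the following shape (a simplified sketch: the URI, queue, topic and class names are placeholders, and the real conversion does more than copy the message body):

    import java.nio.charset.StandardCharsets;

    import org.apache.beam.sdk.Pipeline;
    import org.apache.beam.sdk.io.kafka.KafkaIO;
    import org.apache.beam.sdk.io.rabbitmq.RabbitMqIO;
    import org.apache.beam.sdk.io.rabbitmq.RabbitMqMessage;
    import org.apache.beam.sdk.options.PipelineOptionsFactory;
    import org.apache.beam.sdk.transforms.MapElements;
    import org.apache.beam.sdk.transforms.SimpleFunction;
    import org.apache.beam.sdk.values.KV;
    import org.apache.kafka.common.serialization.StringSerializer;

    public class RabbitToKafka {
      public static void main(String[] args) {
        Pipeline pipeline = Pipeline.create(PipelineOptionsFactory.fromArgs(args).create());

        pipeline
            // Read raw messages from the RabbitMQ queue (placeholder URI and queue name).
            .apply("ReadRabbitMq", RabbitMqIO.read()
                .withUri("amqp://guest:guest@rabbitmq-host:5672")
                .withQueue("incoming-json"))
            // Convert each RabbitMqMessage body into the record written to Kafka.
            .apply("ToKafkaRecord", MapElements.via(
                new SimpleFunction<RabbitMqMessage, KV<String, String>>() {
                  @Override
                  public KV<String, String> apply(RabbitMqMessage message) {
                    return KV.of("rabbitmq", new String(message.getBody(), StandardCharsets.UTF_8));
                  }
                }))
            // Write the converted records to Kafka (placeholder broker and topic).
            .apply("WriteKafka", KafkaIO.<String, String>write()
                .withBootstrapServers("kafka-host:9092")
                .withTopic("outgoing-json")
                .withKeySerializer(StringSerializer.class)
                .withValueSerializer(StringSerializer.class));

        pipeline.run();
      }
    }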

Strangely enough, when I try to read the "production" messages, I get this exception stack trace:

java.lang.IllegalArgumentException: Unable to encode element 'ValueWithRecordId{id=[], value=org.apache.beam.sdk.io.rabbitmq.RabbitMqMessage@f179a7f}' with coder 'ValueWithRecordId$ValueWithRecordIdCoder(org.apache.beam.sdk.coders.SerializableCoder@76190fb2)'.
        org.apache.beam.sdk.coders.Coder.getEncodedElementByteSize(Coder.java:300)
        org.apache.beam.sdk.coders.Coder.registerByteSizeObserver(Coder.java:291)
        org.apache.beam.sdk.util.WindowedValue$FullWindowedValueCoder.registerByteSizeObserver(WindowedValue.java:564)
        org.apache.beam.sdk.util.WindowedValue$FullWindowedValueCoder.registerByteSizeObserver(WindowedValue.java:480)
        org.apache.beam.runners.dataflow.worker.IntrinsicMapTaskExecutorFactory$ElementByteSizeObservableCoder.registerByteSizeObserver(IntrinsicMapTaskExecutorFactory.java:400)
        org.apache.beam.runners.dataflow.worker.util.common.worker.OutputObjectAndByteCounter.update(OutputObjectAndByteCounter.java:125)
        org.apache.beam.runners.dataflow.worker.DataflowOutputCounter.update(DataflowOutputCounter.java:64)
        org.apache.beam.runners.dataflow.worker.util.common.worker.OutputReceiver.process(OutputReceiver.java:43)
        org.apache.beam.runners.dataflow.worker.util.common.worker.ReadOperation.runReadLoop(ReadOperation.java:201)
        org.apache.beam.runners.dataflow.worker.util.common.worker.ReadOperation.start(ReadOperation.java:159)
        org.apache.beam.runners.dataflow.worker.util.common.worker.MapTaskExecutor.execute(MapTaskExecutor.java:77)
        org.apache.beam.runners.dataflow.worker.StreamingDataflowWorker.process(StreamingDataflowWorker.java:1283)
        org.apache.beam.runners.dataflow.worker.StreamingDataflowWorker.access00(StreamingDataflowWorker.java:147)
        org.apache.beam.runners.dataflow.worker.StreamingDataflowWorker.run(StreamingDataflowWorker.java:1020)
        java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        java.lang.Thread.run(Thread.java:745)
Caused by: java.io.NotSerializableException: com.rabbitmq.client.impl.LongStringHelper$ByteArrayLongString
        java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1184)
        java.io.ObjectOutputStream.writeObject(ObjectOutputStream.java:348)
        java.util.HashMap.internalWriteEntries(HashMap.java:1785)
        java.util.HashMap.writeObject(HashMap.java:1362)
        sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        java.lang.reflect.Method.invoke(Method.java:498)
        java.io.ObjectStreamClass.invokeWriteObject(ObjectStreamClass.java:1028)
        java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1496)
        java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1432)
        java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1178)
        java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1548)
        java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1509)
        java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1432)
        java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1178)
        java.io.ObjectOutputStream.writeObject(ObjectOutputStream.java:348)
        org.apache.beam.sdk.coders.SerializableCoder.encode(SerializableCoder.java:183)
        org.apache.beam.sdk.coders.SerializableCoder.encode(SerializableCoder.java:53)
        org.apache.beam.sdk.values.ValueWithRecordId$ValueWithRecordIdCoder.encode(ValueWithRecordId.java:105)
        org.apache.beam.sdk.values.ValueWithRecordId$ValueWithRecordIdCoder.encode(ValueWithRecordId.java:99)
        org.apache.beam.sdk.values.ValueWithRecordId$ValueWithRecordIdCoder.encode(ValueWithRecordId.java:81)
        org.apache.beam.sdk.coders.Coder.getEncodedElementByteSize(Coder.java:297)
        org.apache.beam.sdk.coders.Coder.registerByteSizeObserver(Coder.java:291)
        org.apache.beam.sdk.util.WindowedValue$FullWindowedValueCoder.registerByteSizeObserver(WindowedValue.java:564)
        org.apache.beam.sdk.util.WindowedValue$FullWindowedValueCoder.registerByteSizeObserver(WindowedValue.java:480)
        org.apache.beam.runners.dataflow.worker.IntrinsicMapTaskExecutorFactory$ElementByteSizeObservableCoder.registerByteSizeObserver(IntrinsicMapTaskExecutorFactory.java:400)
        org.apache.beam.runners.dataflow.worker.util.common.worker.OutputObjectAndByteCounter.update(OutputObjectAndByteCounter.java:125)
        org.apache.beam.runners.dataflow.worker.DataflowOutputCounter.update(DataflowOutputCounter.java:64)
        org.apache.beam.runners.dataflow.worker.util.common.worker.OutputReceiver.process(OutputReceiver.java:43)
        org.apache.beam.runners.dataflow.worker.util.common.worker.ReadOperation.runReadLoop(ReadOperation.java:201)
        org.apache.beam.runners.dataflow.worker.util.common.worker.ReadOperation.start(ReadOperation.java:159)
        org.apache.beam.runners.dataflow.worker.util.common.worker.MapTaskExecutor.execute(MapTaskExecutor.java:77)
        org.apache.beam.runners.dataflow.worker.StreamingDataflowWorker.process(StreamingDataflowWorker.java:1283)
        org.apache.beam.runners.dataflow.worker.StreamingDataflowWorker.access00(StreamingDataflowWorker.java:147)
        org.apache.beam.runners.dataflow.worker.StreamingDataflowWorker.run(StreamingDataflowWorker.java:1020)
        java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        java.lang.Thread.run(Thread.java:745)

My understanding is that the RabbitMQ reader considers the messages big enough to require a LongString, which is not serializable.

Am I right about that? And if so, how can I tell RabbitMQ to use a plain String (which would be quite sufficient for these messages)?
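
To check where the LongString actually comes from, one thing that can be done is to fetch a single delivery with the plain RabbitMQ Java client and dump the concrete class of every header value (connection settings and queue name below are placeholders):

    import com.rabbitmq.client.Channel;
    import com.rabbitmq.client.Connection;
    import com.rabbitmq.client.ConnectionFactory;
    import com.rabbitmq.client.GetResponse;

    public class InspectHeaders {
      public static void main(String[] args) throws Exception {
        ConnectionFactory factory = new ConnectionFactory();
        factory.setUri("amqp://guest:guest@rabbitmq-host:5672"); // placeholder broker
        try (Connection connection = factory.newConnection()) {
          Channel channel = connection.createChannel();
          // Fetch one message without acknowledging it, so it stays in the queue.
          GetResponse response = channel.basicGet("incoming-json", false);
          if (response != null && response.getProps().getHeaders() != null) {
            // A com.rabbitmq.client.impl.LongStringHelper$ByteArrayLongString here is
            // exactly the kind of value SerializableCoder refuses to encode.
            response.getProps().getHeaders().forEach((name, value) ->
                System.out.println(name + " -> " + (value == null ? "null" : value.getClass().getName())));
          }
        }
      }
    }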

This is an Apache Beam bug (https://issues.apache.org/jira/browse/BEAM-7414) for which the fix has, out of pure laziness (this is bad), not yet been merged into the Apache Beam repo. If someone wants the fix immediately, it is possible to build my branch: https://github.com/Riduidel/beam/tree/fix/rabbitmq-message-not-serializable
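
The root cause visible in the stack trace is not the message size but the AMQP header values: they come back as com.rabbitmq.client.impl.LongStringHelper$ByteArrayLongString instances, which do not implement java.io.Serializable, so SerializableCoder fails as soon as a message carries such headers. The idea of the fix is roughly the following (a sketch of the approach, not the exact patch; class and method names are just illustrative): convert LongString header values to plain Strings before they end up in the RabbitMqMessage.

    import java.util.HashMap;
    import java.util.Map;

    import com.rabbitmq.client.LongString;

    public final class HeaderSanitizer {
      private HeaderSanitizer() {}

      // Returns a copy of the AMQP headers in which every LongString value is replaced
      // by its String representation, so the map only contains serializable values.
      public static Map<String, Object> serializableHeaders(Map<String, Object> headers) {
        Map<String, Object> copy = new HashMap<>();
        if (headers != null) {
          for (Map.Entry<String, Object> entry : headers.entrySet()) {
            Object value = entry.getValue();
            // LongStringHelper$ByteArrayLongString implements LongString but not Serializable.
            copy.put(entry.getKey(), value instanceof LongString ? value.toString() : value);
          }
        }
        return copy;
      }
    }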