Beam: Failed to serialize and deserialize property 'awsCredentialsProvider'
I have been using the Beam pipeline examples as a guide in an attempt to load files from S3 for my pipeline. As in the examples, I have defined my own PipelineOptions that also extends S3Options, and I am attempting to use the DefaultAWSCredentialsProviderChain. The code to configure this is:
MyPipelineOptions options = PipelineOptionsFactory.fromArgs(args).as(MyPipelineOptions.class);
options.setAwsCredentialsProvider(new DefaultAWSCredentialsProviderChain());
options.setAwsRegion("us-east-1");
runPipeline(options);
When I run it from IntelliJ using the Direct Runner, it works fine. But when I package it as a jar and execute it (also using the Direct Runner), I see:
Exception in thread "main" java.lang.IllegalArgumentException: PipelineOptions specified failed to serialize to JSON.
    at org.apache.beam.runners.direct.DirectRunner.run(DirectRunner.java:166)
    at org.apache.beam.runners.direct.DirectRunner.run(DirectRunner.java:67)
    at org.apache.beam.sdk.Pipeline.run(Pipeline.java:313)
    at org.apache.beam.sdk.Pipeline.run(Pipeline.java:299)
    at a.b.c.beam.CleanSkeleton.runPipeline(CleanSkeleton.java:69)
    at a.b.c.beam.CleanSkeleton.main(CleanSkeleton.java:53)
Caused by: com.fasterxml.jackson.databind.JsonMappingException: Unexpected IOException (of type java.io.IOException): Failed to serialize and deserialize property 'awsCredentialsProvider' with value 'com.amazonaws.auth.DefaultAWSCredentialsProviderChain@40f33492'
    at com.fasterxml.jackson.databind.JsonMappingException.fromUnexpectedIOE(JsonMappingException.java:338)
    at com.fasterxml.jackson.databind.ObjectMapper.writeValueAsBytes(ObjectMapper.java:3247)
    at org.apache.beam.runners.direct.DirectRunner.run(DirectRunner.java:163)
    ... 5 more
I am using Gradle to build my jar with the following task:
jar {
    manifest {
        attributes (
            'Main-Class': 'a.b.c.beam.CleanSkeleton'
        )
    }
    from {
        configurations.runtimeClasspath.collect { it.isDirectory() ? it : zipTree(it) }
    }
    from('src') {
        include '/main/resources/*'
    }
    zip64 true
    exclude 'META-INF/*.RSA', 'META-INF/*.SF', 'META-INF/*.DSA'
}
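The problem occurs because, when the fat/uber jar is created, files in META-INF/services are overwritten by duplicate files with the same name. In particular, the com.fasterxml.jackson.databind.Module service file needs to list a number of Jackson modules, but they are missing from it. These include org.apache.beam.sdk.io.aws.options.AwsModule and com.fasterxml.jackson.datatype.joda.JodaModule. The code in DirectRunner instantiates the ObjectMapper like this: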
new ObjectMapper()
.registerModules(ObjectMapper.findModules(ReflectHelpers.findClassLoader()));
ObjectMapper::findModules relies on java.util.ServiceLoader, which locates services from the files under META-INF/services/.
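To check whether a given jar is affected, it can help to list what the service files visible on the classpath actually contain. The following is a minimal sketch using only the JDK; the class name ServiceFileCheck and the method listProviders are hypothetical names chosen for illustration and are not part of Beam or Jackson. It reads every copy of the service file the class loader can see and collects the provider class names listed in them.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical diagnostic helper, not part of Beam or Jackson.
public class ServiceFileCheck {
    // Service file that java.util.ServiceLoader reads for Jackson modules.
    static final String JACKSON_MODULE_SERVICE =
        "META-INF/services/com.fasterxml.jackson.databind.Module";

    // Collects the provider class names from every copy of the given
    // service file that the class loader can see.
    static List<String> listProviders(ClassLoader loader, String resource) {
        List<String> providers = new ArrayList<>();
        try {
            for (URL url : Collections.list(loader.getResources(resource))) {
                try (BufferedReader reader = new BufferedReader(
                        new InputStreamReader(url.openStream(), StandardCharsets.UTF_8))) {
                    String line;
                    while ((line = reader.readLine()) != null) {
                        // Service files allow '#' comments; drop them.
                        int hash = line.indexOf('#');
                        if (hash >= 0) {
                            line = line.substring(0, hash);
                        }
                        line = line.trim();
                        if (!line.isEmpty()) {
                            providers.add(line);
                        }
                    }
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return providers;
    }

    public static void main(String[] args) {
        List<String> providers = listProviders(
            ServiceFileCheck.class.getClassLoader(), JACKSON_MODULE_SERVICE);
        System.out.println("Jackson module providers found: " + providers.size());
        for (String provider : providers) {
            System.out.println("  " + provider);
        }
    }
}
```

If this class is added to the project and run both from the IDE and from the packaged jar, the two lists should differ when service files were overwritten: under that assumption, the jar run would be missing entries such as org.apache.beam.sdk.io.aws.options.AwsModule.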
The solution is to use the Gradle Shadow plugin to build the fat/uber jar and configure it to merge service files:
apply plugin: 'com.github.johnrengelman.shadow'
shadowJar {
    mergeServiceFiles()
    zip64 true
}