Spark on Kubernetes from JupyterHub not working with IRSA
I am trying to run Spark from JupyterHub against an EKS cluster using IRSA. I followed the examples in
AWS EKS Spark 3.0, Hadoop 3.2 Error - NoClassDefFoundError: com/amazonaws/services/s3/model/MultiObjectDeleteException
and https://medium.com/swlh/how-to-perform-a-spark-submit-to-amazon-eks-cluster-with-irsa-50af9b26cae for the code and the IRSA role setup. However, I cannot get past an "Unable to find a region via the region provider chain" error. I have tried different Spark and AWS SDK versions and hard-coding an AWS_DEFAULT_REGION value, but that did not solve the problem. Any suggestions on how to resolve this are appreciated.
SPARK_HADOOP_VERSION="3.2"
HADOOP_VERSION="3.2.0"
SPARK_VERSION="3.0.1"
AWS_VERSION="1.11.874"
TINI_VERSION="0.18.0"
I am adding the following JARs to my Spark jars folder:
"https://repo1.maven.org/maven2/org/apache/hadoop/hadoop-aws/${HADOOP_VERSION}/hadoop-aws-${HADOOP_VERSION}.jar"
"https://repo1.maven.org/maven2/com/amazonaws/aws-java-sdk-bundle/${AWS_VERSION}/aws-java-sdk-bundle-${AWS_VERSION}.jar"
"https://repo1.maven.org/maven2/com/amazonaws/aws-java-sdk/${AWS_VERSION}/aws-java-sdk-${AWS_VERSION}.jar"
"https://s3.amazonaws.com/redshift-downloads/drivers/jdbc/1.2.51.1078/RedshiftJDBC42-no-awssdk-1.2.51.1078.jar"
Example Spark session builder code:
from pyspark.sql import SparkSession

SPARK_DRIVER_PACKAGES = ['org.apache.spark:spark-core_2.12:3.0.1',
'org.apache.spark:spark-avro_2.12:3.0.1',
'org.apache.spark:spark-sql_2.12:3.0.1',
'io.github.spark-redshift-community:spark-redshift_2.12:4.2.0',
'org.postgresql:postgresql:42.2.14',
'mysql:mysql-connector-java:8.0.22',
'org.apache.hadoop:hadoop-aws:3.2.0',
'com.amazonaws:aws-java-sdk-bundle:1.11.874']
spark_session = SparkSession.builder.master(master_host)\
.appName("pyspark_session_app_1")\
.config('spark.driver.host', local_ip)\
.config('spark.kubernetes.authenticate.driver.serviceAccountName', 'spark')\
.config('spark.kubernetes.authenticate.executor.serviceAccountName', 'spark')\
.config("spark.kubernetes.executor.annotation.eks.amazonaws.com/role-arn","arn:aws:iam::xxxxxxxxx:role/spark-irsa") \
.config('spark.kubernetes.driver.limit.cores', 0.2)\
.config('spark.hadoop.fs.s3a.aws.credentials.provider','com.amazonaws.auth.WebIdentityTokenCredentialsProvider')\
.config("spark.kubernetes.authenticate.submission.caCertFile", "/var/run/secrets/kubernetes.io/serviceaccount/ca.crt") \
.config("spark.kubernetes.authenticate.submission.oauthTokenFile", "/var/run/secrets/kubernetes.io/serviceaccount/token")\
.config('spark.kubernetes.executor.request.cores', executor_cores)\
.config('spark.executor.instances', executor_instances)\
.config('spark.executor.memory', executor_memory)\
.config('spark.driver.memory', driver_memory)\
.config('spark.kubernetes.executor.limit.cores', 1)\
.config('spark.scheduler.mode', 'FAIR')\
.config('spark.submit.deployMode', 'client')\
.config('spark.kubernetes.container.image', SPARK_IMAGE)\
.config('spark.kubernetes.container.image.pullPolicy', 'Always')\
.config('spark.kubernetes.namespace', 'prod-data-science')\
.config('spark.sql.execution.arrow.pyspark.enabled', 'true')\
.config('spark.sql.execution.arrow.pyspark.fallback.enabled', 'true')\
.config('spark.executorEnv.ARROW_PRE_0_15_IPC_FORMAT', '1')\
.config('spark.jars.packages', ','.join(SPARK_DRIVER_PACKAGES))\
.config("spark.hadoop.fs.s3a.multiobjectdelete.enable", "false") \
.config("spark.hadoop.fs.s3a.fast.upload","true") \
.config("spark.hadoop.fs.s3a.impl", "org.apache.hadoop.fs.s3a.S3AFileSystem") \
.config('spark.eventLog.enabled','true')\
.config('spark.eventLog.dir','s3a://spark-logs-xxxx/')\
.getOrCreate()
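For context on the failure below: the WebIdentityTokenCredentialsProvider configured above relies on the IRSA environment variables that the EKS pod identity webhook injects into the pod (AWS_ROLE_ARN and AWS_WEB_IDENTITY_TOKEN_FILE), and its STS client also needs a region. A minimal sanity-check sketch from the notebook pod (the variable names are the standard IRSA/AWS SDK ones, not taken from the original setup):

import os

# Sketch: print what the pod identity webhook (and any region settings) actually
# exposed to this process; a missing region here matches the error below.
for var in ("AWS_ROLE_ARN", "AWS_WEB_IDENTITY_TOKEN_FILE",
            "AWS_REGION", "AWS_DEFAULT_REGION"):
    print(f"{var}={os.environ.get(var)}")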
Error message received:
Py4JJavaError: An error occurred while calling None.org.apache.spark.api.java.JavaSparkContext.
: java.nio.file.AccessDeniedException: spark-logs-xxxx: org.apache.hadoop.fs.s3a.auth.NoAuthWithAWSException: No AWS Credentials provided by WebIdentityTokenCredentialsProvider : com.amazonaws.SdkClientException: Unable to find a region via the region provider chain. Must provide an explicit region in the builder or setup environment to supply a region.
at org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:187)
at org.apache.hadoop.fs.s3a.Invoker.once(Invoker.java:111)
at org.apache.hadoop.fs.s3a.Invoker.lambda$retry(Invoker.java:265)
at org.apache.hadoop.fs.s3a.Invoker.retryUntranslated(Invoker.java:322)
at org.apache.hadoop.fs.s3a.Invoker.retry(Invoker.java:261)
at org.apache.hadoop.fs.s3a.Invoker.retry(Invoker.java:236)
at org.apache.hadoop.fs.s3a.S3AFileSystem.verifyBucketExists(S3AFileSystem.java:375)
at org.apache.hadoop.fs.s3a.S3AFileSystem.initialize(S3AFileSystem.java:311)
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:3303)
at org.apache.hadoop.fs.FileSystem.access0(FileSystem.java:124)
at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:3352)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:3320)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:479)
at org.apache.spark.util.Utils$.getHadoopFileSystem(Utils.scala:1853)
at org.apache.spark.deploy.history.EventLogFileWriter.<init>(EventLogFileWriters.scala:60)
at org.apache.spark.deploy.history.SingleEventLogFileWriter.<init>(EventLogFileWriters.scala:213)
at org.apache.spark.deploy.history.EventLogFileWriter$.apply(EventLogFileWriters.scala:181)
at org.apache.spark.scheduler.EventLoggingListener.<init>(EventLoggingListener.scala:64)
at org.apache.spark.SparkContext.<init>(SparkContext.scala:576)
at org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:58)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:247)
at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
at py4j.Gateway.invoke(Gateway.java:238)
at py4j.commands.ConstructorCommand.invokeConstructor(ConstructorCommand.java:80)
at py4j.commands.ConstructorCommand.execute(ConstructorCommand.java:69)
at py4j.GatewayConnection.run(GatewayConnection.java:238)
at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.hadoop.fs.s3a.auth.NoAuthWithAWSException: No AWS Credentials provided by WebIdentityTokenCredentialsProvider : com.amazonaws.SdkClientException: Unable to find a region via the region provider chain. Must provide an explicit region in the builder or setup environment to supply a region.
at org.apache.hadoop.fs.s3a.AWSCredentialProviderList.getCredentials(AWSCredentialProviderList.java:159)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.getCredentialsFromContext(AmazonHttpClient.java:1257)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.runBeforeRequestHandlers(AmazonHttpClient.java:833)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:783)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:770)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:744)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.access0(AmazonHttpClient.java:704)
at com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:686)
at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:550)
at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:530)
at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:5212)
at com.amazonaws.services.s3.AmazonS3Client.getBucketRegionViaHeadRequest(AmazonS3Client.java:6013)
at com.amazonaws.services.s3.AmazonS3Client.fetchRegionFromCache(AmazonS3Client.java:5986)
at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:5196)
at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:5158)
at com.amazonaws.services.s3.AmazonS3Client.headBucket(AmazonS3Client.java:1421)
at com.amazonaws.services.s3.AmazonS3Client.doesBucketExist(AmazonS3Client.java:1357)
at org.apache.hadoop.fs.s3a.S3AFileSystem.lambda$verifyBucketExists(S3AFileSystem.java:376)
at org.apache.hadoop.fs.s3a.Invoker.once(Invoker.java:109)
... 29 more
Caused by: com.amazonaws.SdkClientException: Unable to find a region via the region provider chain. Must provide an explicit region in the builder or setup environment to supply a region.
at com.amazonaws.client.builder.AwsClientBuilder.setRegion(AwsClientBuilder.java:462)
at com.amazonaws.client.builder.AwsClientBuilder.configureMutableProperties(AwsClientBuilder.java:424)
at com.amazonaws.client.builder.AwsSyncClientBuilder.build(AwsSyncClientBuilder.java:46)
at com.amazonaws.auth.STSAssumeRoleWithWebIdentitySessionCredentialsProvider.buildStsClient(STSAssumeRoleWithWebIdentitySessionCredentialsProvider.java:125)
at com.amazonaws.auth.STSAssumeRoleWithWebIdentitySessionCredentialsProvider.<init>(STSAssumeRoleWithWebIdentitySessionCredentialsProvider.java:97)
at com.amazonaws.auth.STSAssumeRoleWithWebIdentitySessionCredentialsProvider.<init>(STSAssumeRoleWithWebIdentitySessionCredentialsProvider.java:40)
at com.amazonaws.auth.STSAssumeRoleWithWebIdentitySessionCredentialsProvider$Builder.build(STSAssumeRoleWithWebIdentitySessionCredentialsProvider.java:226)
at com.amazonaws.services.securitytoken.internal.STSProfileCredentialsService.getAssumeRoleCredentialsProvider(STSProfileCredentialsService.java:40)
at com.amazonaws.auth.profile.internal.securitytoken.STSProfileCredentialsServiceProvider.getProfileCredentialsProvider(STSProfileCredentialsServiceProvider.java:39)
at com.amazonaws.auth.profile.internal.securitytoken.STSProfileCredentialsServiceProvider.getCredentials(STSProfileCredentialsServiceProvider.java:71)
at com.amazonaws.auth.WebIdentityTokenCredentialsProvider.getCredentials(WebIdentityTokenCredentialsProvider.java:76)
at org.apache.hadoop.fs.s3a.AWSCredentialProviderList.getCredentials(AWSCredentialProviderList.java:137)
... 47 more
Determined the issue was the AWS region associated with the Spark pods. Including the region environment variables in the base Spark Docker image resolved the issue.
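For anyone who would rather not rebuild the base image, here is a minimal sketch of an alternative way to supply the region (assuming us-east-1; use your cluster's region). In client mode the driver JVM is launched from the notebook's Python process, so exporting the region before getOrCreate() covers the driver, and spark.executorEnv.* covers the executor pods. The other builder settings and variables (master_host, etc.) are the ones shown in the question.

import os
from pyspark.sql import SparkSession

# Set the region before the first getOrCreate() so the driver JVM inherits it;
# this has no effect if the JVM gateway was already started in this session.
# "us-east-1" is an assumption, not from the original post.
os.environ["AWS_REGION"] = "us-east-1"
os.environ["AWS_DEFAULT_REGION"] = "us-east-1"

spark_session = SparkSession.builder.master(master_host)\
    .config('spark.executorEnv.AWS_REGION', 'us-east-1')\
    .config('spark.executorEnv.AWS_DEFAULT_REGION', 'us-east-1')\
    .getOrCreate()  # plus the rest of the configs shown in the question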