Using Apache Directory API in Spark application
I am trying to use org.apache.directory.api to create a connection to an LDAP service and query it as part of a Spark application. The Scala code that connects to and queries LDAP works as expected when used as part of a plain Java application, but when executed as part of a Spark application it produces an error message like this:
Exception in thread "main" java.lang.NoSuchMethodError: org.apache.directory.api.util.Strings.toLowerCaseAscii(Ljava/lang/String;)Ljava/lang/String;
at org.apache.directory.api.ldap.codec.api.DefaultConfigurableBinaryAttributeDetector.addBinaryAttribute(DefaultConfigurableBinaryAttributeDetector.java:166)
at org.apache.directory.api.ldap.codec.api.DefaultConfigurableBinaryAttributeDetector.setBinaryAttributes(DefaultConfigurableBinaryAttributeDetector.java:206)
at org.apache.directory.api.ldap.codec.api.DefaultConfigurableBinaryAttributeDetector.<init>(DefaultConfigurableBinaryAttributeDetector.java:133)
at org.apache.directory.ldap.client.api.LdapNetworkConnection.buildConfig(LdapNetworkConnection.java:599)
at org.apache.directory.ldap.client.api.LdapNetworkConnection.<init>(LdapNetworkConnection.java:410)
The exception is thrown the first time I try to create the network connection:
val ldapConnection = new LdapNetworkConnection(endpoint, port, true)
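For context, this is roughly how the connection is meant to be used (a minimal sketch; the host, credentials and base DN below are placeholders, not my real values):

import org.apache.directory.api.ldap.model.message.SearchScope
import org.apache.directory.ldap.client.api.LdapNetworkConnection

val endpoint = "ldap.example.com" // placeholder host
val port = 636
val ldapConnection = new LdapNetworkConnection(endpoint, port, true) // true = use LDAPS

try {
  // Simple bind, then a subtree search for a couple of attributes.
  ldapConnection.bind("cn=reader,dc=example,dc=com", "secret")
  val cursor = ldapConnection.search(
    "dc=example,dc=com", "(objectClass=person)", SearchScope.SUBTREE, "cn", "mail")
  while (cursor.next()) {
    println(cursor.get())
  }
  cursor.close()
} finally {
  ldapConnection.close()
}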
In the dependency tree I can see that api-util is also part of Spark's dependencies, but there it is marked as omitted because it conflicts with my version. However, since that jar is provided, I am not sure whether it is the one loaded first, with my own dependency being ignored:
[INFO] +- org.apache.spark:spark-core_2.11:jar:2.3.0.cloudera2:provided
[INFO] | +- org.apache.hadoop:hadoop-client:jar:2.6.0-cdh5.13.3:provided
[INFO] | | +- org.apache.hadoop:hadoop-common:jar:2.6.0-cdh5.13.3:provided
[INFO] | | | +- org.apache.hadoop:hadoop-auth:jar:2.6.0-cdh5.13.3:provided
[INFO] | | | | +- org.apache.directory.server:apacheds-kerberos-codec:jar:2.0.0-M15:provided
[INFO] | | | | | +- (org.apache.directory.api:api-util:jar:1.0.0-M20:provided - omitted for conflict with 1.0.3)
I don't understand why this call ends up in a method that doesn't exist, or what else could be wrong. Any suggestions on how to solve or debug this?
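One way I can think of to narrow it down is to ask the JVM at runtime which jar the conflicting class was actually resolved from (a minimal sketch, run on the driver; if it prints a Hadoop/Spark jar rather than my own api-util, the provided classpath is winning):

import org.apache.directory.api.util.Strings

// getCodeSource can be null for bootstrap classes, hence the Option.
val source = Option(classOf[Strings].getProtectionDomain.getCodeSource)
println(s"Strings loaded from: ${source.map(_.getLocation)}")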
I found the solution. The problem was caused by Spark depending on different versions of the LDAP packages; I solved it by shading the required Apache packages as follows:
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <version>3.2.1</version>
  <configuration>
    <shadedArtifactAttached>true</shadedArtifactAttached>
    <shadedClassifierName>${jarNameWithDependencies}</shadedClassifierName>
    <artifactSet>
      <includes>
        <include>*:*</include>
      </includes>
    </artifactSet>
  </configuration>
  <executions>
    <execution>
      <phase>package</phase>
      <goals>
        <goal>shade</goal>
      </goals>
      <configuration>
        <relocations>
          <relocation>
            <pattern>org.apache.directory</pattern>
            <shadedPattern>org.apache.shaded.directory</shadedPattern>
          </relocation>
          <relocation>
            <pattern>org.apache.mina</pattern>
            <shadedPattern>org.apache.shaded.mina</shadedPattern>
          </relocation>
        </relocations>
      </configuration>
    </execution>
  </executions>
</plugin>
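The relocations make the shade plugin copy the org.apache.directory and org.apache.mina classes into the shaded jar under new package names and rewrite every bytecode reference to them, so the application code keeps its imports unchanged while no longer resolving against the api-util version that Hadoop/Spark provide on the cluster classpath. org.apache.mina is relocated as well because the Apache Directory LDAP client uses MINA for its network layer.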