Does building Impala depend on Hive, HBase, and Sentry?

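I have a Hadoop cluster with one master and three slaves. I now want to add Apache Impala to this cluster. I have downloaded the tarball from here. I want to build Impala, but I am not sure what the prerequisites are. There are two different sources: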

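  1. This, from the documentation, which says the requirements are MySQL (or PostgreSQL), the Hive metastore, and Java dependencies (obviously).
  2. The README.md file created in the apache-impala directory after extracting the tarball. Quoting it: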

    Impala can be built with pre-built components, downloaded from S3, or can be built with an in-place toolchain located in the thirdparty directory (not recommended). The components needed to build Impala are Apache Hadoop, Hive, HBase, and Sentry.

I am confused by these two sources. What should I do? A clear set of dependencies for Apache Impala would be great!

If you read the Impala Requirements page carefully, you will see near the bottom that Hadoop support is implied, while the Sentry requirement is buried in the Impala Security link.

Under the Java Dependencies section it says:

All Java dependencies are packaged in the impala-dependencies.jar file, which is located at /usr/lib/impala/lib/. These map to everything that is built under fe/target/dependency.

Look at the corresponding pom.xml and you will see all of the dependencies. Grepping for artifactId shows the following:

$ grep artifactId fe/pom.xml 
    <artifactId>impala-parent</artifactId>
  <artifactId>impala-frontend</artifactId>
      <artifactId>json-smart</artifactId>
      <artifactId>impala-data-source-api</artifactId>
      <artifactId>hadoop-hdfs</artifactId>
      <artifactId>hadoop-common</artifactId>
          <artifactId>json-smart</artifactId>
      <artifactId>hadoop-auth</artifactId>
          <artifactId>json-smart</artifactId>
      <artifactId>hadoop-aws</artifactId>
      <artifactId>hadoop-azure-datalake</artifactId>
          <artifactId>json-smart</artifactId>
      <artifactId>sentry-core-common</artifactId>
      <artifactId>yarn-extras</artifactId>
      <artifactId>sentry-core-model-db</artifactId>
          <artifactId>json-smart</artifactId>
      <artifactId>sentry-provider-common</artifactId>
      <artifactId>sentry-provider-db</artifactId>
          <artifactId>json-smart</artifactId>
      <artifactId>sentry-provider-file</artifactId>
      <artifactId>sentry-provider-cache</artifactId>
          <artifactId>json-smart</artifactId>
      <artifactId>sentry-policy-common</artifactId>
      <artifactId>sentry-binding-hive</artifactId>
          <artifactId>json-smart</artifactId>
      <artifactId>sentry-policy-engine</artifactId>
      <artifactId>sentry-service-api</artifactId>
          <artifactId>json-smart</artifactId>
      <artifactId>parquet-hadoop-bundle</artifactId>
      <artifactId>hbase-client</artifactId>
           <artifactId>json-smart</artifactId>
      <artifactId>hbase-common</artifactId>
           <artifactId>json-smart</artifactId>
      <artifactId>hbase-protocol</artifactId>
      <artifactId>commons-lang</artifactId>
      <artifactId>java-cup</artifactId>
      <artifactId>libthrift</artifactId>
      <artifactId>hive-service</artifactId>
          <artifactId>hive-llap-server</artifactId>
          <artifactId>json-smart</artifactId>
      <artifactId>hive-serde</artifactId>
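
The grep output above is long and noisy, so a rough tally by component can make it easier to read. The following is only a sketch: the function name `count_families`, the prefix list, and the `sample` text are made up for illustration, and it counts every matching artifactId it sees, including nested entries such as hive-llap-server that may be exclusions rather than real dependencies.

```python
import re
from collections import Counter

# Component families to look for. The prefixes are an assumption based on
# the artifact names visible in the grep output, not an official list.
FAMILIES = ("hadoop", "hive", "hbase", "sentry")


def count_families(grep_output: str) -> Counter:
    """Count distinct artifactIds per component family."""
    # Collect unique artifact names so repeated lines are counted once.
    artifacts = set(re.findall(r"<artifactId>([^<]+)</artifactId>", grep_output))
    counts = Counter()
    for name in artifacts:
        for family in FAMILIES:
            if name.startswith(family + "-"):
                counts[family] += 1
    return counts


# A shortened, hand-copied excerpt of the output; not the full pom.xml.
sample = """
      <artifactId>hadoop-hdfs</artifactId>
      <artifactId>hadoop-common</artifactId>
          <artifactId>json-smart</artifactId>
      <artifactId>sentry-core-common</artifactId>
      <artifactId>hbase-client</artifactId>
      <artifactId>hive-service</artifactId>
          <artifactId>hive-llap-server</artifactId>
      <artifactId>hive-serde</artifactId>
"""

for family, n in sorted(count_families(sample).items()):
    print(family, n)
```

Any family with a non-zero count has at least one artifact referenced in the file, which is the point the grep output is making.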

So README.md is correct in stating that you need Hadoop, Hive, HBase, and Sentry to build Impala.