spark - Exception in thread "main" java.sql.SQLException: No suitable driver

Solved: the line prop.setProperty("driver", "oracle.jdbc.driver.OracleDriver") must be added to the connection properties.

I want to run a Spark job locally. I created a jar with dependencies via Maven.

Here is my pom.xml:

<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>

    <groupId>com.agildata</groupId>
    <artifactId>spark-rdd-dataframe-dataset</artifactId>
    <packaging>jar</packaging>
    <version>1.0</version>

    <properties>    
        <exec-maven-plugin.version>1.4.0</exec-maven-plugin.version>
        <spark.version>1.6.0</spark.version>
    </properties>

    <dependencies>

        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-core_2.11</artifactId>
            <version>${spark.version}</version>
        </dependency>

        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-sql_2.11</artifactId>
            <version>${spark.version}</version>
        </dependency>

        <dependency>
            <groupId>com.oracle</groupId>
            <artifactId>ojdbc7</artifactId>
            <version>12.1.0.2</version>
        </dependency>

    </dependencies>

    <build>
        <plugins>

            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-compiler-plugin</artifactId>
                <version>3.2</version>
                <configuration>
                    <source>1.8</source>
                    <target>1.8</target>
                </configuration>
            </plugin>

            <plugin>
                <groupId>net.alchim31.maven</groupId>
                <artifactId>scala-maven-plugin</artifactId>
                <executions>
                    <execution>
                        <id>scala-compile-first</id>
                        <phase>process-resources</phase>
                        <goals>
                            <goal>add-source</goal>
                            <goal>compile</goal>
                        </goals>
                    </execution>
                    <execution>
                        <id>scala-test-compile</id>
                        <phase>process-test-resources</phase>
                        <goals>
                            <goal>testCompile</goal>
                        </goals>
                    </execution>
                </executions>
            </plugin>


            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-assembly-plugin</artifactId>
                <version>2.4.1</version>
                <configuration>
                    <!-- get all project dependencies -->
                    <descriptorRefs>
                        <descriptorRef>jar-with-dependencies</descriptorRef>
                    </descriptorRefs>
                    <!-- MainClass in manifest makes an executable jar -->
                    <archive>
                        <manifest>
                            <mainClass>example.dataframe.ScalaDataFrameExample</mainClass>
                        </manifest>
                    </archive>

                </configuration>
                <executions>
                    <execution>
                        <id>make-assembly</id>
                        <!-- bind to the packaging phase -->
                        <phase>package</phase>
                        <goals>
                            <goal>single</goal>
                        </goals>
                    </execution>
                </executions>
            </plugin>


        </plugins>
    </build>

</project>

I run the mvn package command and the build succeeds. After that I try to submit the job like this:

    GMAC:bin gabor_dev$ sh spark-submit --class example.dataframe.ScalaDataFrameExample --master spark://QGMAC.local:7077 /Users/gabor_dev/IdeaProjects/dataframe/target/spark-rdd-dataframe-dataset-1.0-jar-with-dependencies.jar

But it throws: Exception in thread "main" java.sql.SQLException: No suitable driver

Full error output:

16/07/08 13:09:22 INFO BlockManagerMaster: Registered BlockManager
Exception in thread "main" java.sql.SQLException: No suitable driver
    at java.sql.DriverManager.getDriver(DriverManager.java:315)
    at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anonfun.apply(JdbcUtils.scala:50)
    at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anonfun.apply(JdbcUtils.scala:50)
    at scala.Option.getOrElse(Option.scala:120)
    at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$.createConnectionFactory(JdbcUtils.scala:49)
    at org.apache.spark.sql.execution.datasources.jdbc.JDBCRDD$.resolveTable(JDBCRDD.scala:120)
    at org.apache.spark.sql.execution.datasources.jdbc.JDBCRelation.<init>(JDBCRelation.scala:91)
    at org.apache.spark.sql.DataFrameReader.jdbc(DataFrameReader.scala:222)
    at org.apache.spark.sql.DataFrameReader.jdbc(DataFrameReader.scala:146)
    at example.dataframe.ScalaDataFrameExample$.main(ScalaDataFrameExample.scala:30)
    at example.dataframe.ScalaDataFrameExample.main(ScalaDataFrameExample.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:497)
    at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:731)
    at org.apache.spark.deploy.SparkSubmit$.doRunMain(SparkSubmit.scala:181)
    at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
16/07/08 13:09:22 INFO SparkContext: Invoking stop() from shutdown hook

Interestingly, if I build and run it from the IntelliJ IDEA embedded console like this: mvn package exec:java -Dexec.mainClass=example.dataframe.ScalaDataFrameExample, it runs with no errors.

Here is the relevant Scala code:

    val sc = new SparkContext(conf)
    val sqlContext = new SQLContext(sc)

    val url = "jdbc:oracle:thin:@xxx.xxx.xx:1526:SIDNAME"

    val prop = new java.util.Properties
    prop.setProperty("user", "usertst")
    prop.setProperty("password", "usertst")

    val people = sqlContext.read.jdbc(url, "table_name", prop)
    people.show()

I checked my jar file and it contains all the dependencies. Can anyone help me solve this? Thanks!

So the missing driver is the JDBC driver, which you have to add to the Spark SQL configuration. You can do this when submitting the application, as specified by this answer, or through the Properties object, as you did:

prop.setProperty("driver", "oracle.jdbc.driver.OracleDriver") 

And here is how you would connect to PostgreSQL with Spark:

    import java.util.Properties;

    import org.apache.log4j.Level;
    import org.apache.log4j.Logger;
    import org.apache.spark.sql.Dataset;
    import org.apache.spark.sql.Row;
    import org.apache.spark.sql.SparkSession;

    SparkSession sparkSession = SparkSession.builder()
            .appName("dky")
            .master("local[*]")
            .getOrCreate();

    // Reduce Spark's log noise
    Logger.getLogger("org.apache").setLevel(Level.WARN);

    Properties properties = new Properties();
    properties.put("user", "your user name");
    properties.put("password", "your password");

    // The "driver" option names the JDBC driver class explicitly
    Dataset<Row> jdbcDF = sparkSession.read()
            .option("driver", "org.postgresql.Driver")
            .jdbc("jdbc:postgresql://localhost:5432/postgres", "your table name along with schema name", properties);
    jdbcDF.show();
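Note that this assumes the PostgreSQL JDBC driver (the org.postgresql:postgresql artifact) is on the classpath, just as the Oracle example needs ojdbc; otherwise the same "No suitable driver" error will appear.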