Sparksession 错误是关于配置单元的
Sparksession error is about hive
我的OS是windows10
from pyspark.conf import SparkConf
sc = SparkContext.getOrCreate()
spark = SparkSession.builder.enableHiveSupport().getOrCreate()
此代码给出以下错误
Py4JJavaError Traceback (most recent call
last)
~\Documents\spark\spark-2.1.0-bin-hadoop2.7\python\pyspark\sql\utils.py
in deco(*a, **kw)
62 try:
---> 63 return f(*a, **kw)
64 except py4j.protocol.Py4JJavaError as e:
~\Documents\spark\spark-2.1.0-bin-hadoop2.7\python\lib\py4j-0.10.4-src.zip\py4j\protocol.py
in get_return_value(answer, gateway_client, target_id, name)
318 "An error occurred while calling {0}{1}{2}.\n".
--> 319 format(target_id, ".", name), value)
320 else:
Py4JJavaError: An error occurred while calling o22.sessionState. :
java.lang.IllegalArgumentException: Error while instantiating
'org.apache.spark.sql.hive.HiveSessionState': at
org.apache.spark.sql.SparkSession$.org$apache$spark$sql$SparkSession$$reflect(SparkSession.scala:981)
at
org.apache.spark.sql.SparkSession.sessionState$lzycompute(SparkSession.scala:110)
at
org.apache.spark.sql.SparkSession.sessionState(SparkSession.scala:109)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606) at
py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244) at
py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357) at
py4j.Gateway.invoke(Gateway.java:280) at
py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
at py4j.commands.CallCommand.execute(CallCommand.java:79) at
py4j.GatewayConnection.run(GatewayConnection.java:214) at
java.lang.Thread.run(Thread.java:745) Caused by:
java.lang.reflect.InvocationTargetException at
sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at
sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
at
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
at
org.apache.spark.sql.SparkSession$.org$apache$spark$sql$SparkSession$$reflect(SparkSession.scala:978)
... 13 more Caused by: java.lang.IllegalArgumentException: Error
while instantiating 'org.apache.spark.sql.hive.HiveExternalCatalog':
at
org.apache.spark.sql.internal.SharedState$.org$apache$spark$sql$internal$SharedState$$reflect(SharedState.scala:169)
at
org.apache.spark.sql.internal.SharedState.(SharedState.scala:86)
at
org.apache.spark.sql.SparkSession$$anonfun$sharedState.apply(SparkSession.scala:101)
at
org.apache.spark.sql.SparkSession$$anonfun$sharedState.apply(SparkSession.scala:101)
at scala.Option.getOrElse(Option.scala:121) at
org.apache.spark.sql.SparkSession.sharedState$lzycompute(SparkSession.scala:101)
at
org.apache.spark.sql.SparkSession.sharedState(SparkSession.scala:100)
at
org.apache.spark.sql.internal.SessionState.(SessionState.scala:157)
at
org.apache.spark.sql.hive.HiveSessionState.(HiveSessionState.scala:32)
... 18 more Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native
Method) at
sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
at
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
at
org.apache.spark.sql.internal.SharedState$.org$apache$spark$sql$internal$SharedState$$reflect(SharedState.scala:166)
... 26 more Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native
Method) at
sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
at
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
at
org.apache.spark.sql.hive.client.IsolatedClientLoader.createClient(IsolatedClientLoader.scala:264)
at
org.apache.spark.sql.hive.HiveUtils$.newClientForMetadata(HiveUtils.scala:366)
at
org.apache.spark.sql.hive.HiveUtils$.newClientForMetadata(HiveUtils.scala:270)
at
org.apache.spark.sql.hive.HiveExternalCatalog.(HiveExternalCatalog.scala:65)
... 31 more Caused by: java.lang.RuntimeException:
java.lang.RuntimeException: Unable to instantiate
org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient at
org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:522)
at
org.apache.spark.sql.hive.client.HiveClientImpl.(HiveClientImpl.scala:192)
... 39 more Caused by: java.lang.RuntimeException: Unable to
instantiate
org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient
我的完整代码在这里
from pyspark.sql import SQLContext
from pyspark.sql import SparkSession
import findspark
findspark.init('C:/Users/asus/Documents/spark/spark-2.1.0-bin-hadoop2.7')
import pyspark from pyspark.conf
import SparkConf sc = SparkContext.getOrCreate()
spark = SparkSession.builder.enableHiveSupport().getOrCreate()
从您发布的代码来看,您似乎是一名 Java 开发人员,或者您可能急于粘贴代码。在 python 中,您不会像我们在 Java 中那样用它们的类型编写变量,即
SparkContext sc =SparkContext.getOrCreate().
此外,从 Spark 2.0+ 版本开始,您需要创建一个 SparkSession 对象,它是您应用程序的入口点。您从该对象本身派生 SparkContext。尝试创建另一个 SparkContext "sc = SparkContext.getOrCreate()" 会导致错误。这是因为根据设计,在给定的单个 JVM 中只能有一个 SparkContext 运行。如果需要一个新的上下文,你需要用 sc.stop() 停止之前创建的 SparkContext。
从您的 stack-trace 和代码中可以看出,我还认为您正在本地测试您的应用程序,并且您的本地计算机上没有安装 Hadoop 和 Hive,这给您带来了错误:
Caused by: java.lang.RuntimeException: java.lang.RuntimeException:
Unable to instantiate
org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient at
org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:522)
at ...
您可以在您的 Windows 机器上安装 Hadoop 和 Hive 并尝试以下代码片段。
from pyspark.sql import SparkSession
spark = SparkSession \
.builder \
.appName('CalculatingGeoDistances') \
.enableHiveSupport() \
.getOrCreate()
sc = spark.sparkContext
我的OS是windows10
from pyspark.conf import SparkConf
sc = SparkContext.getOrCreate()
spark = SparkSession.builder.enableHiveSupport().getOrCreate()
此代码给出以下错误
Py4JJavaError Traceback (most recent call last) ~\Documents\spark\spark-2.1.0-bin-hadoop2.7\python\pyspark\sql\utils.py in deco(*a, **kw) 62 try: ---> 63 return f(*a, **kw) 64 except py4j.protocol.Py4JJavaError as e:
~\Documents\spark\spark-2.1.0-bin-hadoop2.7\python\lib\py4j-0.10.4-src.zip\py4j\protocol.py in get_return_value(answer, gateway_client, target_id, name) 318 "An error occurred while calling {0}{1}{2}.\n". --> 319 format(target_id, ".", name), value) 320 else:
Py4JJavaError: An error occurred while calling o22.sessionState. : java.lang.IllegalArgumentException: Error while instantiating 'org.apache.spark.sql.hive.HiveSessionState': at org.apache.spark.sql.SparkSession$.org$apache$spark$sql$SparkSession$$reflect(SparkSession.scala:981) at org.apache.spark.sql.SparkSession.sessionState$lzycompute(SparkSession.scala:110) at org.apache.spark.sql.SparkSession.sessionState(SparkSession.scala:109) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244) at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357) at py4j.Gateway.invoke(Gateway.java:280) at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132) at py4j.commands.CallCommand.execute(CallCommand.java:79) at py4j.GatewayConnection.run(GatewayConnection.java:214) at java.lang.Thread.run(Thread.java:745) Caused by: java.lang.reflect.InvocationTargetException at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) at java.lang.reflect.Constructor.newInstance(Constructor.java:526) at org.apache.spark.sql.SparkSession$.org$apache$spark$sql$SparkSession$$reflect(SparkSession.scala:978) ... 13 more Caused by: java.lang.IllegalArgumentException: Error while instantiating 'org.apache.spark.sql.hive.HiveExternalCatalog': at org.apache.spark.sql.internal.SharedState$.org$apache$spark$sql$internal$SharedState$$reflect(SharedState.scala:169) at org.apache.spark.sql.internal.SharedState.(SharedState.scala:86) at org.apache.spark.sql.SparkSession$$anonfun$sharedState.apply(SparkSession.scala:101) at org.apache.spark.sql.SparkSession$$anonfun$sharedState.apply(SparkSession.scala:101) at scala.Option.getOrElse(Option.scala:121) at org.apache.spark.sql.SparkSession.sharedState$lzycompute(SparkSession.scala:101) at org.apache.spark.sql.SparkSession.sharedState(SparkSession.scala:100) at org.apache.spark.sql.internal.SessionState.(SessionState.scala:157) at org.apache.spark.sql.hive.HiveSessionState.(HiveSessionState.scala:32) ... 18 more Caused by: java.lang.reflect.InvocationTargetException at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) at java.lang.reflect.Constructor.newInstance(Constructor.java:526) at org.apache.spark.sql.internal.SharedState$.org$apache$spark$sql$internal$SharedState$$reflect(SharedState.scala:166) ... 26 more Caused by: java.lang.reflect.InvocationTargetException at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) at java.lang.reflect.Constructor.newInstance(Constructor.java:526) at org.apache.spark.sql.hive.client.IsolatedClientLoader.createClient(IsolatedClientLoader.scala:264) at org.apache.spark.sql.hive.HiveUtils$.newClientForMetadata(HiveUtils.scala:366) at org.apache.spark.sql.hive.HiveUtils$.newClientForMetadata(HiveUtils.scala:270) at org.apache.spark.sql.hive.HiveExternalCatalog.(HiveExternalCatalog.scala:65) ... 31 more Caused by: java.lang.RuntimeException: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:522) at org.apache.spark.sql.hive.client.HiveClientImpl.(HiveClientImpl.scala:192) ... 39 more Caused by: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient
我的完整代码在这里
from pyspark.sql import SQLContext
from pyspark.sql import SparkSession
import findspark
findspark.init('C:/Users/asus/Documents/spark/spark-2.1.0-bin-hadoop2.7')
import pyspark from pyspark.conf
import SparkConf sc = SparkContext.getOrCreate()
spark = SparkSession.builder.enableHiveSupport().getOrCreate()
从您发布的代码来看,您似乎是一名 Java 开发人员,或者您可能急于粘贴代码。在 python 中,您不会像我们在 Java 中那样用它们的类型编写变量,即 SparkContext sc =SparkContext.getOrCreate().
此外,从 Spark 2.0+ 版本开始,您需要创建一个 SparkSession 对象,它是您应用程序的入口点。您从该对象本身派生 SparkContext。尝试创建另一个 SparkContext "sc = SparkContext.getOrCreate()" 会导致错误。这是因为根据设计,在给定的单个 JVM 中只能有一个 SparkContext 运行。如果需要一个新的上下文,你需要用 sc.stop() 停止之前创建的 SparkContext。
从您的 stack-trace 和代码中可以看出,我还认为您正在本地测试您的应用程序,并且您的本地计算机上没有安装 Hadoop 和 Hive,这给您带来了错误:
Caused by: java.lang.RuntimeException: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:522) at ...
您可以在您的 Windows 机器上安装 Hadoop 和 Hive 并尝试以下代码片段。
from pyspark.sql import SparkSession
spark = SparkSession \
.builder \
.appName('CalculatingGeoDistances') \
.enableHiveSupport() \
.getOrCreate()
sc = spark.sparkContext