ipython notebook with spark gets error with sparkcontext
I'm testing Turi on my MacBook (OS X 10.10.5) with this example:
https://turi.com/learn/gallery/notebooks/spark_and_graphlab_create.html
When it reaches this step:
# Set up the SparkContext object
# this can be 'local' or 'yarn-client' in PySpark
# Remember if using yarn-client then all the paths should be accessible
# by all nodes in the cluster.
sc = SparkContext('local')
I get the following error:
---------------------------------------------------------------------------
Exception Traceback (most recent call last)
<ipython-input-12-dc1befb4186c> in <module>()
3 # Remember if using yarn-client then all the paths should be accessible
4 # by all nodes in the cluster.
----> 5 sc = SparkContext()
/usr/local/Cellar/apache-spark/1.6.2/libexec/python/pyspark/context.pyc in __init__(self, master, appName, sparkHome, pyFiles, environment, batchSize, serializer, conf, gateway, jsc, profiler_cls)
110 """
111 self._callsite = first_spark_call() or CallSite(None, None, None)
--> 112 SparkContext._ensure_initialized(self, gateway=gateway)
113 try:
114 self._do_init(master, appName, sparkHome, pyFiles, environment, batchSize, serializer,
/usr/local/Cellar/apache-spark/1.6.2/libexec/python/pyspark/context.pyc in _ensure_initialized(cls, instance, gateway)
243 with SparkContext._lock:
244 if not SparkContext._gateway:
--> 245 SparkContext._gateway = gateway or launch_gateway()
246 SparkContext._jvm = SparkContext._gateway.jvm
247
/usr/local/Cellar/apache-spark/1.6.2/libexec/python/pyspark/java_gateway.pyc in launch_gateway()
92 callback_socket.close()
93 if gateway_port is None:
---> 94 raise Exception("Java gateway process exited before sending the driver its port number")
95
96 # In Windows, ensure the Java child processes do not linger after Python has exited.
Exception: Java gateway process exited before sending the driver its port number
A quick Google search hasn't turned up anything helpful yet.
Here is my .bash_profile:
# added by Anaconda2 4.1.1 installer
export PATH="/Users/me/anaconda/bin:$PATH"
export SCALA_HOME=/usr/local/Cellar/scala/2.11.8/libexec
export SPARK_HOME=/usr/local/Cellar/apache-spark/1.6.2/libexec
export PYTHONPATH=$SPARK_HOME/python/pyspark:$PYTHONPATH
export PYTHONPATH=$SPARK_HOME/python/lib/py4j-0.9-src.zip:$PYTHONPATH
export PYTHONPATH=$SPARK_HOME/python/:$PYTHONPATH
Does anyone know how to resolve this error?
Thanks
This can happen for two reasons:
- The environment variable SPARK_HOME may be pointing to the wrong path.
- Set export PYSPARK_SUBMIT_ARGS="--master local[2]" (this is the configuration you want PySpark to start with). A minimal sketch of both fixes is shown after this list.
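For reference, here is a minimal sketch of applying both fixes inside the notebook itself, before the SparkContext is created. The SPARK_HOME path below is an assumption taken from the Homebrew layout in the question's .bash_profile, and appending pyspark-shell to PYSPARK_SUBMIT_ARGS reflects the usual requirement when launching PySpark 1.4+ from a notebook rather than from the pyspark script; adjust both to your setup.
import os

# Point SPARK_HOME at the Spark installation. This path assumes the
# Homebrew layout shown in the question; adjust it for your machine.
os.environ["SPARK_HOME"] = "/usr/local/Cellar/apache-spark/1.6.2/libexec"

# Tell PySpark how to start. When launching from a notebook, the value
# generally must end with "pyspark-shell" on Spark 1.4+.
os.environ["PYSPARK_SUBMIT_ARGS"] = "--master local[2] pyspark-shell"

from pyspark import SparkContext

sc = SparkContext()
print(sc.version)  # should print 1.6.2 if the Java gateway launched correctly
If the exception persists after this, double-check that the SPARK_HOME directory actually contains the bin/spark-submit script, since the gateway launcher invokes it from there.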