YARN executor launches the wrong version of Spark
I have a cluster with Hadoop 2.6.3 and Spark 1.6 installed.
Recently I upgraded Spark to 2.0, and everything worked fine until I tried to run some old Spark 1.6 jobs, which have compatibility issues with Spark 2.0.
The first thing I tried was:
echo $SPARK_HOME
/usr/local/spark-1.6.1-bin-hadoop2.6
/usr/local/spark-1.6.1-bin-hadoop2.6/bin/spark-submit --master yarn --deploy-mode client /usr/local/spark-1.6.1-bin-hadoop2.6/examples/src/main/python/pi.py 100
However, the job failed, and when I checked the YARN logs I found the following:
YARN executor launch context:
env:
CLASSPATH -> {{PWD}}<CPS>{{PWD}}/__spark__.jar<CPS>$HADOOP_CONF_DIR<CPS>$HADOOP_COMMON_HOME/share/hadoop/common/*<CPS>$HADOOP_COMMON_HOME/share/hadoop/common/lib/*<CPS>$HADOOP_HDFS_HOME/share/hadoop/hdfs/*<CPS>$HADOOP_HDFS_HOME/share/hadoop/hdfs/lib/*<CPS>$HADOOP_YARN_HOME/share/hadoop/yarn/*<CPS>$HADOOP_YARN_HOME/share/hadoop/yarn/lib/*<CPS>$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/*<CPS>$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/lib/*
SPARK_LOG_URL_STDERR -> http://datanode01-bi-dev:8042/node/containerlogs/container_1476081972773_0194_01_000003/hadoop/stderr?start=-4096
SPARK_YARN_STAGING_DIR -> .sparkStaging/application_1476081972773_0194
SPARK_YARN_CACHE_FILES_FILE_SIZES -> 187698038,357051,44846
SPARK_USER -> hadoop
SPARK_YARN_CACHE_FILES_VISIBILITIES -> PRIVATE,PRIVATE,PRIVATE
SPARK_YARN_MODE -> true
SPARK_YARN_CACHE_FILES_TIME_STAMPS -> 1477040367079,1477040367425,1477040367454
SPARK_HOME -> /usr/local/spark-2.0.0-bin-hadoop2.6
PYTHONPATH -> /usr/local/spark-1.6.1-bin-hadoop2.6/python/lib/py4j-0.9-src.zip:<CPS>{{PWD}}/pyspark.zip<CPS>{{PWD}}/py4j-0.9-src.zip
SPARK_LOG_URL_STDOUT -> http://datanode01-bi-dev:8042/node/containerlogs/container_1476081972773_0194_01_000003/hadoop/stdout?start=-4096
SPARK_YARN_CACHE_FILES -> hdfs://10.104.90.40:8020/user/hadoop/.sparkStaging/application_1476081972773_0194/spark-assembly-1.6.1-hadoop2.6.0.jar#__spark__.jar,hdfs://10.104.90.40:8020/user/hadoop/.sparkStaging/application_1476081972773_0194/pyspark.zip#pyspark.zip,hdfs://10.104.90.40:8020/user/hadoop/.sparkStaging/application_1476081972773_0194/py4j-0.9-src.zip#py4j-0.9-src.zip
command:
{{JAVA_HOME}}/bin/java -server -XX:OnOutOfMemoryError='kill %p' -Xms1024m -Xmx1024m -Djava.io.tmpdir={{PWD}}/tmp '-Dspark.driver.port=26087' '-Dspark.ui.port=0' -Dspark.yarn.app.container.log.dir=<LOG_DIR> -XX:MaxPermSize=256m org.apache.spark.executor.CoarseGrainedExecutorBackend --driver-url spark://CoarseGrainedScheduler@10.104.90.41:26087 --executor-id 2 --hostname datanode01-bi-dev --cores 1 --app-id application_1476081972773_0194 --user-class-path file:$PWD/__app__.jar 1> <LOG_DIR>/stdout 2> <LOG_DIR>/stderr
.......
.......
Traceback (most recent call last):
File "pi.py", line 39, in <module>
count = sc.parallelize(range(1, n + 1), partitions).map(f).reduce(add)
File "/usr/local/spark-2.0.0-bin-hadoop2.6/python/lib/pyspark.zip/pyspark/rdd.py", line 802, in reduce
File "/usr/local/spark-2.0.0-bin-hadoop2.6/python/lib/pyspark.zip/pyspark/rdd.py", line 776, in collect
File "/usr/local/spark-2.0.0-bin-hadoop2.6/python/lib/pyspark.zip/pyspark/rdd.py", line 2403, in _jrdd
File "/usr/local/spark-2.0.0-bin-hadoop2.6/python/lib/pyspark.zip/pyspark/rdd.py", line 2338, in _wrap_function
TypeError: 'JavaPackage' object is not callable
Clearly, YARN launched the executors with Spark 2.0, which caused the job to fail.
I have searched every place I can think of for Spark environment settings and cannot find any reference to Spark 2.0.
In ~/.bashrc I have:
export PYTHONPATH=$SPARK_HOME/python/lib/py4j-0.9-src.zip:$PYTHONPATH
export SPARK_HOME=/usr/local/spark-2.0.0-bin-hadoop2.6
The following commands return empty results:
grep -rnw /usr/local/spark-1.6.1-bin-hadoop2.6 -e spark-2.0.0-bin-hadoop2.6
grep -rnw /usr/local/hadoop-2.6.3 -e spark-2.0.0-bin-hadoop2.6
I tried the scenario above on both the namenode and a datanode, with the same result.
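A few additional checks that might help narrow down where the 2.0 path leaks in; this is only a sketch, and the config file locations below are assumptions based on the install layout above:
# show every spark-submit on the PATH, in resolution order
which -a spark-submit
# inspect the current shell environment for Spark-related variables
env | grep -i spark
# look for SPARK_HOME being exported by Spark or Hadoop config scripts
grep -rn SPARK_HOME /usr/local/spark-1.6.1-bin-hadoop2.6/conf/ 2>/dev/null
grep -rn SPARK_HOME /usr/local/hadoop-2.6.3/etc/hadoop/ 2>/dev/null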
However, the Java Pi example runs successfully:
spark-submit --master yarn --deploy-mode cluster --class org.apache.spark.examples.SparkPi /usr/local/spark-1.6.1-bin-hadoop2.6/lib/spark-examples-1.6.1-hadoop2.6.0.jar 100
Can anyone explain why YARN loads the wrong version of Spark?
Update:
The problem was actually that my PATH was messed up. After I cleaned up the PATH and made Spark 2.0 the default version for spark-submit, everything works fine now.
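For anyone hitting the same thing, this is roughly how the PATH problem can be spotted and fixed; the concrete paths below are illustrative, matching the install layout above:
# see which spark-submit wins on the current PATH
which -a spark-submit
# drop the 1.6 bin directory and put the 2.0 one first (illustrative paths)
export PATH=/usr/local/spark-2.0.0-bin-hadoop2.6/bin:$(echo "$PATH" | sed 's|/usr/local/spark-1.6.1-bin-hadoop2.6/bin:||')
hash -r            # clear bash's cached command locations
which spark-submit # should now point at the 2.0 install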
First, comment out the exports in your .bashrc and remove them from the environment - they are inconsistent: PYTHONPATH uses the Spark 1.6 libraries while SPARK_HOME points to Spark 2.0.
Then run the examples by calling spark-submit through its absolute path for each of the two versions - spark-submit sets SPARK_HOME based on its own location, so it should work for both versions.
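A minimal sketch of that workflow; the unset/submit steps are illustrative, with paths matching the install layout above:
# remove the conflicting exports from the current shell
unset SPARK_HOME
unset PYTHONPATH
# each spark-submit derives SPARK_HOME from its own location
/usr/local/spark-1.6.1-bin-hadoop2.6/bin/spark-submit --master yarn --deploy-mode client /usr/local/spark-1.6.1-bin-hadoop2.6/examples/src/main/python/pi.py 100
/usr/local/spark-2.0.0-bin-hadoop2.6/bin/spark-submit --master yarn --deploy-mode client /usr/local/spark-2.0.0-bin-hadoop2.6/examples/src/main/python/pi.py 100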