spark start-slave not connecting to master
I'm using Ubuntu 16 and trying to set up a Spark cluster on my LAN.
I have successfully configured a Spark master and connected a slave to it from the same machine, and I can see it at localhost:8080.
The problem starts when I try to connect from another machine. I configured passwordless SSH following the instructions here.
When I try to connect to the master with start-slave.sh spark://master:port as described here,
I get the error log below.
I tried reaching the master both by its local IP and by its local name (I can SSH into the master with and without a password, as both a regular user and as root).
I tried port 6066 and port 7077 with both of them.
I don't get an error message, but the new slave never shows up on the master's localhost:8080 page,
and I keep getting this error log:
Spark Command: /usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java -cp /usr/local/spark/conf/:/usr/local/spark/jars/* -Xmx1g org.apache.spark.deploy.worker.Worker --webui-port 8081 spark://latitude:6066
========================================
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
17/07/26 22:09:09 INFO Worker: Started daemon with process name: 20609@name-beckup-laptop
17/07/26 22:09:09 INFO SignalUtils: Registered signal handler for TERM
17/07/26 22:09:09 INFO SignalUtils: Registered signal handler for HUP
17/07/26 22:09:09 INFO SignalUtils: Registered signal handler for INT
17/07/26 22:09:09 WARN Utils: Your hostname, name-beckup-laptop resolves to a loopback address: 127.0.1.1; using 192.168.14.84 instead (on interface wlp2s0)
17/07/26 22:09:09 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to another address
17/07/26 22:09:09 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
17/07/26 22:09:09 INFO SecurityManager: Changing view acls to: name
17/07/26 22:09:09 INFO SecurityManager: Changing modify acls to: name
17/07/26 22:09:09 INFO SecurityManager: Changing view acls groups to:
17/07/26 22:09:09 INFO SecurityManager: Changing modify acls groups to:
17/07/26 22:09:09 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(name); groups with view permissions: Set(); users with modify permissions: Set(name); groups with modify permissions: Set()
17/07/26 22:09:09 INFO Utils: Successfully started service 'sparkWorker' on port 34777.
17/07/26 22:09:09 INFO Worker: Starting Spark worker 192.168.14.84:34777 with 4 cores, 14.6 GB RAM
17/07/26 22:09:09 INFO Worker: Running Spark version 2.2.0
17/07/26 22:09:09 INFO Worker: Spark home: /usr/local/spark
17/07/26 22:09:10 INFO Utils: Successfully started service 'WorkerUI' on port 8081.
17/07/26 22:09:10 INFO WorkerWebUI: Bound WorkerWebUI to 0.0.0.0, and started at http://192.168.14.84:8081
17/07/26 22:09:10 INFO Worker: Connecting to master latitude:6066...
17/07/26 22:09:10 WARN Worker: Failed to connect to master latitude:6066
org.apache.spark.SparkException: Exception thrown in awaitResult:
    at org.apache.spark.util.ThreadUtils$.awaitResult(ThreadUtils.scala:205)
    at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:75)
    at org.apache.spark.rpc.RpcEnv.setupEndpointRefByURI(RpcEnv.scala:100)
    at org.apache.spark.rpc.RpcEnv.setupEndpointRef(RpcEnv.scala:108)
    at org.apache.spark.deploy.worker.Worker$$anonfun$org$apache$spark$deploy$worker$Worker$$tryRegisterAllMasters$$anon.run(Worker.scala:241)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:748)
Caused by: java.io.IOException: Failed to connect to latitude/192.168.14.83:6066
    at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:232)
    at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:182)
    at org.apache.spark.rpc.netty.NettyRpcEnv.createClient(NettyRpcEnv.scala:197)
    at org.apache.spark.rpc.netty.Outbox$$anon.call(Outbox.scala:194)
    at org.apache.spark.rpc.netty.Outbox$$anon.call(Outbox.scala:190)
    ... 4 more
Caused by: io.netty.channel.AbstractChannel$AnnotatedConnectException: Connection refused: latitude/192.168.14.83:6066
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
    at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
    at io.netty.channel.socket.nio.NioSocketChannel.doFinishConnect(NioSocketChannel.java:257)
    at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:291)
    at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:631)
    at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:566)
    at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:480)
    at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:442)
    at io.netty.util.concurrent.SingleThreadEventExecutor.run(SingleThreadEventExecutor.java:131)
    at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:144)
    ... 1 more
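For reference, the trace bottoms out in a plain TCP "Connection refused" against latitude/192.168.14.83:6066, which can be checked outside of Spark, for example (a diagnostic sketch, assuming ss and nc are installed on the machines; it is not something from the original attempts above):

# On the master machine: list listeners on the standalone ports (7077 = worker RPC, 6066 = REST submission)
ss -tlnp | grep -E '7077|6066'
# From the worker machine: test raw reachability of the master host
nc -zv latitude 7077
nc -zv latitude 6066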
Thanks!
Found the problem!
You need to add a file under /conf/spark-env containing the following:
SPARK_MASTER_IP='<ip of master without port>'
Then run
start-master.sh -h <ip of master>:7077
and after that
start-slave.sh spark://<master ip>:7077
will work like a charm.
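For concreteness, a minimal sketch of that setup using the paths from the question (Spark home /usr/local/spark; the 192.168.14.83 address is taken from the worker's log, substitute your own master IP). In a standard Spark install this file is conf/spark-env.sh, created from conf/spark-env.sh.template:

# /usr/local/spark/conf/spark-env.sh on the master
SPARK_MASTER_IP='192.168.14.83'   # master's LAN IP, without a port

# Restart the master so it binds to the LAN address instead of the loopback 127.0.1.1
/usr/local/spark/sbin/stop-master.sh
/usr/local/spark/sbin/start-master.sh
# The master log should now report something like
#   Starting Spark master at spark://192.168.14.83:7077
# and on the other machine the worker can register with:
/usr/local/spark/sbin/start-slave.sh spark://192.168.14.83:7077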
I had the same problem, running spark/sbin/start-slave.sh on the master node.
hadoop@master:/opt/spark$ sudo ./sbin/start-slave.sh --master spark://master:7077
starting org.apache.spark.deploy.worker.Worker, logging to /opt/spark/logs/spark-root-org.apache.spark.deploy.worker.Worker-1-master.out
failed to launch: nice -n 0 /opt/spark/bin/spark-class org.apache.spark.deploy.worker.Worker --webui-port 8081 --master spark://master:7077
Options:
-c CORES, --cores CORES Number of cores to use
-m MEM, --memory MEM Amount of memory to use (e.g. 1000M, 2G)
-d DIR, --work-dir DIR Directory to run apps in (default: SPARK_HOME/work)
-i HOST, --ip IP Hostname to listen on (deprecated, please use --host or -h)
-h HOST, --host HOST Hostname to listen on
-p PORT, --port PORT Port to listen on (default: random)
--webui-port PORT Port for web UI (default: 8081)
--properties-file FILE Path to a custom Spark properties file.
Default is conf/spark-defaults.conf.
full log in /opt/spark/logs/spark-root-org.apache.spark.deploy.worker.Worker-1-master.out
I found my mistake: I shouldn't have used the --master keyword, and should just run the command
hadoop@master:/opt/spark$ sudo ./sbin/start-slave.sh spark://master:7077
I followed the steps of this tutorial:
https://phoenixnap.com/kb/install-spark-on-ubuntu
Also, my /opt/spark/conf/spark-env.sh is configured as follows:
SPARK_MASTER_HOST="master"
JAVA_HOME="/usr/lib/jvm/java-11-openjdk-amd64"
master is the hostname of the server, as specified in /etc/hosts.
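Putting this answer together, a minimal sketch of the whole sequence (the 192.168.14.83 address below is only a placeholder for illustration, not something from the tutorial):

# /etc/hosts on the master and on every worker (placeholder IP, adjust to your LAN):
#   192.168.14.83   master

# /opt/spark/conf/spark-env.sh, as listed above:
#   SPARK_MASTER_HOST="master"
#   JAVA_HOME="/usr/lib/jvm/java-11-openjdk-amd64"

# On the master node:
sudo /opt/spark/sbin/start-master.sh

# On each worker node, the master URL is a positional argument, not a --master flag:
sudo /opt/spark/sbin/start-slave.sh spark://master:7077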