NodeManager connects to ResourceManager, but DataNode does not connect to NameNode
I installed Hadoop 2.5.1 on CentOS 7.0.
(1) When I run an app on Hadoop, I get the message below. I suspect the path "/tmp/hadoop-yarn/staging/hadoop/.staging/job_1424775783787_0001/files" points to a compatibility problem.
If it is a compatibility problem, how can I fix it?
15/02/24 20:27:41 ERROR streaming.StreamJob: Error Launching job :
File /tmp/hadoop-yarn/staging/hadoop/.staging/job_1424775783787_0001/files/Formatter.sh could only be replicated to 0 nodes instead of minReplication (=1). There are 0 datanode(s) running and no node(s) are excluded in this operation.
(2) Port 9000 on the master is listening, but the DataNodes do not connect to the master, while the NodeManagers are alive.
So I can see the active nodes on the ResourceManager web UI (port 8088), but the NameNode web UI (port 50070) shows no live DataNodes.
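As a quick, Hadoop-independent check, it may help to verify from each worker (mccb-com66/67) that the master's NameNode ports are actually reachable. A minimal sketch; the hostnames and ports are taken from the configuration in this question:

```python
import socket

def port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Run on each worker; hostname/ports come from the cluster config:
# port_open("mccb-com65", 9000)   # NameNode RPC
# port_open("mccb-com65", 50070)  # NameNode web UI
```

If port 9000 is reachable from the workers but the DataNodes still do not register, the problem is more likely on the DataNode side (its logs or storage directories) than in the network.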
The configuration is as follows.
hosts file (/etc/hosts)
XXX.XXX.XXX.65 mccb-com65 #server
XXX.XXX.XXX.66 mccb-com66 #client01
XXX.XXX.XXX.67 mccb-com67 #client02
127.0.1.1 mccb-com65 (mccb-com66 / mccb-com67 on the respective machines)
127.0.0.1 localhost
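One thing worth double-checking (a common pitfall, not confirmed by the output below): if `127.0.1.1 mccb-com65` resolves the master's own hostname to loopback, daemons can end up binding or advertising an address that remote DataNodes cannot reach. A sketch of a cleaned-up /etc/hosts on the master, with the loopback alias removed, would be:

```
127.0.0.1       localhost
XXX.XXX.XXX.65  mccb-com65   #server
XXX.XXX.XXX.66  mccb-com66   #client01
XXX.XXX.XXX.67  mccb-com67   #client02
```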
core-site.xml
<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://XXX.XXX.XXX.65:9000</value>
</property>
</configuration>
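A side note on core-site.xml: `fs.default.name` is deprecated in Hadoop 2.x in favor of `fs.defaultFS` (the old key still works, but the new one is preferred). The equivalent property would be:

```xml
<property>
  <name>fs.defaultFS</name>
  <value>hdfs://XXX.XXX.XXX.65:9000</value>
</property>
```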
hdfs-site.xml
<configuration>
<property>
<name>dfs.name.dir</name>
<value>file:///home/hadoop/hdfs/hdfs/namenode</value>
<description>the path which save the file system image </description>
</property>
<property>
<name>dfs.data.dir</name>
<value>file:///home/hadoop/hdfs/hdfs/datanode</value>
<description>the path which the datanode save the block</description>
</property>
<property>
<name>dfs.http.address</name>
<value>0.0.0.0:50070</value>
</property>
<property>
<name>dfs.datanode.http.address</name>
<value>0.0.0.0:50075</value>
</property>
<property>
<name>dfs.datanode.ipc.address</name>
<value>0.0.0.0:50020</value>
</property>
<property>
<name>dfs.replication</name>
<value>3</value>
</property>
</configuration>
mapred-site.xml
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
<property>
<name>mapred.child.java.opts</name>
<value>-Xmx400m</value>
</property>
<property>
<name>mapreduce.jobhistory.address</name>
<value>0.0.0.0:10020</value>
</property>
<property>
<name>mapreduce.jobhistory.webapp.address</name>
<value>0.0.0.0:19888</value>
</property>
<property>
<name>mapred.system.dir</name>
<value>/home/hadoop/hdfs/hdfs/mapred/system</value>
<final>true</final>
</property>
<property>
<name>mapred.local.dir</name>
<value>/home/hadoop/hdfs/hdfs/mapred/local</value>
<final>true</final>
</property>
</configuration>
yarn-site.xml
<configuration>
<property>
<name>yarn.resourcemanager.resource-tracker.address</name>
<value>XXX.XXX.XXX.65:8031</value>
</property>
<property>
<name>yarn.resourcemanager.scheduler.address</name>
<value>XXX.XXX.XXX.65:8030</value>
</property>
<property>
<name>yarn.resourcemanager.address</name>
<value>XXX.XXX.XXX.65:8032</value>
</property>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.resourcemanager.webapp.address</name>
<value>XXX.XXX.XXX.65:8088</value>
</property>
<property>
<name>yarn.nodemanager.webapp.address</name>
<value>0.0.0.0:8042</value>
</property>
</configuration>
[root@mccb-com65 ~]# netstat -antlp
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
tcp        0      0 127.0.0.1:25            0.0.0.0:*               LISTEN      2868/master
tcp        0      0 0.0.0.0:3389            0.0.0.0:*               LISTEN      1746/xrdp
tcp        0      0 XXX.XXX.XXX.65:8030     0.0.0.0:*               LISTEN      10858/java
tcp        0      0 XXX.XXX.XXX.65:8031     0.0.0.0:*               LISTEN      10858/java
tcp        0      0 XXX.XXX.XXX.65:8032     0.0.0.0:*               LISTEN      10858/java
tcp        0      0 0.0.0.0:8033            0.0.0.0:*               LISTEN      10858/java
tcp        0      0 0.0.0.0:50885           0.0.0.0:*               LISTEN      2282/rpc.statd
tcp        0      0 XXX.XXX.XXX.65:9000     0.0.0.0:*               LISTEN      10470/java
tcp        0      0 0.0.0.0:50090           0.0.0.0:*               LISTEN      10684/java
tcp        0      0 0.0.0.0:111             0.0.0.0:*               LISTEN      1753/rpcbind
tcp        0      0 0.0.0.0:50070           0.0.0.0:*               LISTEN      10470/java
tcp        0      0 127.0.0.1:3350          0.0.0.0:*               LISTEN      1745/xrdp-sesman
tcp        0      0 0.0.0.0:22              0.0.0.0:*               LISTEN      1761/sshd
tcp        0      0 127.0.0.1:631           0.0.0.0:*               LISTEN      3278/cupsd
tcp        0      0 127.0.0.1:5911          0.0.0.0:*               LISTEN      3053/Xvnc
tcp        0      0 0.0.0.0:8088            0.0.0.0:*               LISTEN      10858/java
tcp        0      0 XXX.XXX.XXX.65:8031     XXX.XXX.XXX.67:44914    ESTABLISHED 10858/java
tcp        0      0 XXX.XXX.XXX.65:42505    XXX.XXX.XXX.65:9000     TIME_WAIT   -
tcp        0      0 127.0.0.1:5911          127.0.0.1:50271         ESTABLISHED 3053/Xvnc
tcp        0      0 XXX.XXX.XXX.65:3389     XXX.XXX.XXX.96:52951    ESTABLISHED 1746/xrdp
tcp        0      0 127.0.0.1:50271         127.0.0.1:5911          ESTABLISHED 1746/xrdp
tcp        0      0 XXX.XXX.XXX.65:8031     XXX.XXX.XXX.66:46816    ESTABLISHED 10858/java
tcp6       0      0 :::44331                :::*                    LISTEN      2282/rpc.statd
tcp6       0      0 :::111                  :::*                    LISTEN      1753/rpcbind
tcp6       0      0 :::22                   :::*                    LISTEN      1761/sshd
The following steps should resolve the problem. Note, however, that you may lose data.
Stop Hadoop:
- sbin/stop-dfs.sh
- sbin/stop-yarn.sh
Delete the NameNode and DataNode storage directories (the local paths configured in hdfs-site.xml; the datanode directory must be removed on every worker):
- rm -rf /home/hadoop/hdfs/hdfs/namenode
- rm -rf /home/hadoop/hdfs/hdfs/datanode
Format the NameNode:
- hdfs namenode -format
Start the NameNode, DataNodes, and YARN:
- sbin/start-dfs.sh
- sbin/start-yarn.sh
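Before wiping the directories, it may be worth checking whether the DataNodes refuse to register because of a clusterID mismatch between the NameNode's and a DataNode's storage directories, a common cause of "0 datanode(s) running" after a reformat. A small hypothetical helper (not a Hadoop tool) to read the ID from each VERSION file; the paths assume the dfs.name.dir / dfs.data.dir values from the question's hdfs-site.xml:

```python
def read_cluster_id(version_file):
    """Parse the clusterID entry from a Hadoop storage VERSION file."""
    with open(version_file) as f:
        for line in f:
            if line.startswith("clusterID="):
                return line.strip().split("=", 1)[1]
    return None  # no clusterID line found

# Run on the master and on each worker, then compare the two values:
# nn = read_cluster_id("/home/hadoop/hdfs/hdfs/namenode/current/VERSION")
# dn = read_cluster_id("/home/hadoop/hdfs/hdfs/datanode/current/VERSION")
# If nn != dn, the DataNode will refuse to register with the NameNode.
```

If the IDs differ, deleting only the datanode directory and restarting (so the DataNode re-registers with the current clusterID) loses less data than the full reset above.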