HDFS NFS startup error: "ERROR mount.MountdBase: Failed to start the TCP server...ChannelException: Failed to bind..."
Trying to use/start HDFS NFS after following the docs (did not start the hadoop portmap service, since the OS is not SLES 11 or RHEL 6.2), but getting an error when running hdfs nfs3 to start the NFS service:
[root@HW02 ~]#
[root@HW02 ~]#
[root@HW02 ~]# cat /etc/os-release
NAME="CentOS Linux"
VERSION="7 (Core)"
ID="centos"
ID_LIKE="rhel fedora"
VERSION_ID="7"
PRETTY_NAME="CentOS Linux 7 (Core)"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:centos:centos:7"
HOME_URL="https://www.centos.org/"
BUG_REPORT_URL="https://bugs.centos.org/"
CENTOS_MANTISBT_PROJECT="CentOS-7"
CENTOS_MANTISBT_PROJECT_VERSION="7"
REDHAT_SUPPORT_PRODUCT="centos"
REDHAT_SUPPORT_PRODUCT_VERSION="7"
[root@HW02 ~]#
[root@HW02 ~]#
[root@HW02 ~]# service nfs status
Redirecting to /bin/systemctl status nfs.service
Unit nfs.service could not be found.
[root@HW02 ~]#
[root@HW02 ~]#
[root@HW02 ~]# service nfs stop
Redirecting to /bin/systemctl stop nfs.service
Failed to stop nfs.service: Unit nfs.service not loaded.
[root@HW02 ~]#
[root@HW02 ~]#
[root@HW02 ~]# service rpcbind status
Redirecting to /bin/systemctl status rpcbind.service
● rpcbind.service - RPC bind service
Loaded: loaded (/usr/lib/systemd/system/rpcbind.service; enabled; vendor preset: enabled)
Active: active (running) since Tue 2019-07-23 13:48:54 HST; 28s ago
Process: 27337 ExecStart=/sbin/rpcbind -w $RPCBIND_ARGS (code=exited, status=0/SUCCESS)
Main PID: 27338 (rpcbind)
CGroup: /system.slice/rpcbind.service
└─27338 /sbin/rpcbind -w
Jul 23 13:48:54 HW02.ucera.local systemd[1]: Starting RPC bind service...
Jul 23 13:48:54 HW02.ucera.local systemd[1]: Started RPC bind service.
[root@HW02 ~]#
[root@HW02 ~]#
[root@HW02 ~]# hdfs nfs3
19/07/23 13:49:33 INFO nfs3.Nfs3Base: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting Nfs3
STARTUP_MSG: host = HW02.ucera.local/172.18.4.47
STARTUP_MSG: args = []
STARTUP_MSG: version = 3.1.1.3.1.0.0-78
STARTUP_MSG: classpath = /usr/hdp/3.1.0.0-78/hadoop/conf:/usr/hdp/3.1.0.0-78/hadoop/lib/jersey-server-1.19.jar:/usr/hdp/3.1.0.0-78/hadoop/lib/ranger-hdfs-plugin-shim-1.2.0.3.1.0.0-78.jar:
...
<a bunch of other jars>
...
STARTUP_MSG: build = git@github.com:hortonworks/hadoop.git -r e4f82af51faec922b4804d0232a637422ec29e64; compiled by 'jenkins' on 2018-12-06T12:26Z
STARTUP_MSG: java = 1.8.0_112
************************************************************/
19/07/23 13:49:33 INFO nfs3.Nfs3Base: registered UNIX signal handlers for [TERM, HUP, INT]
19/07/23 13:49:33 INFO impl.MetricsConfig: Loaded properties from hadoop-metrics2.properties
19/07/23 13:49:33 INFO impl.MetricsSystemImpl: Scheduled Metric snapshot period at 10 second(s).
19/07/23 13:49:33 INFO impl.MetricsSystemImpl: Nfs3 metrics system started
19/07/23 13:49:33 INFO oncrpc.RpcProgram: Will accept client connections from unprivileged ports
19/07/23 13:49:33 INFO security.ShellBasedIdMapping: Not doing static UID/GID mapping because '/etc/nfs.map' does not exist.
19/07/23 13:49:33 INFO nfs3.WriteManager: Stream timeout is 600000ms.
19/07/23 13:49:33 INFO nfs3.WriteManager: Maximum open streams is 256
19/07/23 13:49:33 INFO nfs3.OpenFileCtxCache: Maximum open streams is 256
19/07/23 13:49:34 INFO nfs3.DFSClientCache: Added export: / FileSystem URI: / with namenodeId: -1408097406
19/07/23 13:49:34 INFO nfs3.RpcProgramNfs3: Configured HDFS superuser is
19/07/23 13:49:34 INFO nfs3.RpcProgramNfs3: Delete current dump directory /tmp/.hdfs-nfs
19/07/23 13:49:34 INFO nfs3.RpcProgramNfs3: Create new dump directory /tmp/.hdfs-nfs
19/07/23 13:49:34 INFO nfs3.Nfs3Base: NFS server port set to: 2049
19/07/23 13:49:34 INFO oncrpc.RpcProgram: Will accept client connections from unprivileged ports
19/07/23 13:49:34 INFO mount.RpcProgramMountd: FS:hdfs adding export Path:/ with URI: hdfs://hw01.ucera.local:8020/
19/07/23 13:49:34 INFO oncrpc.SimpleUdpServer: Started listening to UDP requests at port 4242 for Rpc program: mountd at localhost:4242 with workerCount 1
19/07/23 13:49:34 ERROR mount.MountdBase: Failed to start the TCP server.
org.jboss.netty.channel.ChannelException: Failed to bind to: 0.0.0.0/0.0.0.0:4242
at org.jboss.netty.bootstrap.ServerBootstrap.bind(ServerBootstrap.java:272)
at org.apache.hadoop.oncrpc.SimpleTcpServer.run(SimpleTcpServer.java:89)
at org.apache.hadoop.mount.MountdBase.startTCPServer(MountdBase.java:83)
at org.apache.hadoop.mount.MountdBase.start(MountdBase.java:98)
at org.apache.hadoop.hdfs.nfs.nfs3.Nfs3.startServiceInternal(Nfs3.java:56)
at org.apache.hadoop.hdfs.nfs.nfs3.Nfs3.startService(Nfs3.java:69)
at org.apache.hadoop.hdfs.nfs.nfs3.Nfs3.main(Nfs3.java:79)
Caused by: java.net.BindException: Address already in use
at sun.nio.ch.Net.bind0(Native Method)
at sun.nio.ch.Net.bind(Net.java:433)
at sun.nio.ch.Net.bind(Net.java:425)
...
...
19/07/23 13:49:34 INFO util.ExitUtil: Exiting with status 1: org.jboss.netty.channel.ChannelException: Failed to bind to: 0.0.0.0/0.0.0.0:4242
19/07/23 13:49:34 INFO nfs3.Nfs3Base: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down Nfs3 at HW02.ucera.local/172.18.4.47
************************************************************/
Not sure how to interpret any of the errors seen here (and I have not installed any packages like nfs-utils, assuming Ambari installs whatever is needed when the cluster is initially set up).
Any debugging suggestions or solutions for this?
** UPDATE:
Looking at the error more closely, I can see
Caused by: java.net.BindException: Address already in use
Checking what is already using that port, we see...
[root@HW02 ~]# netstat -ltnp | grep 4242
tcp 0 0 0.0.0.0:4242 0.0.0.0:* LISTEN 98067/jsvc.exec
The jsvc.exec process appears to be related to running Java applications. Given that Hadoop runs on Java, I assume simply killing the process is a bad idea. Should it not be on this port at all (since it interferes with the NFS gateway)? Not sure what to do here.
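One thing that can help before deciding whether to kill anything is to look at the full command line and binary of whatever owns the port (just a generic sketch; 98067 is simply the PID netstat reported above, substitute whatever PID you actually see):
ps -fwwp 98067          # full, unwrapped command line of the process holding port 4242
ls -l /proc/98067/exe   # which binary it actually is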
TLDR: an NFS gateway service was already running (apparently by default), and I think the process blocking the hadoop nfs3 service from starting (jsvc.exec) was (I am assuming) part of that already-running service.
What made me suspect this is that when shutting down the cluster, that service stopped as well, plus the fact that it was holding the NFS ports I needed. The way I confirmed it was simply by following the verification steps in the docs and seeing that my output was similar to what is expected:
[root@HW02 ~]# rpcinfo -p hw02
program vers proto port service
100000 4 tcp 111 portmapper
100000 3 tcp 111 portmapper
100000 2 tcp 111 portmapper
100000 4 udp 111 portmapper
100000 3 udp 111 portmapper
100000 2 udp 111 portmapper
100005 1 udp 4242 mountd
100005 2 udp 4242 mountd
100005 3 udp 4242 mountd
100005 1 tcp 4242 mountd
100005 2 tcp 4242 mountd
100005 3 tcp 4242 mountd
100003 3 tcp 2049 nfs
[root@HW02 ~]# showmount -e hw02
Export list for hw02:
/ *
Another thing that told me the jsvc process was part of an already-running HDFS NFS service was checking the process info...
[root@HW02 ~]# ps -feww | grep jsvc
root 61106 59083 0 14:27 pts/2 00:00:00 grep --color=auto jsvc
root 163179 1 0 12:14 ? 00:00:00 jsvc.exec -Dproc_nfs3 -outfile /var/log/hadoop/root/hadoop-hdfs-root-nfs3-HW02.ucera.local.out -errfile /var/log/hadoop/root/privileged-root-nfs3-HW02.ucera.local.err -pidfile /var/run/hadoop/root/hadoop-hdfs-root-nfs3.pid -nodetach -user hdfs -cp /usr/hdp/3.1.0.0-78/hadoop/conf:...
...
hdfs 163193 163179 0 12:14 ? 00:00:17 jsvc.exec -Dproc_nfs3 -outfile /var/log/hadoop/root/hadoop-hdfs-root-nfs3-HW02.ucera.local.out -errfile /var/log/hadoop/root/privileged-root-nfs3-HW02.ucera.local.err -pidfile /var/run/hadoop/root/hadoop-hdfs-root-nfs3.pid -nodetach -user hdfs -cp /usr/hdp/3.1.0.0-78/hadoop/conf:...
and seeing jsvc.exec -Dproc_nfs3 ..., which gave me the hint that jsvc (apparently used for running Java apps on Linux) was being used to run the very nfs3 service I was trying to start.
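As an extra cross-check (a sketch; the pid file path is the one shown in the jsvc command line above), the PID recorded by the already-running gateway can be compared against the process bound to the mountd/nfs ports:
cat /var/run/hadoop/root/hadoop-hdfs-root-nfs3.pid   # PID written by the running nfs3 gateway
netstat -ltnp | egrep ':4242|:2049'                  # should be owned by that PID (or its jsvc parent)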
For anyone else running into this, note that I did not stop all of the services the docs tell you to stop (since this is CentOS 7):
[root@HW01 /]# service nfs status
Redirecting to /bin/systemctl status nfs.service
● nfs-server.service - NFS server and services
Loaded: loaded (/usr/lib/systemd/system/nfs-server.service; disabled; vendor preset: disabled)
Active: inactive (dead)
[root@HW01 /]# service rpcbind status
Redirecting to /bin/systemctl status rpcbind.service
● rpcbind.service - RPC bind service
Loaded: loaded (/usr/lib/systemd/system/rpcbind.service; enabled; vendor preset: enabled)
Active: active (running) since Fri 2019-07-19 15:17:02 HST; 6 days ago
Main PID: 2155 (rpcbind)
CGroup: /system.slice/rpcbind.service
└─2155 /sbin/rpcbind -w
Warning: Journal has been rotated since unit was started. Log output is incomplete or unavailable.
Also note that I did not follow any of the config file settings recommended in the docs (and some of the properties the docs point to could not even be found in the Ambari-managed HDFS configs), so if anyone can explain why this still worked for me, please do.
** UPDATE:
After talking with some people more experienced with HDP (v3.1) than me, the docs I linked to for setting up NFS for HDFS may not be entirely up to date (at least not when setting NFS up via the Ambari management UI, anyway)...
You can have a cluster node act as an NFS gateway by checking it as an NFS node in the Ambari host management UI:
The required configs can be set like so in the HDFS management UI...
You can confirm that the HDFS NFS gateway is running by looking at the Host > Summary > Components section in Ambari...
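Once the NFS Gateway component shows as started there, a quick end-to-end sanity check (a sketch following the standard HDFS NFS gateway docs; /mnt/hdfs is just an example mount point) is to mount the export from a client and list it:
mkdir -p /mnt/hdfs
mount -t nfs -o vers=3,proto=tcp,nolock hw02:/ /mnt/hdfs   # mount options per the HDFS NFS gateway docs
ls /mnt/hdfs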