HDFS (WebHDFS) PUT/CREATE fails - getaddrinfo failed
When I submit a request using the Python hdfs library, it fails with the following traceback:
Traceback (most recent call last):
File "C:\Users3041\AppData\Local\Programs\Python\Python37-32\lib\site-packages\urllib3\connection.py", line 160, in _new_conn
(self._dns_host, self.port), self.timeout, **extra_kw)
File "C:\Users3041\AppData\Local\Programs\Python\Python37-32\lib\site-packages\urllib3\util\connection.py", line 57, in create_connection
for res in socket.getaddrinfo(host, port, family, socket.SOCK_STREAM):
File "C:\Users3041\AppData\Local\Programs\Python\Python37-32\lib\socket.py", line 748, in getaddrinfo
for res in _socket.getaddrinfo(host, port, family, type, proto, flags):
socket.gaierror: [Errno 11001] getaddrinfo failed
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "hdfs_test.py", line 128, in <module>
sys.exit(main(sys.argv))
File "hdfs_test.py", line 108, in main
hdfs_stream.write(raw_bytes)
File "C:\Users3041\AppData\Local\Programs\Python\Python37-32\lib\site-packages\hdfs\util.py", line 104, in __exit__
raise self._err # pylint: disable=raising-bad-type
File "C:\Users3041\AppData\Local\Programs\Python\Python37-32\lib\site-packages\hdfs\util.py", line 76, in consumer
self._consumer(data)
File "C:\Users3041\AppData\Local\Programs\Python\Python37-32\lib\site-packages\hdfs\client.py", line 469, in consumer
data=(c.encode(encoding) for c in _data) if encoding else _data,
File "C:\Users3041\AppData\Local\Programs\Python\Python37-32\lib\site-packages\hdfs\client.py", line 214, in _request
**kwargs
File "C:\Users3041\AppData\Local\Programs\Python\Python37-32\lib\site-packages\requests\sessions.py", line 533, in request
resp = self.send(prep, **send_kwargs)
File "C:\Users3041\AppData\Local\Programs\Python\Python37-32\lib\site-packages\requests\sessions.py", line 646, in send
r = adapter.send(request, **kwargs)
File "C:\Users3041\AppData\Local\Programs\Python\Python37-32\lib\site-packages\requests\adapters.py", line 467, in send
low_conn.endheaders()
File "C:\Users3041\AppData\Local\Programs\Python\Python37-32\lib\http\client.py", line 1239, in endheaders
self._send_output(message_body, encode_chunked=encode_chunked)
File "C:\Users3041\AppData\Local\Programs\Python\Python37-32\lib\http\client.py", line 1026, in _send_output
self.send(msg)
File "C:\Users3041\AppData\Local\Programs\Python\Python37-32\lib\http\client.py", line 966, in send
self.connect()
File "C:\Users3041\AppData\Local\Programs\Python\Python37-32\lib\site-packages\urllib3\connection.py", line 183, in connect
conn = self._new_conn()
File "C:\Users3041\AppData\Local\Programs\Python\Python37-32\lib\site-packages\urllib3\connection.py", line 169, in _new_conn
self, "Failed to establish a new connection: %s" % e)
urllib3.exceptions.NewConnectionError: <urllib3.connection.HTTPConnection object at 0x0D9A51F0>: Failed to establish a new connection: [Errno 11001] getaddrinfo failed
If you test the WebHDFS CREATE command manually with curl, you can see that it redirects to a datanode:
curl -i -X PUT "http://localhost:50070/webhdfs/v1/tmp/test.txt?user.name=hadoop&op=CREATE"
HTTP/1.1 307 TEMPORARY_REDIRECT
Cache-Control: no-cache
Expires: Wed, 17 Jul 2019 17:16:00 GMT
Date: Wed, 17 Jul 2019 17:16:00 GMT
Pragma: no-cache
Expires: Wed, 17 Jul 2019 17:16:00 GMT
Date: Wed, 17 Jul 2019 17:16:00 GMT
Pragma: no-cache
Set-Cookie: hadoop.auth="u=hadoop&p=hadoop&t=simple&e=1563419760195&s=P2msnW447qKKXqfKcsEaTWSXnI0="; Path=/; Expires=Thu, 18-Jul-2019 03:16:00 GMT; HttpOnly
Location: http://datanode:50075/webhdfs/v1/tmp/test.txt?op=CREATE&user.name=hadoop&namenoderpcaddress=namenode:8020&overwrite=false
Content-Type: application/octet-stream
Content-Length: 0
Server: Jetty(6.1.26)
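Before changing any cluster configuration, you can confirm that the redirect itself is the problem by parsing the Location header and checking whether its host resolves from the client machine. A minimal sketch (the URL is the Location header from the response above; substitute whatever your own response returns):

```python
import socket
from urllib.parse import urlparse

# Location header returned by the namenode (copied from the curl output above)
location = ("http://datanode:50075/webhdfs/v1/tmp/test.txt"
            "?op=CREATE&user.name=hadoop"
            "&namenoderpcaddress=namenode:8020&overwrite=false")

host = urlparse(location).hostname   # "datanode"
port = urlparse(location).port       # 50075

try:
    socket.getaddrinfo(host, port)
    print(f"{host}:{port} resolves -- the client can follow the redirect")
except socket.gaierror as err:
    # This is the same [Errno 11001] the hdfs library hit above
    print(f"{host}:{port} does NOT resolve: {err}")
```

If the `except` branch fires, the failure is purely name resolution: the namenode answered fine, but the host it redirected you to is unknown to the machine running the script.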
The response from WebHDFS is trying to redirect you to a Hadoop datanode.
Note the Location in my response: http://5fbeb0287619:50075.
That is wrong! It is the ID of my Docker container, because no hostname was set.
- Make sure the datanode is reachable.
- Make sure the hostname is correct and can be resolved both from the namenode and from wherever you run the script.
In my case I was running in Docker, so I had to set the hostname explicitly in my docker-compose.yml. Once I did that, everything worked.
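For reference, the relevant part of such a docker-compose.yml might look like this; the service and image names are placeholders, and the `hostname` line is the one that matters:

```yaml
services:
  datanode:
    image: my-hadoop-datanode   # placeholder -- use your actual image
    hostname: datanode          # without this, Docker uses the container ID,
                                # which is what leaked into the Location header
    ports:
      - "50075:50075"
```

Note that the hostname also has to resolve on the machine running the Python script; when that machine sits outside the Docker network (as with my Windows client), one option is an entry such as `127.0.0.1 datanode` in its hosts file.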