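Opening a page with urllib2 handshake failure
I am just trying to open a web page: https://close5.com/home/
I keep getting different errors about my SSL. Here are some of my attempts and their errors. I am open to a fix that works for either framework. My end goal is to turn this page into a beautifulsoup4 soup.
Error: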
Traceback (most recent call last):
  File "test.py", line 54, in <module>
    print soup_maker_two(url)
  File "test.py", line 45, in soup_maker_two
    response = br.open(url)
  File "/usr/local/lib/python2.7/dist-packages/mechanize/_mechanize.py", line 203, in open
    return self._mech_open(url, data, timeout=timeout)
  File "/usr/local/lib/python2.7/dist-packages/mechanize/_mechanize.py", line 230, in _mech_open
    response = UserAgentBase.open(self, request, data)
  File "/usr/local/lib/python2.7/dist-packages/mechanize/_opener.py", line 193, in open
    response = urlopen(self, req, data)
  File "/usr/local/lib/python2.7/dist-packages/mechanize/_urllib2_fork.py", line 344, in _open
    '_open', req)
  File "/usr/local/lib/python2.7/dist-packages/mechanize/_urllib2_fork.py", line 332, in _call_chain
    result = func(*args)
  File "/usr/local/lib/python2.7/dist-packages/mechanize/_urllib2_fork.py", line 1170, in https_open
    return self.do_open(conn_factory, req)
  File "/usr/local/lib/python2.7/dist-packages/mechanize/_urllib2_fork.py", line 1118, in do_open
    raise URLError(err)
urllib2.URLError: <urlopen error [Errno 1] _ssl.c:510: error:14094410:SSL routines:SSL3_READ_BYTES:sslv3 alert handshake failure>
Code:
import mechanize
import ssl
from functools import wraps

def sslwrap(func):
    # Wrap ssl.wrap_socket so that every call is forced to use TLSv1.
    @wraps(func)
    def bar(*args, **kw):
        kw['ssl_version'] = ssl.PROTOCOL_TLSv1
        return func(*args, **kw)
    return bar

def soup_maker_two(url):
    br = mechanize.Browser()
    br.set_handle_robots(False)
    br.set_handle_equiv(False)
    br.set_handle_refresh(False)
    br.addheaders = [('User-agent', 'Firefox')]
    # Monkeypatch the ssl module before opening the URL.
    ssl.wrap_socket = sslwrap(ssl.wrap_socket)
    response = br.open(url)
    for f in br.forms():
        print f
    return 'hi'

if __name__ == "__main__":
    url = 'https://close5.com/'
    print soup_maker_two(url)
I also tried the following, which gave this error and code combination.
Second attempt
Error:
Traceback (most recent call last):
  File "test.py", line 29, in <module>
    print str(soup_maker(url))[0:1000]
  File "test.py", line 22, in soup_maker
    webpage = opener.open(req)
  File "/usr/lib/python2.7/urllib2.py", line 404, in open
    response = self._open(req, data)
  File "/usr/lib/python2.7/urllib2.py", line 422, in _open
    '_open', req)
  File "/usr/lib/python2.7/urllib2.py", line 382, in _call_chain
    result = func(*args)
  File "/usr/lib/python2.7/urllib2.py", line 1222, in https_open
    return self.do_open(httplib.HTTPSConnection, req)
  File "/usr/lib/python2.7/urllib2.py", line 1184, in do_open
    raise URLError(err)
urllib2.URLError: <urlopen error [Errno 1] _ssl.c:510: error:14077410:SSL routines:SSL23_GET_SERVER_HELLO:sslv3 alert handshake failure>
Code:
from bs4 import BeautifulSoup
import urllib2

def soup_maker(url):
    class RedirectHandler(urllib2.HTTPRedirectHandler):
        # Return the 302 response itself instead of following the redirect.
        def http_error_302(self, req, fp, code, msg, headers):
            result = urllib2.HTTPError(req.get_full_url(), code, msg, headers, fp)
            result.status = code
            return result

    hdr = {'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.11 (KHTML, like Gecko) Chrome/23.0.1271.64 Safari/537.11',
           'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
           'Accept-Charset': 'ISO-8859-1,utf-8;q=0.7,*;q=0.3',
           'Accept-Encoding': 'none',
           'Accept-Language': 'en-US,en;q=0.8',
           'Connection': 'keep-alive'}
    req = urllib2.Request(url, headers=hdr)
    opener = urllib2.build_opener(RedirectHandler())
    webpage = opener.open(req)
    soup = BeautifulSoup(webpage, "html5lib")
    return soup

if __name__ == "__main__":
    url = 'https://close5.com/home/'
    print str(soup_maker(url))[0:1000]
Edit 1
It was suggested that I use:
from bs4 import BeautifulSoup
def soup_maker(url):
    soup = BeautifulSoup(requests.get(url).content, "html5lib")
    return soup
if __name__ == "__main__":
    import requests
    url = 'https://close5.com/home/'
    print str(soup_maker(url))[:1000]
This code works for Padraic but not for me. I get the error:
Traceback (most recent call last):
  File "test_3.py", line 10, in <module>
    print str(soup_maker(url))[:1000]
  File "test_3.py", line 4, in soup_maker
    soup = BeautifulSoup(requests.get(url).content, "html5lib")
  File "/usr/lib/python2.7/dist-packages/requests/api.py", line 55, in get
    return request('get', url, **kwargs)
  File "/usr/lib/python2.7/dist-packages/requests/api.py", line 44, in request
    return session.request(method=method, url=url, **kwargs)
  File "/usr/lib/python2.7/dist-packages/requests/sessions.py", line 455, in request
    resp = self.send(prep, **send_kwargs)
  File "/usr/lib/python2.7/dist-packages/requests/sessions.py", line 558, in send
    r = adapter.send(request, **kwargs)
  File "/usr/lib/python2.7/dist-packages/requests/adapters.py", line 385, in send
    raise SSLError(e)
requests.exceptions.SSLError: [Errno 1] _ssl.c:510: error:14077410:SSL routines:SSL23_GET_SERVER_HELLO:sslv3 alert handshake failure
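This is the same error as before. I am guessing it may have to do with the fact that I am using Python 2.7.6, but I am not sure. I am also not sure how to use that information to solve my problem.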
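One way to test the Python 2.7.6 guess is to ask the ssl module what it supports. The sketch below is only a diagnostic, and the function name ssl_capabilities is made up for illustration. It relies on the fact that ssl.HAS_SNI and ssl.SSLContext were only backported to Python 2 in 2.7.9, so on older interpreters they are simply missing:

```python
import ssl
import sys


def ssl_capabilities():
    """Collect the facts about the local ssl module that matter for this error."""
    return {
        "python": sys.version.split()[0],
        # Version string of the OpenSSL library this Python was linked against.
        "openssl": ssl.OPENSSL_VERSION,
        # HAS_SNI does not exist on Python 2 before 2.7.9, hence getattr.
        "sni": getattr(ssl, "HAS_SNI", False),
        # SSLContext is likewise missing on Python 2 before 2.7.9.
        "ssl_context": hasattr(ssl, "SSLContext"),
    }


if __name__ == "__main__":
    for name, value in sorted(ssl_capabilities().items()):
        print("%s: %s" % (name, value))
```

If "sni" comes back False, the client cannot tell the server which host name it wants during the handshake, which is a plausible (though not proven here) cause of the handshake failure alert.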
Edit 2
The problem may be that my version of requests is wrong. Currently my pip freeze shows
requests==2.2.1
sudo pip install -U requests
returns
Downloading/unpacking requests from https://pypi.python.org/packages/2.7/r/requests/requests-2.9.1-py2.py3-none-any.whl#md5=58a444aaa02780ad01983f5f540e67b2
Downloading requests-2.9.1-py2.py3-none-any.whl (501kB): 501kB downloaded
Installing collected packages: requests
Found existing installation: requests 2.2.1
Not uninstalling requests at /usr/lib/python2.7/dist-packages, owned by OS
Successfully installed requests
Cleaning up..
sudo pip2 install -U requests
returns the same.
sudo pip uninstall requests
returns
Not uninstalling requests at /usr/lib/python2.7/dist-packages, owned by OS
I am running Ubuntu 14.04 with Python 2.7.6 and requests 2.2.1.
Edit 3
sudo pip install --ignore-installed requests
gives
Downloading/unpacking requests
Downloading requests-2.9.1-py2.py3-none-any.whl (501kB): 501kB downloaded
Installing collected packages: requests
Successfully installed requests
Cleaning up...
But sudo pip freeze
still gives requests==2.2.1
Edit 4
After a lot of suggestions I now have
$python
Python 2.7.6 (default, Jun 22 2015, 18:00:18)
[GCC 4.8.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import requests;requests.__version__
'2.9.1'
>>> url = 'https://close5.com/home/'
>>> from bs4 import BeautifulSoup
>>> soup = BeautifulSoup(requests.get(url).content, "html5lib")
/usr/local/lib/python2.7/dist-packages/requests/packages/urllib3/util/ssl_.py:315: SNIMissingWarning: An HTTPS request has been made, but the SNI (Subject Name Indication) extension to TLS is not available on this platform. This may cause the server to present an incorrect TLS certificate, which can cause validation failures. For more information, see https://urllib3.readthedocs.org/en/latest/security.html#snimissingwarning.
SNIMissingWarning
/usr/local/lib/python2.7/dist-packages/requests/packages/urllib3/util/ssl_.py:120: InsecurePlatformWarning: A true SSLContext object is not available. This prevents urllib3 from configuring SSL appropriately and may cause certain SSL connections to fail. For more information, see https://urllib3.readthedocs.org/en/latest/security.html#insecureplatformwarning.
InsecurePlatformWarning
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python2.7/dist-packages/requests/api.py", line 67, in get
    return request('get', url, params=params, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/requests/api.py", line 53, in request
    return session.request(method=method, url=url, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/requests/sessions.py", line 468, in request
    resp = self.send(prep, **send_kwargs)
  File "/usr/local/lib/python2.7/dist-packages/requests/sessions.py", line 576, in send
    r = adapter.send(request, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/requests/adapters.py", line 447, in send
    raise SSLError(e, request=request)
requests.exceptions.SSLError: [Errno 1] _ssl.c:510: error:14077410:SSL routines:SSL23_GET_SERVER_HELLO:sslv3 alert handshake failure
>>>
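I would suggest using requests:
import requests
from bs4 import BeautifulSoup

def soup_maker(url):
    # Fetch the page and hand the raw bytes to BeautifulSoup.
    soup = BeautifulSoup(requests.get(url).content)
    return soup

if __name__ == "__main__":
    url = 'https://close5.com/home/'
    print str(soup_maker(url))[:1000]
That gives you what you need: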
<html><head><title>Buy & Sell Locally with Close5</title><meta content="Close5 provides a safe and easy environment to list your items and sell them fast. Shop cars, home goods and Children's items locally with Close5" name="description"/><meta content="index, follow" name="robots"/><!--link(rel="canonical" href="https://www.close5.com")-->
<link href="https://www.close5.com/images/favicons/favicon-160x160.png" rel="image_src"/><meta content="index, follow" name="robots"/><!-- Facebook Item Tags--><meta content="Buy & Sell Locally with Close5" property="og:title"/><meta content="Close5" property="og:site_name"/><!-- meta(property="og:url" content='https://www.close5.com/images/app-icon.png')--><meta content="Close5 provides a safe and easy environment to list your items and sell them fast. Shop cars, home goods and Children's items locally with Close5" property="og:description"/><meta content="1470902013158927" property="fb:app_id"/><meta content="100000228184034" property="fb:
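Edit 1:
Your version of requests is too old; upgrade it with pip install -U requests
Edit 2:
You installed requests with apt-get, so you need to run:
apt-get remove python-requests
pip install --ignore-installed requests # pip install -U requests should also work
I would also remove pip completely, download get-pip.py, run python get-pip.py
and stick to using pip to install packages. Most likely pip did install requests successfully, and the newer version is simply further down your path.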
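To see which copy actually wins, it can help to ask the interpreter rather than pip. The tracebacks in the question show two candidate locations, /usr/lib/python2.7/dist-packages (apt) and /usr/local/lib/python2.7/dist-packages (pip). A minimal sketch, with describe_module as a made-up helper name:

```python
import importlib


def describe_module(name):
    """Return (version, file) for the copy of `name` this interpreter imports."""
    module = importlib.import_module(name)
    version = getattr(module, "__version__", "unknown")
    location = getattr(module, "__file__", "built-in")
    return version, location


if __name__ == "__main__":
    # "json" is used only so the sketch runs anywhere; substitute "requests".
    print(describe_module("json"))
```

If the reported file lives under /usr/lib/python2.7/dist-packages, the apt copy is still shadowing the one pip installed.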
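Edit 3:
You installed requests with apt-get, so you cannot remove it with pip; use apt-get remove python-requests as suggested in Edit 2.
Edit 4:
The link in the output explains what is happening and suggests:
pip install pyopenssl ndg-httpsclient pyasn1
You can also:
pip install requests[security]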
pip install requests[security]
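After installing, a quick check that the three packages are importable by the same interpreter that runs your script can rule out another path mix-up. This is only a sketch; missing_modules and SECURITY_MODULES are illustrative names, and the import names listed are the ones these packages expose (OpenSSL for pyopenssl, ndg.httpsclient for ndg-httpsclient):

```python
import importlib

# Import names (not PyPI names) of the three packages from the pip command above.
SECURITY_MODULES = ("OpenSSL", "ndg.httpsclient", "pyasn1")


def missing_modules(names=SECURITY_MODULES):
    """Return the import names from `names` that cannot be imported."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            missing.append(name)
    return missing


if __name__ == "__main__":
    print(missing_modules())
```

An empty list means all three are importable; anything listed still needs to be installed for this interpreter.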