URLError and ssl.CertificateError
I have the following code:
from urllib.request import urlopen
from urllib.error import HTTPError, URLError
from bs4 import BeautifulSoup

# target = "https://www.rolcruise.co.uk/cruise-detail/1158731-hawaii-round-trip-honolulu-2020-05-23"
target = "https://www.rolcruise.co.uk"

try:
    html = urlopen(target)
except HTTPError as e:
    # HTTPError is a subclass of URLError, so it has to be caught first
    print("You got a HTTP Error. Something wrong with the path.")
    print("Here is the error code: " + str(e.code))
    print("Here is the error reason: " + str(e.reason))
    print("Happy for the program to end here")
except URLError as e:
    print("You got a URL Error. Something wrong with the URL.")
    print("Here is the error reason: " + str(e.reason))
    print("Happy for the program to end here")
else:
    bs_obj = BeautifulSoup(html, features="lxml")
    print(bs_obj)
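If I deliberately mistype certain parts of the URL, the URLError handling works as expected, e.g. if I type "htps" instead of "https", "ww" instead of "www", or "u" instead of "uk".
For example:
target = "https://www.rolcruise.co.u"
However, if the host name ("rolcruise") or the "co" part of the URL is mistyped, the URLError handler does not fire, and I get an error message saying ssl.CertificateError instead.
For example:
target = "https://www.rolcruise.c.uk"
I don't understand why URLError doesn't cover every case where there is a typo somewhere in the URL.
Given that this does happen, how should I handle ssl.CertificateError?
Thanks for your help!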
Start by bringing ssl into your namespace:
import ssl
Then you can catch that kind of exception:
try:
    html = urlopen(target)
except HTTPError as e:
    print("You got a HTTP Error. Something wrong with the path.")
    print("Here is the error code: " + str(e.code))
    print("Here is the error reason: " + str(e.reason))
    print("Happy for the program to end here")
except URLError as e:
    print("You got a URL Error. Something wrong with the URL.")
    print("Here is the error reason: " + str(e.reason))
    print("Happy for the program to end here")
except ssl.CertificateError as e:
    # Do your stuff here... (the print is only a placeholder so the block is valid)
    print("You got a certificate error: " + str(e))
else:
    bs_obj = BeautifulSoup(html, features="lxml")
    print(bs_obj)
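As for why URLError does not cover this case: as far as I can tell, a typo like "c.uk" can still resolve to some server, so the connection succeeds and it is the certificate's host name check that fails. On Python versions before 3.7, ssl.CertificateError was a plain ValueError subclass, which urlopen does not wrap in URLError. From 3.7 on it is an alias of ssl.SSLCertVerificationError (an OSError subclass), so it typically arrives wrapped inside URLError as e.reason. Below is a minimal sketch that handles both situations; classify_fetch and its opener parameter are hypothetical names introduced here for illustration, not part of urllib.

```python
import ssl
from urllib.error import HTTPError, URLError
from urllib.request import urlopen


def classify_fetch(target, opener=urlopen):
    """Try to open target and return a (status, detail) tuple.

    opener is a hypothetical hook (defaulting to urlopen) so the
    function can be exercised without touching the network.
    """
    try:
        response = opener(target)
    except HTTPError as e:
        # HTTPError subclasses URLError, so it must come first.
        return ("http_error", str(e.code))
    except ssl.CertificateError as e:
        # Raised directly, as seen on older Python versions.
        return ("certificate_error", str(e))
    except URLError as e:
        # On newer versions the certificate failure may be wrapped here.
        if isinstance(e.reason, ssl.CertificateError):
            return ("certificate_error", str(e.reason))
        return ("url_error", str(e.reason))
    return ("ok", response)
```

Calling classify_fetch("https://www.rolcruise.c.uk") should then report a certificate error whichever way the exception is raised, while an unresolvable host is still reported as a URL error.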