urllib.error.HTTPError: HTTP Error 404: Not Found Python while scraping data from Metacritic
I am trying to scrape movie ratings from Metacritic. This is the part of the code that throws the error.
import urllib.request
from bs4 import BeautifulSoup

# text, chat and send_message are defined elsewhere in the program
text = text.replace("_", "-")
user_agent = 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.0.7) Gecko/2009021910 Firefox/3.0.7'
headers = {'User-Agent': user_agent}
URL = "http://metacritic.com/" + text
request = urllib.request.Request(URL, None, headers)
try:
    response = urllib.request.urlopen(request)
    data = response.read()
    soup = BeautifulSoup(data, 'html.parser')
    metacritic_rating = "Metascore: " + soup.find("span", class_="metascore_w").get_text()
    send_message(metacritic_rating, chat)
except:
    # the bare except swallows every error, including HTTPError
    pass
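I modified what I had written based on this:
I can't use requests.get()
because of this:
I am looking for a way to get the page's status code. When I used requests.get(),
I was able to find a way to do it.
I looked through all the answers to questions titled urllib.error.HTTPError: HTTP Error 404: Not Found Python,
but none of them helped.
Any help is appreciated.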
I think this is what you want:
import urllib.error
import urllib.request
from bs4 import BeautifulSoup

user_agent = 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.0.7) Gecko/2009021910 Firefox/3.0.7'
headers = {'User-Agent': user_agent}
URL = "http://metacritic.com/" + text
request = urllib.request.Request(URL, None, headers)
try:
    response = urllib.request.urlopen(request)
    data = response.read()
    soup = BeautifulSoup(data, 'html.parser')
    metacritic_rating = "Metascore: " + soup.find("span", class_="metascore_w").get_text()
    send_message(metacritic_rating, chat)
except urllib.error.HTTPError as err:
    # err.code holds the HTTP status code of the failed request
    print(err.code)
    if err.code == 403:
        pass  # <do something>
    else:
        pass
Output:
403
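If you need the status code in both the success and the failure case, the same idea can be wrapped in a small helper. This is only a sketch: `fetch_status` and its `opener` parameter are names made up for illustration (they are not part of urllib), and the `opener` argument exists only so the function can be exercised without touching the network.

```python
import urllib.error
import urllib.request


def fetch_status(url, user_agent, opener=urllib.request.urlopen):
    """Return (status_code, body); body is None when the server answers with an error.

    Hypothetical helper, not from the original post.
    """
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    try:
        with opener(request) as response:
            # On success the response object exposes the code as .status
            return response.status, response.read()
    except urllib.error.HTTPError as err:
        # urlopen raises HTTPError for 4xx/5xx; the code is on the exception
        return err.code, None
```

Calling `fetch_status(URL, user_agent)` with the values from the snippet above returns a tuple whose first element is the status code, so the caller can branch on 403, 404 and so on before trying to parse the body.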