Interruption of code after retrieving JSON query-result
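I have tried to solve this problem, but once the error is raised in Python I cannot (at least, I don't know how to) continue with the next step.
I am querying this website: https://w.wiki/msg
I adapt the query by changing the city in each loop iteration; the cities are in [listElements].
When I have a city like "Awaradam", the code breaks. (You can basically hardcode it instead of using listElements.)
Putting a sleep timer in did not solve the problem (I thought I was sending requests too often).
The error is the following: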
Traceback (most recent call last):
  File "C:/Users/xxx/PycharmProjects/pythonProject3/xxx.py", line 30, in <module>
    data = r.json()
  File "C:\ProgramData\Anaconda3\envs\pythonProject3\lib\site-packages\requests\models.py", line 898, in json
    return complexjson.loads(self.text, **kwargs)
  File "C:\ProgramData\Anaconda3\envs\pythonProject3\lib\json\__init__.py", line 357, in loads
    return _default_decoder.decode(s)
  File "C:\ProgramData\Anaconda3\envs\pythonProject3\lib\json\decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "C:\ProgramData\Anaconda3\envs\pythonProject3\lib\json\decoder.py", line 355, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
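The code (I edited it so that it is reproducible; in this form the code makes little sense, it simply breaks after a certain number of loop iterations):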
import requests
listPops = [[], []]
url = 'https://query.wikidata.org/sparql'
zaehler = -1
for i in range(100):
    zaehler = zaehler + 1
    #print(str(listElements[1][i]))
    #query = r"SELECT ?population WHERE { SERVICE wikibase:mwapi {bd:serviceParam mwapi:search '" + str(listElements[1][i]) + "' . bd:serviceParam mwapi:language 'en' . bd:serviceParam wikibase:api 'EntitySearch' . bd:serviceParam wikibase:endpoint 'www.wikidata.org' . bd:serviceParam wikibase:limit 1 . ?item wikibase:apiOutputItem mwapi:item .} ?item wdt:P1082 ?population} "
    query = """ SELECT ?population WHERE { SERVICE wikibase:mwapi {
        bd:serviceParam mwapi:search '""" + "Awaradam" + """'.
        bd:serviceParam mwapi:language "en" .
        bd:serviceParam wikibase:api "EntitySearch" .
        bd:serviceParam wikibase:endpoint "www.wikidata.org" .
        bd:serviceParam wikibase:limit 1 .
        ?item wikibase:apiOutputItem mwapi:item .
        }
        ?item wdt:P1082 ?population
        }
        """
    r = requests.get(url, params={'format': 'json', 'query': query}, timeout=10)
    #time.sleep(5)
    data = r.json()
    try:
        #population = r['results']['bindings'][0]['population']['value']
        if data['results']['bindings'][0]['population']['value']:
            population = data['results']['bindings'][0]['population']['value']
            print(str(zaehler) + ": " + "Population in " + str(listElements[1][i]) + ": " + f"{int(population):,}")
            listPops[0].append(str(listElements[1][i]))
            listPops[1].append(population)
    except:
        continue
print('Finished scrape.')
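The traceback means that the result you got back is not JSON. You can't make the remote server send JSON if it doesn't want to, but you can skip this item (or try a different query, if you can figure out one that works).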
    try:
        data = r.json()
    except json.decoder.JSONDecodeError as err:
        logging.warning('Not JSON: %s (result %r)', err, r.text)
        continue
You will have to import logging (or just print the warning) and import json if you haven't done so already.
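Your blanket try/except would also work (just move the try above the failing line), but it's really bad. See Why is "except: pass" a bad programming practice?. In practice, it hides the fact that Wikidata has no results for Awaradam, and you are running a fruitless loop that tries to fetch them again and again.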
Here is a quick and dirty fix:
import requests
import time
import json
listPops = [[], []]
listElements = [[], ['Bangalore', 'Hyderabad', 'Awaradam', 'Rawalpindi']]
url = 'https://query.wikidata.org/sparql'
for i, city in enumerate(listElements[1]):
    query = """ SELECT ?population WHERE { SERVICE wikibase:mwapi {
        bd:serviceParam mwapi:search '""" + city + """'.
        bd:serviceParam mwapi:language "en" .
        bd:serviceParam wikibase:api "EntitySearch" .
        bd:serviceParam wikibase:endpoint "www.wikidata.org" .
        bd:serviceParam wikibase:limit 1 .
        ?item wikibase:apiOutputItem mwapi:item .
        }
        ?item wdt:P1082 ?population
        }
        """
    r = requests.get(url, params={'format': 'json', 'query': query}, timeout=10)
    time.sleep(5)
    try:
        data = r.json()
    except json.decoder.JSONDecodeError as err:
        print('Not JSON: %s (result %r)' % (err, r.text))
        continue  # data was not set for this city, so skip it
    assert 'results' in data
    assert 'bindings' in data['results']
    if not data['results']['bindings']:
        #logging.warning('No results for %s', city)
        print('No results for', city)
        continue
    assert data['results']['bindings'], 'type %s %r' % (type(data['results']['bindings']), data['results']['bindings'])
    assert 'population' in data['results']['bindings'][0]
    assert 'value' in data['results']['bindings'][0]['population']
    if data['results']['bindings'][0]['population']['value']:
        population = data['results']['bindings'][0]['population']['value']
        print(f"{i}: Population in {city}: {int(population):,}")
        listPops[0].append(str(listElements[1][i]))
        listPops[1].append(population)
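The distinction drawn above (a reply that is not JSON at all versus valid JSON with no results) can also be isolated in a small helper, which makes it easier to see what a blanket except would hide. This is only a sketch: the function name extract_population and its status strings are made up for illustration and are not part of requests or of Wikidata, and it assumes the population value is a plain integer string, as the code above does.

```python
import json


def extract_population(body):
    """Classify a response body from the SPARQL endpoint.

    Hypothetical helper. Returns a (status, value) tuple where status is
    'not_json', 'unexpected', 'no_results' or 'ok'.
    """
    try:
        data = json.loads(body)
    except json.JSONDecodeError:
        # For example an HTML or plain-text error page instead of JSON.
        return ('not_json', None)
    if not isinstance(data, dict):
        # Valid JSON, but not the object layout the query normally returns.
        return ('unexpected', None)
    bindings = data.get('results', {}).get('bindings', [])
    if not bindings:
        # The query ran, but nothing matched (or no population is recorded).
        return ('no_results', None)
    value = bindings[0].get('population', {}).get('value')
    if value is None:
        return ('no_results', None)
    return ('ok', int(value))
```

Inside the loop this would be called as status, population = extract_population(r.text), so each city can be logged with the reason it was skipped.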
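As @tripleee mentioned, the problem is that your query does not return valid JSON (it returns an HTML message instead). The server should tell you the status of your query. To handle this, you should check the status of the request: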
r = requests.get(url, params={'format': 'json', 'query': query}, timeout=10)
if r.status_code != 200:
    handle_your_error(r)
For example, after running your example I got HTTP error 429: Too Many Requests.
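One possible way to react to a 429 is to wait and try again, using the Retry-After header when the server sends one. The sketch below is an assumption about how that could look, not something taken from the answers above: get_with_retry, max_attempts and default_wait are hypothetical names, and the request itself is passed in as a callable so the retry logic does not depend on the network.

```python
import time


def get_with_retry(send, max_attempts=3, default_wait=5.0, sleep=time.sleep):
    """Call send() until it returns a non-429 response or attempts run out.

    Hypothetical helper. send is a zero-argument callable returning an
    object with .status_code and .headers (for example a requests.Response).
    """
    response = None
    for attempt in range(max_attempts):
        response = send()
        if response.status_code != 429:
            return response
        if attempt == max_attempts - 1:
            break  # no point in sleeping after the final attempt
        # Retry-After may be missing or may not be a plain number of seconds
        # (it can also be an HTTP date); fall back to default_wait then.
        raw = response.headers.get('Retry-After', '')
        try:
            wait = float(raw)
        except ValueError:
            wait = default_wait
        sleep(max(0.0, wait))
    # Still rate-limited: hand the last 429 response back to the caller.
    return response
```

With requests this could be used as r = get_with_retry(lambda: requests.get(url, params={'format': 'json', 'query': query}, timeout=10)), followed by the same r.status_code check as above, since the last response may still be a 429.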