Parsing data via an API with Python. Bearer tokens

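I am trying to build a project in which a program automatically parses data files from an API, performs calculations, and returns some output to me. However, I have found that I cannot parse the data from the API. The website I am trying to access says the following: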

    Authentication and API Token
    USDA ESMIS provides the api-token to all users. Once you have created and confirmed your account, you can request the api-token by making a POST request to /user-token.

    A curl request to get the api-token:
    curl -X POST "https://usda.library.cornell.edu/user_token" -d '{"auth": {"email":"john.smith@example.com","password":"password"}}' -H "Content-Type: application/json"

    Authorization and API Requests
    To access the API, all requests need an api-token to be passed in the Authorization request header as a bearer token.

    Authorization: Bearer api-token
    You can also get the api-token below using the POST request to /user_token. Use the api-token with the Authorize feature on this page to test the API.

link to the full text.

I converted the curl request to:

import requests

headers = {
 'accept': '*/*',
 'Content-Type': 'multipart/form-data:',
}

data = '{auth[email]:aaaa,auth[password]:aaaaaa}'

response = requests.post('https://usda.library.cornell.edu/user_token', headers=headers, data=data)

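But I only get a 400 error. As far as I understand, the problem is with the Bearer token. I tried to find an FAQ or tutorial on this topic, but without success. Could you advise me, in general, on how to parse data via an API with Python using such tokens, and recommend some resources for learning about it?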

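I think the USDA page is a bit confusing. At the very least, the form from which you took data = '{auth[email]:aaaa,auth[password]:aaaaaa}' does not correspond to any curl or Python requests option that I know of. Not that I know most of those options, so I could be wrong, but it did not work for me either. What worked perfectly was the model shown in the code sample you quoted: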

   A curl request to get the api-token:
   curl -X POST "https://usda.library.cornell.edu/user_token" -d '{"auth": {"email":"john.smith@example.com","password":"password"}}' -H "Content-Type: application/json"

This corresponds to the Python snippet:

import requests

# json= serializes the dict and sets Content-Type: application/json
token = requests.post("https://usda.library.cornell.edu/user_token",
                      json={'auth':
                               {'email': 'john.smith@example.com',
                                'password': 'password'}})
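To see why the two attempts differ, it can help to build both requests without sending them. The sketch below is only an illustration: it uses the placeholder credentials from the curl example and assumes, without confirmation from the USDA documentation, that the 400 comes from the body not being valid JSON and the Content-Type not being application/json.

```python
import json
import requests

URL = "https://usda.library.cornell.edu/user_token"
# Placeholder credentials taken from the curl example in the docs.
payload = {"auth": {"email": "john.smith@example.com", "password": "password"}}

# Build the request the way the working snippet does, without sending it.
good = requests.Request("POST", URL, json=payload).prepare()

# Build the request the way the question does: a hand-written string body
# sent with a multipart Content-Type header.
bad = requests.Request(
    "POST", URL,
    headers={"accept": "*/*", "Content-Type": "multipart/form-data:"},
    data="{auth[email]:aaaa,auth[password]:aaaaaa}",
).prepare()

print(good.headers["Content-Type"])  # application/json
print(json.loads(good.body))         # same structure as the curl -d argument
print(bad.headers["Content-Type"])   # multipart/form-data:
print(bad.body)                      # the raw string, which is not valid JSON
```

The first request matches what the curl command sends; the second does not, so the server probably has nothing it can interpret as credentials.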

This produces a JSON response, which you can get from the json method of the response object:

bearer = token.json()['jwt']

bearer will be a string like 'eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJzdWIiOjg2MDN9.CWZPPpzCGj8qnOrHow8eJmDkzn5sSpSoFPffgq57Ayo', which is what you need to supply with your API requests.
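If you are curious what that string contains: the 'jwt' key suggests a JSON Web Token, which is three base64url-encoded segments joined by dots. A minimal sketch for inspecting the sample token above follows. decode_segment is a hypothetical helper, not part of requests, and the code only decodes the token; it does not verify the signature.

```python
import base64
import json

# Sample token copied from the text above; a real one comes from token.json()['jwt'].
bearer = ('eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.'
          'eyJzdWIiOjg2MDN9.'
          'CWZPPpzCGj8qnOrHow8eJmDkzn5sSpSoFPffgq57Ayo')

def decode_segment(segment):
    """Decode one base64url-encoded JWT segment into a dict (hypothetical helper)."""
    padded = segment + '=' * (-len(segment) % 4)  # restore stripped '=' padding
    return json.loads(base64.urlsafe_b64decode(padded))

header_part, payload_part, signature_part = bearer.split('.')
print(decode_segment(header_part))   # {'typ': 'JWT', 'alg': 'HS256'}
print(decode_segment(payload_part))  # {'sub': 8603}
```

You do not need to do any of this to use the API; the token is simply passed along as an opaque string.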

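It seems to me that the easiest way is simply to supply the header directly. (No doubt requests provides a mechanism for doing this, so if you search the documentation you will probably find it. I did not, because doing it by hand is so easy.)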

data = requests.get(
    'https://usda.library.cornell.edu/api/v1/publication/search?q=Avocado',
    headers={'Authorization': 'Bearer '+token.json()['jwt']})
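If you plan to make many requests, one mechanism requests does offer is a Session, whose default headers are attached to every call. The following is a rough sketch rather than anything the USDA page prescribes: make_session is a hypothetical helper, and 'example-token' stands in for the real value of token.json()['jwt'].

```python
import requests

def make_session(jwt):
    """Return a Session that sends the bearer token on every request (hypothetical helper)."""
    session = requests.Session()
    session.headers.update({'Authorization': 'Bearer ' + jwt})
    return session

# 'example-token' is a placeholder; in practice pass token.json()['jwt'].
session = make_session('example-token')

# Preparing (not sending) a request shows what the session would attach.
prepared = session.prepare_request(requests.Request(
    'GET',
    'https://usda.library.cornell.edu/api/v1/publication/search',
    params={'q': 'Avocado'}))  # params= builds the ?q=Avocado query string
print(prepared.url)
print(prepared.headers['Authorization'])
```

With a real token, session.get(...) with the same URL and params would send the request, and calling raise_for_status() on the response before json() surfaces errors such as the 400 from the question.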

Again, it is easiest to extract the information with the json method:

>>> import pprint
>>> pprint.pprint(data.json())
[{'agency': ['National Agricultural Statistics Service'],
  'agency_acronym': ['NASS'],
  'contact_email': ['nass@nass.usda.gov'],
  'contact_organization': ['National Agricultural Statistics Service'],
  'description': ['This special publication reports on the damage done to the '
                  'citrus, avocado, vegetable, and sugar cane crops in Florida '
                  'following Hurricane Cleo in 1964. '],
  'frequency': ['Seasonal'],
  'id': 'dv13zt23r',
  'identifier': ['SpecHurrDa'],
  'keywords': ['Citrus',
               'hurricanes',
               'weather',
               'sugarcane',
               'avocados',
               'vegetables'],
  'resource_type': ['Report'],
  'status': ['Inactive'],
  'subject': ['Crops and Crop Products:Sugar Crops',
              'Crops and Crop Products:Fruits',
              'Agriculture Economics and Management:Weather',
              'Crops and Crop Products:Vegetables and Pulses'],
  'subscribable': 'No',
  'title': ['Special Hurricane Damage Report: August 26-27, 1964']},
   
   ...