How to print out http-response header in Python
Today I actually needed to retrieve data from an http-header response. Since I had never done this before, and you can't find much about it on Google either, I decided to ask my question here.
So the actual question: how do I print the http-header response data in Python? I am working with the requests module in Python 3.5, but haven't found a way to do this yet.
Update: based on the OP's comment, only the response headers are needed. It is even simpler, as written in the following documentation of the requests module:
We can view the server's response headers using a Python dictionary:
>>> r.headers
{
'content-encoding': 'gzip',
'transfer-encoding': 'chunked',
'connection': 'close',
'server': 'nginx/1.0.4',
'x-runtime': '148ms',
'etag': '"e1ca502697e5c9317743dc078f67693f"',
'content-type': 'application/json'
}
And especially note the documentation's comment:
The dictionary is special, though: it's made just for HTTP headers. According to RFC 7230, HTTP Header names are case-insensitive.
So, we can access the headers using any capitalization we want:
and goes on to explain even more cleverness concerning RFC compliance.
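As a quick illustration of that case-insensitive access, here is a minimal sketch reusing the request from the documentation example:
import requests

r = requests.get('https://api.github.com/events')

# Per RFC 7230, header names are case-insensitive, so these all
# look up the same header:
print(r.headers['Content-Type'])
print(r.headers['content-type'])
print(r.headers['CONTENT-TYPE'])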
The documentation also shows how to get at the raw socket response via r.raw, provided stream=True was set on the initial request. For example:
>>> r = requests.get('https://api.github.com/events', stream=True)
>>> r.raw
<requests.packages.urllib3.response.HTTPResponse object at 0x101194810>
>>> r.raw.read(10)
b'\x1f\x8b\x08\x00\x00\x00\x00\x00\x00\x03'
But it also offers advice on how to do this in practice, by redirecting to a file and so on, using a different method:
Using Response.iter_content will handle a lot of what you would otherwise have to handle when using Response.raw directly. When streaming a download, the above is the preferred and recommended way to retrieve the content.
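A minimal sketch of that recommended pattern, streaming the download to a file (the filename is only an illustration):
import requests

r = requests.get('https://api.github.com/events', stream=True)
with open('events.json', 'wb') as fd:
    # iter_content yields the body in chunks and handles the decoding
    # that direct Response.raw access would leave to you
    for chunk in r.iter_content(chunk_size=8192):
        fd.write(chunk)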
How about something like this:
# Python 2 only: urllib2 was split into urllib.request and urllib.error in Python 3
import urllib2
req = urllib2.Request('http://www.google.com/')
res = urllib2.urlopen(req)
print res.info()  # prints all response headers
res.close()
If you are looking for something specific in the header, for example the Date header:
print res.info().get('Date')
(In Python 3, urllib2 was merged into urllib.request; the next answer shows the Python 3 equivalent.)
I am using the urllib module, with the following code:
from urllib import request

with request.urlopen(url, data) as f:
    print(f.getcode())  # HTTP response code
    print(f.info())     # all header info
    resp_body = f.read().decode('utf-8')  # response body
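If only one header is needed, the response object also exposes getheader(). A minimal sketch, assuming a plain GET against a placeholder URL:
from urllib import request

# When no data is passed, urlopen performs a plain GET
with request.urlopen('https://www.example.com') as f:
    print(f.status)                     # HTTP status code, same as getcode()
    print(f.getheader('Content-Type'))  # one header, or None if absent
    print(f.getheaders())               # all headers as (name, value) pairs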
Try using req.headers, that's all. You will get the response headers ;)
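A minimal sketch of what that looks like (the URL is only a placeholder):
import requests

req = requests.get('https://www.example.com')
print(req.headers)  # the server's response headers as a case-insensitive dict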
Easy:
import requests
site = "https://www.google.com"
headers = requests.get(site).headers
print(headers)
If you want something specific from it, look the header up by name:
print(headers["Content-Type"])
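Note that indexing a header the server did not send raises a KeyError, so .get() is the safer lookup. A small sketch:
# .get() returns a default instead of raising KeyError for absent headers
print(headers.get("Content-Encoding", "not present"))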
Here is how to get just the response headers using the requests library you mentioned (implementation in Python 3):
import requests
url = "https://www.google.com"
response = requests.head(url)
print(response.headers)  # prints the entire header as a dictionary
print(response.headers["Content-Length"])  # prints a specific header from the dictionary
Using .head() instead of .get() is important, since otherwise you will retrieve the whole file/page, as the rest of the answers here do.
If you want to retrieve a URL that requires authentication, you can replace the response line above with:
response = requests.head(url, auth=requests.auth.HTTPBasicAuth(username, password))
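One usage note, as a small sketch with a placeholder URL: requests.head() does not follow redirects by default, so allow_redirects=True may be needed for URLs that redirect:
import requests

# Placeholder URL; enable redirects explicitly, since HEAD does not follow them by default
url = "https://www.example.com/some-file"
response = requests.head(url, allow_redirects=True)
print(response.headers.get("Content-Length", "no Content-Length reported"))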
You can simply type:
print(response.headers)
Or my favorite:
print(requests.get('url').headers)
You can also use:
print(requests.get('url').content)
to get the response body.