HTTP response throws error gzip: invalid header
I don't understand what is going wrong here. ioutil.ReadAll should handle gzip transparently, just as it does for other URLs.
Reproducible with the URL: romboutskorea.co.kr
Error:
gzip: invalid header
Code:
resp, err := http.Get("http://" + url)
if err == nil {
    defer resp.Body.Close()
    if resp.StatusCode == http.StatusOK {
        fmt.Printf("HTTP Response Status : %v\n", resp.StatusCode)
        bodyBytes, err := ioutil.ReadAll(resp.Body)
        if err != nil {
            fmt.Printf("HTTP Response Read error. Url: %v\n", url)
            log.Fatal(err)
        }
        bodyString := string(bodyBytes)
        fmt.Printf("HTTP Response Content Length : %v\n", len(bodyString))
    }
}
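For context on the expectation above: Go's http.Transport only decompresses transparently when it added the Accept-Encoding: gzip header itself and the body really is gzip data. A minimal sketch against a well-behaved test server (the server and payload here are invented for illustration):

```go
package main

import (
	"bytes"
	"compress/gzip"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
)

// demo spins up a well-behaved test server that really gzips its body,
// fetches it with http.Get, and returns what ReadAll sees.
func demo() string {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		var buf bytes.Buffer
		zw := gzip.NewWriter(&buf)
		zw.Write([]byte("<html>hello</html>"))
		zw.Close()
		// The body really is gzip data, so the header is truthful.
		w.Header().Set("Content-Encoding", "gzip")
		w.Write(buf.Bytes())
	}))
	defer srv.Close()

	resp, err := http.Get(srv.URL)
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	// The transport asked for gzip itself, so it also decompresses
	// transparently: ReadAll returns the plain HTML.
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		panic(err)
	}
	return string(body)
}

func main() {
	fmt.Println(demo())
}
```

With a correct server this prints the plain HTML; the error in the question appears when the server's body does not match its Content-Encoding header.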
I get:
Content-Type: text/html; charset=euc-kr
Content-Encoding: gzip
Check the body content: it may be an HTTP response whose body was first compressed with gzip and then encoded with chunked transfer encoding. A NewChunkedReader would be needed, as in this example.
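To illustrate that suggestion, the sketch below builds a synthetic byte stream that is gzip-compressed first and then chunk-encoded, and undoes the layers in reverse order with httputil.NewChunkedReader and gzip.NewReader. Note that when you go through http.Get, the Go client already removes the chunking for you; a manual NewChunkedReader is only needed when reading a raw stream:

```go
package main

import (
	"bytes"
	"compress/gzip"
	"fmt"
	"io"
	"net/http/httputil"
)

// gzipBytes compresses s with gzip.
func gzipBytes(s string) []byte {
	var b bytes.Buffer
	zw := gzip.NewWriter(&b)
	zw.Write([]byte(s))
	zw.Close()
	return b.Bytes()
}

// chunk wraps raw bytes in HTTP/1.1 chunked transfer encoding:
// one data chunk followed by the terminating zero-length chunk.
func chunk(raw []byte) []byte {
	var b bytes.Buffer
	fmt.Fprintf(&b, "%x\r\n", len(raw))
	b.Write(raw)
	b.WriteString("\r\n0\r\n\r\n")
	return b.Bytes()
}

// decodeChunkedGzip undoes the layers in reverse order:
// first de-chunk, then gunzip.
func decodeChunkedGzip(wire []byte) (string, error) {
	zr, err := gzip.NewReader(httputil.NewChunkedReader(bytes.NewReader(wire)))
	if err != nil {
		return "", err
	}
	defer zr.Close()
	body, err := io.ReadAll(zr)
	if err != nil {
		return "", err
	}
	return string(body), nil
}

func main() {
	wire := chunk(gzipBytes("<html>ok</html>"))
	body, err := decodeChunkedGzip(wire)
	if err != nil {
		panic(err)
	}
	fmt.Println(body)
}
```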
The response from this site is faulty. It claims to be gzip-encoded, but it does not actually compress the content. The response looks like this:
HTTP/1.1 200 OK
...
Content-Encoding: gzip
...
Transfer-Encoding: chunked
Content-Type: text/html; charset=euc-kr
8000
<html>
<head>
...
The "8000" comes from the chunked transfer encoding, but the "..." is the beginning of the un-chunked response body. It is clearly not compressed, despite what the headers claim.
It looks like browsers simply work around this broken site by ignoring the faulty encoding specification. Browsers actually fix a lot of broken things, which doesn't exactly give providers an incentive to solve these problems :( But you can see that curl
fails as well:
$ curl -v --compressed http://romboutskorea.co.kr/main/index.php?
...
< HTTP/1.1 200 OK
< ...
< Content-Encoding: gzip
< ...
< Transfer-Encoding: chunked
< Content-Type: text/html; charset=euc-kr
<
* Error while processing content unencoding: invalid code lengths set
* Failed writing data
* Curl_http_done: called premature == 1
* Closing connection 0
curl: (23) Error while processing content unencoding: invalid code lengths set
The same goes for Python:
$ python3 -c 'import requests; requests.get("http://romboutskorea.co.kr/main/index.php?")'
...
requests.exceptions.ContentDecodingError: ('Received response with content-encoding: gzip, but failed to decode it.', error('Error -3 while decompressing data: incorrect header check'))