Nginx upstream prematurely closed connection while reading response header from upstream, for large requests

Question

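I am using nginx with a Node.js server to serve update requests. I get a gateway timeout when I request an update on large data. I see this error in the nginx error log: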

2016/04/07 00:46:04 [error] 28599#0: *1 upstream prematurely closed connection while reading response header from upstream, client: 10.0.2.77, server: gis.oneconcern.com, request: "GET /update_mbtiles/atlas19891018000415 HTTP/1.1", upstream: "http://127.0.0.1:7777/update_mbtiles/atlas19891018000415", host: "gis.oneconcern.com"

I googled the error and tried everything I could find, but I still get it.

My nginx conf has these proxy settings:

    ##
    # Proxy settings
    ##

    proxy_connect_timeout 1000;
    proxy_send_timeout 1000;
    proxy_read_timeout 1000;
    send_timeout 1000;

My server is configured like this:

server {
    listen 80;

    server_name gis.oneconcern.com;
    access_log /home/ubuntu/Tilelive-Server/logs/nginx_access.log;
    error_log /home/ubuntu/Tilelive-Server/logs/nginx_error.log;

    large_client_header_buffers 8 32k;

    location / {
        proxy_pass http://127.0.0.1:7777;
        proxy_redirect off;

        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection 'upgrade';
        proxy_set_header Host $http_host;
        proxy_cache_bypass $http_upgrade;
    }

    location /faults {
        proxy_pass http://127.0.0.1:8888;
        proxy_http_version 1.1;
        proxy_buffers 8 64k;
        proxy_buffer_size 128k;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection 'upgrade';
        proxy_set_header Host $host;
        proxy_cache_bypass $http_upgrade;
    }
}

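I am using a Node.js backend to serve the requests on an AWS server. The gateway error shows up only when the update takes a long time (about 3-4 minutes). I do not get any error for smaller updates. Any help will be highly appreciated.

Node.js code: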

app.get("/update_mbtiles/:earthquake", function(req, res) {
    var earthquake = req.params.earthquake;
    var command = spawn(__dirname + '/update_mbtiles.sh', [ earthquake, pg_details ]);
    //var output  = [];

    command.stdout.on('data', function(chunk) {
        // logger.info(chunk.toString());
        // output.push(chunk.toString());
    });

    command.stderr.on('data', function(chunk) {
        // logger.error(chunk.toString());
        // output.push(chunk.toString());
    });

    // The response is sent only once the script exits.
    command.on('close', function(code) {
        if (code === 0) {
            logger.info("updating mbtiles successful for " + earthquake);
            tilelive_reload_and_switch_source(earthquake);
            res.send("Completed updating!");
        }
        else {
            logger.error("Error occurred while updating " + earthquake);
            res.status(500);
            res.send("Error occurred while updating " + earthquake);
        }
    });
});

function tilelive_reload_and_switch_source(earthquake_unique_id) {
    tilelive.load('mbtiles:///' + __dirname + '/mbtiles/tipp_out_' + earthquake_unique_id + '.mbtiles', function(err, source) {
        if (err) {
            logger.error(err.message);
            throw err;
        }
        sources.set(earthquake_unique_id, source);
        logger.info('Updated source! New tiles!');
    });
}

Thanks.

Answer 1

I think the error from Nginx indicates that the connection was closed by your Node.js server (i.e. the "upstream"). How is Node.js configured?

Answer 2

I solved this by setting higher timeout values for the proxy:

location / {
    proxy_read_timeout 300s;
    proxy_connect_timeout 75s;
    proxy_pass http://localhost:3000;
}

Documentation: https://nginx.org/en/docs/http/ngx_http_proxy_module.html

Answer 3

You can increase the timeout in Node like this:

app.post('/slow/request', function(req, res) {
    req.connection.setTimeout(100000); // 100 seconds, for this request's socket only
    ...
});
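
The per-request call above can also be applied to the whole server. As far as I know, Node.js releases before 13 closed sockets after 2 minutes of inactivity by default, which would match a handler that stays silent for 3-4 minutes. A minimal sketch, assuming the plain `http` module; the 10-minute value and the function name are illustrative, not taken from the question:

```javascript
const http = require("http");

// Hypothetical value: allow up to 10 minutes of inactivity on a socket.
const SLOW_REQUEST_TIMEOUT_MS = 10 * 60 * 1000;

// Builds an HTTP server whose sockets all use the longer inactivity timeout.
function createServerWithLongTimeout(handler) {
  const server = http.createServer(handler);
  // Sets server.timeout, which applies to every socket the server accepts.
  server.setTimeout(SLOW_REQUEST_TIMEOUT_MS);
  return server;
}

const server = createServerWithLongTimeout((req, res) => {
  res.end("ok");
});

// Start listening when running for real, for example:
// server.listen(7777);
```

With Express, `app.listen(...)` returns the same kind of `http.Server` object, so `setTimeout` can be called on its return value. The nginx `proxy_read_timeout` still has to be at least as long, otherwise nginx gives up first.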

Answer 4

I had the same error for quite a while, and here is what fixed it for me.

I simply declared the following in the service I use:

[Unit]
Description=Your node service description
After=network.target

[Service]
Type=forking
PIDFile=/tmp/node_pid_name.pid
Restart=on-failure
KillSignal=SIGQUIT
WorkingDirectory=/path/to/node/app/root/directory
ExecStart=/path/to/node /path/to/server.js

[Install]
WantedBy=multi-user.target

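What should catch your attention here is "After=network.target". I spent days looking for fixes on the nginx side while the problem was just that. To be sure, stop the node service you have running, launch the ExecStart command directly, and try to reproduce the bug. If it does not show up, it means your service has a problem. At least this is how I found my answer.

Good luck to everyone!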

Answer 5

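I don't think this is your case, but I'll post it in case it helps anyone. I had the same issue, and the problem was that Node did not respond at all (I had a condition that did nothing when it failed, so no response was ever sent). So if increasing all the timeouts does not solve it, make sure every scenario sends a response.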
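
To illustrate the point, here is a minimal sketch of a handler in which every branch ends the response. The `lookup` argument and the handler name are hypothetical stand-ins for whatever operation can fail in your own code:

```javascript
// Hypothetical handler: `lookup` stands in for any operation that may
// return nothing or throw.
function handleUpdate(req, res, lookup) {
  try {
    const result = lookup(req.params.id);
    if (!result) {
      // Without this branch the request would hang until some timeout
      // fires and the proxy reports an upstream error.
      res.statusCode = 404;
      res.end("Not found");
      return;
    }
    res.statusCode = 200;
    res.end("Completed updating!");
  } catch (err) {
    // Unexpected failures still produce a response.
    res.statusCode = 500;
    res.end("Error occurred while updating");
  }
}
```

The handler only uses `statusCode` and `end`, so it works with a plain `http` response object as well as an Express one.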

Answer 6

I had the same problem and none of the solutions detailed here worked for me... First I had a 413 Entity Too Large error, so I updated my nginx.conf as follows:

http {
        # Increase request size
        client_max_body_size 10m;

        ##
        # Basic Settings
        ##

        sendfile on;
        tcp_nopush on;
        tcp_nodelay on;
        keepalive_timeout 65;
        types_hash_max_size 2048;
        # server_tokens off;

        # server_names_hash_bucket_size 64;
        # server_name_in_redirect off;

        include /etc/nginx/mime.types;
        default_type application/octet-stream;

        ##
        # SSL Settings
        ##

        ssl_protocols TLSv1 TLSv1.1 TLSv1.2; # Dropping SSLv3, ref: POODLE
        ssl_prefer_server_ciphers on;

        ##
        # Logging Settings
        ##

        access_log /var/log/nginx/access.log;
        error_log /var/log/nginx/error.log;

        ##
        # Gzip Settings
        ##

        gzip on;

        # gzip_vary on;
        # gzip_proxied any;
        # gzip_comp_level 6;
        # gzip_buffers 16 8k;
        # gzip_http_version 1.1;
        # gzip_types text/plain text/css application/json application/javascript text/xml application/xml application/xml+rss text/javascript;

        ##
        # Virtual Host Configs
        ##

        include /etc/nginx/conf.d/*.conf;
        include /etc/nginx/sites-enabled/*;

        ##
        # Proxy settings
        ##
        proxy_connect_timeout 1000;
        proxy_send_timeout 1000;
        proxy_read_timeout 1000;
        send_timeout 1000;
}

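So I only updated the http section, and now I get a 502 Bad Gateway error. When I display /var/log/nginx/error.log I get the famous "upstream prematurely closed connection while reading response header from upstream".

What is really mysterious to me is that the request works when I run it with virtualenv on my server and send the request to IP:8000/nameOfTheRequest.

Thanks for reading.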

Answer 7

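I got the same error; here is how I resolved it:

Downloaded the logs from AWS.
Reviewed the Nginx log: no additional details beyond the error above.
Reviewed the node.js log: found an AccessDenied AWS SDK permissions error.
Checked the S3 bucket that AWS was trying to read from.
Added the additional bucket, with read permission, to the correct server role.

Even though I was processing large files, no other errors appeared and no settings had to be changed once the missing S3 access was corrected.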

Answer 8

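I ran into this issue as well and found this post. Ultimately none of these answers solved my problem. Instead, I had to put in a rewrite rule to strip out the location /rt, because the backend my developers made was not expecting any additional path: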

┌─(william@wkstn18)──(Thu, 05 Nov 20)─┐
└─(~)──(16:13)─>wscat -c ws://WebsocketServerHostname/rt
error: Unexpected server response: 502

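Testing repeatedly with wscat gave a 502 response. The Nginx error log showed the same upstream error as above, but notice that the upstream string shows the GET request trying to reach localhost:12775/rt and not localhost:12775: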

 2020/11/05 22:13:32 [error] 10175#10175: *7 upstream prematurely closed
 connection while reading response header from upstream, client: WANIP,
 server: WebsocketServerHostname, request: "GET /rt/socket.io/?transport=websocket
 HTTP/1.1", upstream: "http://127.0.0.1:12775/rt/socket.io/?transport=websocket",
 host: "WebsocketServerHostname"

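The developers had not coded their websocket (listening on 12775) to expect /rt/socket.io but rather /socket.io/ (note: /socket.io/ appears to be just a way to specify the websocket transport). So rather than asking them to rewrite their socket code, I put in a rewrite rule to translate WebsocketServerHostname/rt to WebsocketServerHostname:12775, as follows: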

upstream websocket-rt {
        ip_hash;

        server 127.0.0.1:12775;
}

server {
        listen 80;
        server_name     WebsocketServerHostname;

        location /rt {
                proxy_http_version 1.1;

                #rewrite /rt/ out of all requests and proxy_pass to 12775
                rewrite /rt/(.*) /$1 break;

                proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
                proxy_set_header Host $host;

                proxy_pass http://websocket-rt;
                proxy_set_header Upgrade $http_upgrade;
                proxy_set_header Connection $connection_upgrade;
        }

}
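
To make the effect of such a rule concrete, here is a small JavaScript sketch of the substitution, assuming the intent is to forward whatever follows /rt/ to the backend. It only mimics the regex replacement; it is not how nginx is implemented, and the function name is made up for illustration:

```javascript
// Mimics a rule of the form `rewrite /rt/(.*) /$1 break;` on a URI path.
// nginx matches the regex against the path and, by default, carries the
// original query string over to the rewritten URI.
function stripRtPrefix(path) {
  return path.replace(/\/rt\/(.*)/, "/$1");
}

console.log(stripRtPrefix("/rt/socket.io/")); // prints "/socket.io/"
```

With this mapping the backend on port 12775 receives /socket.io/ instead of /rt/socket.io/, which is the path the answer says it expects.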

Answer 9

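Problem

The upstream server was timing out and I did not know what was happening.

If your server connects to a database, here is where to look first, before increasing the read or write timeouts:

Check that the server's connection to the database works and responds within a sane time, so that it is not what is delaying the server's response.

Make sure the connection state is not causing a cascading failure on your upstream.

Then you can move on to the read and write timeout configuration of the server and the proxy.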

Answer 10

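I stumbled upon the *145660 upstream prematurely closed connection while reading upstream Nginx error log entry when trying to download a 2GB file from a server proxied by Nginx. The message indicates that the "upstream" closed the connection, but it was actually related to the proxy_max_temp_file_size setting: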

Syntax: proxy_max_temp_file_size size;
Default: proxy_max_temp_file_size 1024m;
Context: http, server, location

When buffering of responses from the proxied server is enabled, and the whole response does not fit into the buffers set by the proxy_buffer_size and proxy_buffers directives, a part of the response can be saved to a temporary file. This directive sets the maximum size of the temporary file. The size of data written to the temporary file at a time is set by the proxy_temp_file_write_size directive.

The zero value disables buffering of responses to temporary files.

This restriction does not apply to responses that will be cached or stored on disk.

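Symptoms:

The download was forcibly stopped at about 1GB.
Nginx claimed that the upstream closed the connection, but without the proxy the server returned the full content.

Solution:

I increased proxy_max_temp_file_size for the proxied location to 4096m, and it started sending the full content.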

Answer 11

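I found this error in the logs of my AWS Elastic Beanstalk instance when I was trying to post about a million rows to my API.

I followed all the advice here to no avail.

What finally worked was increasing the size of my EC2 instance from 1 core and 1GB RAM to 4 cores and 8GB RAM.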

Answer 12

This error can also occur when your code gets into a loop. So investigate whether you have any (indirectly) self-referencing code causing this.


webserver

nginx

node.js