Firebase-Hosted Cloud Function retrying on any request that takes 60s, even when timeout is >60s
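I have a vanilla Cloud Function that takes 60 seconds and then returns status 200 with a simple JSON object. The timeout for the function is set to 150 seconds. When testing locally, and when calling the function via its cloudfunctions.net address, the function completes at 60 seconds and the 200 response and body are correctly delivered to the client. So far so good.
Here's the kicker: if I run that exact same function proxied through Firebase Hosting (via a "target" set inside firebase.json), then according to the Stackdriver logs the function is immediately restarted anywhere from 1 to 3 times, and when those executions finish the function is sometimes restarted again, ultimately returning a 503 timeout from Varnish.
This behavior is only consistently reproducible when the function is called on a domain that is proxied via Firebase Hosting. It only appears to happen when the function takes ~60 seconds or longer. It does not depend on the response code or response body returned.
You can see this behavior in a test function I have set up here: https://trellisconnect.com/testtimeout?sleepAmount=60&retCode=200
This behavior was originally identified in a function that was deployed via serverless. To rule out serverless, I created a test function that makes the behavior easy to test and verify, deployed it with regular Firebase Functions, called it from its cloudfunctions.net domain, and verified that I always got a correct response at 60 seconds. I then updated my firebase.json to add a new route that points to this function and was able to reproduce the problem.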
index.js
const functions = require('firebase-functions');

// Resolve after `ms` milliseconds.
function sleep(ms) {
  return new Promise(resolve => setTimeout(resolve, ms));
}

exports.testtimeout = functions.https.onRequest((req, res) => {
  // Query parameters arrive as strings, so convert them explicitly.
  const sleepAmount = Number(req.query.sleepAmount);
  const retCode = Number(req.query.retCode);
  console.log(`starting test sleeping ${sleepAmount}...`);
  return sleep(1000 * sleepAmount).then(() => {
    console.log(`Ending test func, returning ${retCode}`);
    return res.status(retCode).json({ message: 'Random Response' });
  });
});
firebase.json
{
  "hosting": {
    "public": "public",
    "ignore": ["firebase.json", "**/.*", "**/node_modules/**"],
    "rewrites": [
      {
        "source": "/testtimeout",
        "function": "testtimeout"
      }
    ]
  },
  "functions": {}
}
A correct/expected response (sleepAmount=2 seconds):
zgoldberg@zgblade:~$ time curl "https://trellisconnect.com/testtimeout?sleepAmount=2&retCode=200"
{"message":"Random Response"}
real 0m2.269s
user 0m0.024s
sys 0m0.000s
And an example of what happens when sleepAmount is set to 60 seconds:
zgoldberg@zgblade:~$ curl -v "https://trellisconnect.com/testtimeout?sleepAmount=60&retCode=200"
* Trying 151.101.65.195...
* TCP_NODELAY set
* Connected to trellisconnect.com (151.101.65.195) port 443 (#0)
* ALPN, offering h2
* ALPN, offering http/1.1
* successfully set certificate verify locations:
* CAfile: /etc/ssl/certs/ca-certificates.crt
CApath: /etc/ssl/certs
* TLSv1.3 (OUT), TLS handshake, Client hello (1):
* TLSv1.3 (IN), TLS handshake, Server hello (2):
* TLSv1.2 (IN), TLS handshake, Certificate (11):
* TLSv1.2 (IN), TLS handshake, Server key exchange (12):
* TLSv1.2 (IN), TLS handshake, Server finished (14):
* TLSv1.2 (OUT), TLS handshake, Client key exchange (16):
* TLSv1.2 (OUT), TLS change cipher, Client hello (1):
* TLSv1.2 (OUT), TLS handshake, Finished (20):
* TLSv1.2 (IN), TLS handshake, Finished (20):
* SSL connection using TLSv1.2 / ECDHE-RSA-AES128-GCM-SHA256
* ALPN, server accepted to use h2
* Server certificate:
* subject: CN=admin.cliquefood.com.br
* start date: Oct 16 20:44:55 2019 GMT
* expire date: Jan 14 20:44:55 2020 GMT
* subjectAltName: host "trellisconnect.com" matched cert's "trellisconnect.com"
* issuer: C=US; O=Let's Encrypt; CN=Let's Encrypt Authority X3
* SSL certificate verify ok.
* Using HTTP2, server supports multi-use
* Connection state changed (HTTP/2 confirmed)
* Copying HTTP/2 data in stream buffer to connection buffer after upgrade: len=0
* Using Stream ID: 1 (easy handle 0x563f92bdc580)
> GET /testtimeout?sleepAmount=60&retCode=200 HTTP/2
> Host: trellisconnect.com
> User-Agent: curl/7.58.0
> Accept: */*
>
* Connection state changed (MAX_CONCURRENT_STREAMS updated)!
< HTTP/2 503
< server: Varnish
< retry-after: 0
< content-type: text/html; charset=utf-8
< accept-ranges: bytes
< date: Fri, 08 Nov 2019 03:12:08 GMT
< x-served-by: cache-bur17523-BUR
< x-cache: MISS
< x-cache-hits: 0
< x-timer: S1573182544.115433,VS0,VE184552
< content-length: 449
<
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html>
<head>
<title>503 first byte timeout</title>
</head>
<body>
<h1>Error 503 first byte timeout</h1>
<p>first byte timeout</p>
<h3>Guru Mediation:</h3>
<p>Details: cache-bur17523-BUR 1573182729 2301023220</p>
<hr>
<p>Varnish cache server</p>
</body>
</html>
* Connection #0 to host trellisconnect.com left intact
real 3m3.763s
user 0m0.024s
sys 0m0.031s
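One way to read the curl output above: the `x-timer` response header appears to follow the Fastly convention, in which `S` is the request start time as a Unix timestamp and `VE` is the number of milliseconds that had elapsed when the edge finished with the request. That interpretation is an assumption, and `parseXTimer` below is a hypothetical helper, not part of any library. Under that assumption, a minimal sketch suggests the edge spent about 185 seconds on this request, which is roughly three 60-second attempts and lines up with the `real 3m3.763s` wall-clock time.

```javascript
// Hypothetical helper: parse an x-timer header such as
// "S1573182544.115433,VS0,VE184552".
// Assumes the Fastly-style layout: S = request start (Unix seconds),
// VS / VE = milliseconds from start at which edge processing began / ended.
function parseXTimer(header) {
  const fields = {};
  for (const part of header.split(',')) {
    const match = /^([A-Z]+)([\d.]+)$/.exec(part.trim());
    if (match) fields[match[1]] = Number(match[2]);
  }
  return {
    startEpochSeconds: fields.S,
    elapsedMs: fields.VE - fields.VS,
  };
}

// Header value copied from the 503 response in the transcript.
const timing = parseXTimer('S1573182544.115433,VS0,VE184552');
const HOSTING_TIMEOUT_MS = 60 * 1000; // 60-second window observed in this thread

console.log(`elapsed at edge: ${(timing.elapsedMs / 1000).toFixed(1)} s`);
console.log(`60 s windows used: ${(timing.elapsedMs / HOSTING_TIMEOUT_MS).toFixed(2)}`);
```

If that reading is right, the 503 is what the client sees only after the proxy has already given up on several back-to-back attempts, not after a single 60-second wait.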
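Here's the crazy part: check the Stackdriver logs and notice how the function finishes in 60 seconds, and almost immediately afterwards three more executions are started...
Notice that the original call comes in at 19:09:04.235 and ends at 19:10:04.428, almost exactly 60 seconds later. Almost exactly 500 ms later, at 19:10:05.925, the function is restarted. I promise you I am not hitting my curl command again 0.5 seconds after the initial response. None of the subsequent executions of the function here were generated by me; they all seem to be phantom retries?
https://i.imgur.com/WDY17pw.png
(EDIT: I don't have the 10 reputation needed to post the actual image, so there is only the link above.)
Any thoughts or help are much appreciated.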
From Firebase Hosting: Serving Dynamic Content with Cloud Functions for Firebase:
Note: Firebase Hosting is subject to a 60-second request timeout. Even if you configure your HTTP function with a longer request timeout, you'll still receive an HTTP status code 504 (request timeout) if your function requires more than 60 seconds to run. To support dynamic content that requires longer compute time, consider using an App Engine flexible environment.
In short, your use case is unfortunately not supported, because the CDN/Hosting instance simply assumes the connection was lost and retries.
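Given that limit, one possible mitigation is for the function to answer on its own before the 60-second mark, so the proxy never has a reason to retry. The following is only a sketch under stated assumptions: `withDeadline`, `HOSTING_LIMIT_MS`, and the 5-second `SAFETY_MARGIN_MS` are hypothetical names and values chosen for illustration, not part of firebase-functions or Firebase Hosting.

```javascript
// Hypothetical helper, not part of any Firebase library.
const HOSTING_LIMIT_MS = 60 * 1000; // documented Firebase Hosting request timeout
const SAFETY_MARGIN_MS = 5 * 1000;  // assumed margin; tune for your own latency

// Race `work` (a promise) against a timer. Resolves with
// { timedOut: false, value } if the work finishes first, or
// { timedOut: true } if the deadline passes first.
function withDeadline(work, deadlineMs = HOSTING_LIMIT_MS - SAFETY_MARGIN_MS) {
  let timer;
  const deadline = new Promise((resolve) => {
    timer = setTimeout(() => resolve({ timedOut: true }), deadlineMs);
  });
  const done = work.then((value) => ({ timedOut: false, value }));
  // Clear the timer either way so it does not keep the process alive.
  return Promise.race([done, deadline]).finally(() => clearTimeout(timer));
}
```

Inside an HTTP handler, the result could decide between sending the real response and sending an early "still working" status. Once a response has been sent, the remaining work is not guaranteed to finish, so anything that truly needs more than 60 seconds would still have to be called on its cloudfunctions.net URL or moved to another backend, as the quoted note suggests.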