Looping through thousands of records using node and firebase

Question

I have around 10,000 records in my Firebase datastore, and each record has some data attached, for example:

productName: {
 price: 10.00,
 lastChecked: timestamp,
 url: 'http://product/url',
 imagePath: 'http://product/image/url'
}

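I loop through every product, retrieve each product's data, and then perform further tasks.

Everything worked fine when I only had a few hundred records, but now that I have thousands (with more to come), the CPU is overloaded when I run the task, it crashes, and the task never runs for most products.

I have read about blocking the event loop and tried timeouts inside the callbacks, as suggested in several posts. That improved things, but I still haven't managed to stop the server's CPU from being overloaded.

Here is an example I implemented from another post.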

        getProductData = function(product, callback){
            ref.child('products/'+product).once('value', function(snapshot) {
                    callback(snapshot.val(), product);
                });
            },

        queryProductData = function(product){
            getProductData(product, function (productData, productKey) {                    
                 setTimeout(scrapeProductDetails(product), 2000) //queue for next ping in the next predefined interval
            });
        },

        productLoop = function(productsList) {                
            for (var product in productsList)
            {
                setTimeout(queryProductData(product), 2000) //queue job. Every 2 seconds, query_host will be called.
            }
        }

This runs as a Node service rather than a website, so it will be running in the background.

Answer 1

Regarding this bit:

for (var product in productsList)
{
    setTimeout(queryProductData(product), 2000)
}

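Two things are not quite right:

By writing setTimeout(queryProductData(product), 2000), you call the function immediately, before the timer even starts, and pass its return value to setTimeout instead of the function itself. Look at bind to fix this.
The for loop walks through every product at once and creates a timer for each, so all the timers start at the same time. The result: 2 seconds after the for loop, all the functions run simultaneously. So you are still basically doing everything at once, just with a 2-second delay added.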
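
Both points can be illustrated with a small sketch. The helper name scheduleStaggered and its injectable schedule parameter are hypothetical (they are not part of the question's code); the idea is only to show bind deferring the call and a per-item delay spreading the work out. Note that it still creates one timer per product, which is why the single-timer structure suggested in this answer is likely a better fit for thousands of records.

```javascript
// Hypothetical helper: schedules handler(item) once per item, spaced delayMs apart.
// `schedule` defaults to setTimeout; it is a parameter only so the sketch is easy to test.
function scheduleStaggered(items, handler, delayMs, schedule) {
    schedule = schedule || setTimeout;
    items.forEach(function (item, i) {
        // bind returns a new function without calling handler,
        // so handler runs only when the timer fires.
        // Multiplying the delay by the position staggers the timers
        // instead of firing them all at the same moment.
        schedule(handler.bind(null, item), delayMs * (i + 1));
    });
}

// Possible usage with the names from the question (assumed to be in scope):
// scheduleStaggered(Object.keys(productsList), queryProductData, 2000);
```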

You probably want a structure like this:

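// Assumes productsList is an array of product names
// (for the object in the question, e.g. Object.keys(productsList)).
var index = 0;

function nextProduct() {
    if (index >= productsList.length) {
        clearInterval(timer); // every product has been handled, stop the timer
        return;
    }
    var productName = productsList[index]; // get current product from list

    // Do what you need with productName

    index++; // Next product
}

var timer = setInterval(nextProduct, 2000);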

Each call to nextProduct takes the next product from the list, and setInterval calls nextProduct again every 2 seconds.

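Warning: if handling a product takes more than 2 seconds and index is only updated once that work has finished (for example inside an asynchronous callback), index may not have been updated by the time nextProduct is called again, and the same product would be picked up twice. So it is best to update index immediately after using it to get the product name, rather than at the very end as in my example.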
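
A minimal sketch of that ordering, assuming the product names are available as an array. The names makeTicker and startTicker are hypothetical helpers introduced here for illustration only:

```javascript
// Hypothetical factory: returns a tick function that hands one product
// name per call to processProduct and reports whether anything was left.
function makeTicker(productNames, processProduct) {
    var index = 0;
    return function tick() {
        if (index >= productNames.length) {
            return false; // nothing left to process
        }
        var productName = productNames[index];
        index++; // advance immediately, before any slow or asynchronous work starts
        processProduct(productName);
        return true;
    };
}

// Hypothetical runner: calls tick every intervalMs and stops once tick reports false.
function startTicker(tick, intervalMs) {
    var timer = setInterval(function () {
        if (!tick()) {
            clearInterval(timer);
        }
    }, intervalMs);
    return timer;
}

// Possible usage with the names from the question (assumed to be in scope):
// startTicker(makeTicker(Object.keys(productsList), queryProductData), 2000);
```

Because index is advanced before processProduct is called, a slow or asynchronous processProduct cannot cause the next tick to pick the same product again.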

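Another solution is to have nextProduct call itself when it finishes, instead of using setInterval. However, with recursive functions there are other problems you would need to overcome (such as stack size limits), so I would suggest not using a recursive function.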
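
For comparison, here is a rough sketch of that self-scheduling alternative, under the assumption that each product's work signals completion through a done callback. The helper processSequentially and its parameters are hypothetical. When the next step is queued through setTimeout rather than called directly, control returns to the event loop between items, so the call stack does not grow with the number of products.

```javascript
// Hypothetical helper: processes names one at a time; each item is started
// only after the previous one has reported completion via done().
// `schedule` defaults to setTimeout; it is a parameter only so the sketch is easy to test.
function processSequentially(names, processOne, delayMs, onFinished, schedule) {
    schedule = schedule || setTimeout;
    var index = 0;

    function next() {
        if (index >= names.length) {
            if (onFinished) {
                onFinished(); // every item has been handled
            }
            return;
        }
        var name = names[index++];
        processOne(name, function done() {
            // Queue the next item through a timer instead of calling next() directly.
            schedule(next, delayMs);
        });
    }

    next();
}
```

Unlike setInterval, this never starts a new product while the previous one is still in progress, at the cost of each processOne having to call done reliably.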

