How do JavaScript closures work at a low level?

I understand closures to be defined as:

[A] stack-frame which is not deallocated when the function returns. (as if a 'stack-frame' were malloc'ed instead of being on the stack!)

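But I don't understand how this answer fits with JavaScript's storage mechanism. How does the interpreter keep track of these values? Is the browser's storage segmented into something like a heap and a stack?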

An answer to the question How do JavaScript closures work? explains that:

[A] function reference also has a secret reference to the closure

What is the underlying mechanism behind this mysterious "secret reference"?

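EDIT: Many have said that this is implementation dependent, so for the sake of simplicity, please provide an explanation in the context of a particular implementation.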

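Here is an example of how you can convert code that needs closures into code that doesn't. The essential points to pay attention to are: how function declarations are transformed, how function calls are transformed, and how accesses to local variables that have been moved to the heap are transformed.

Input: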

var f = function (x) {
  x = x + 10
  var g = function () {
    return ++x
  }
  return g
}

var h = f(3)
console.log(h()) // 14
console.log(h()) // 15

Output:

// Header that goes at the top of the program:

// A list of environments, starting with the one
// corresponding to the innermost scope.
function Envs(car, cdr) {
  this.car = car
  this.cdr = cdr
}

Envs.prototype.get = function (k) {
    var e = this
    while (e) {
        // Test with has() so that falsy values (0, '', false) are still found.
        if (e.car.has(k)) return e.car.get(k)
        e = e.cdr
    }
    // returns undefined if lookup fails
}

Envs.prototype.set = function (k, v) {
    var e = this
    while (e) {
        // Assign in the innermost scope that already declares k.
        if (e.car.has(k)) {
            e.car.set(k, v)
            return this
        }
        e = e.cdr
    }
    throw new ReferenceError()
}

// Initialize the global scope.
var envs = new Envs(new Map(), null)

// We have to use this special function to call our closures.
function call(f, ...args) {
    return f.func(f.envs, ...args)
}

// End of header.

var f = {
    func: function (envs, x) {
        envs = new Envs(new Map().set('x',x), envs)

        envs.set('x', envs.get('x') + 10)
        var g = {
            func: function (envs) {
                envs = new Envs(new Map(), envs)
                return envs.set('x', envs.get('x') + 1).get('x')
            },
            envs: envs
        }
        return g
    },
    envs: envs
}

var h = call(f, 3)
console.log(call(h)) // 14
console.log(call(h)) // 15
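
For comparison, here is the same input program written with native closures and extended with a second call to f. The names h1, h2 and results are mine, not part of the original example. It is only a sketch, meant to show that every call to f gets its own copy of x, which is what the fresh Envs object created on entry to func is modelling.

```javascript
// Native closures: each call to f allocates its own binding for x.
function f(x) {
  x = x + 10;
  return function g() {
    return ++x;
  };
}

const h1 = f(3);   // h1 closes over an x that starts at 13
const h2 = f(100); // h2 closes over a different x that starts at 110

// Interleave the calls to show the two bindings do not interfere.
const results = [h1(), h1(), h2(), h1()];
console.log(results); // [ 14, 15, 111, 16 ]
```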

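Let's break down how the three key transformations work. For the function declaration case, assume for concreteness that we have a function of two arguments x and y and one local variable z, and that x and z can escape the stack frame and so need to be moved to the heap. Because of hoisting, we may assume that z is declared at the beginning of the function.

Input: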

var f = function f(x, y) {
    var z = 7
    ...
}

Output:

var f = {
    func: function f(envs, x, y) {
        envs = new Envs(new Map().set('x',x).set('z',7), envs)
        ...
    },
    envs: envs
}

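That's the tricky part. The rest of the transformation just consists of using call to invoke the function and replacing accesses to the variables moved to the heap with lookups in envs.

A couple of caveats.

  1. How did we know that x and z needed to be moved to the heap but not y? Answer: the simplest (but possibly not optimal) thing is to move to the heap anything that is referenced in an enclosed function body.

  2. The implementation I've given leaks a ton of memory and requires function calls to access local variables moved to the heap instead of inlining those accesses. A real implementation wouldn't do these things.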
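
To make the first caveat concrete, here is a sketch of a hand-converted function in which only the captured variables go into a heap record. The function body is an assumption of mine (the original leaves it elided), and I use a plain object instead of Envs for brevity; the names local, first and second are likewise made up for illustration.

```javascript
// Hypothetical source, written with native closures:
//   function f(x, y) { var z = 7; z += y; return function () { return ++x + z } }
// The inner function mentions x and z but never y.

function call(closure, ...args) {
  return closure.func(closure.env, ...args);
}

const f = {
  func: function (env, x, y) {
    // x and z are referenced by the inner function, so they are
    // moved into a heap-allocated record.
    const local = { x: x, z: 7 };
    // y is not captured: it stays an ordinary parameter and is gone
    // as soon as this call returns.
    local.z += y;
    return {
      func: function (env) {
        env.x += 1;
        return env.x + env.z;
      },
      env: local
    };
  },
  env: null // f itself captures nothing in this sketch
};

const g = call(f, 1, 2);
const first = call(g);  // x becomes 2, z is 9
const second = call(g); // x becomes 3, z is still 9
console.log(first, second); // 11 12
```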

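Finally, user3856986 posted an answer that makes some different assumptions from mine, so let's compare the two.

The main difference is that I assumed local variables would be kept on a traditional stack, whereas user3856986's answer only makes sense if the stack is implemented as some kind of structure on the heap (but he or she is not very explicit about this requirement). A heap implementation like this can work, though it puts more load on the allocator and the GC, since you have to allocate and collect stack frames on the heap. With modern GC technology this may be more efficient than you'd think, but I believe the commonly used VMs do use traditional stacks.

Also, something left vague in user3856986's answer is how the closure gets a reference to the relevant stack frame. In my code, this happens when the envs property is set on the closure while that stack frame is executing.

Finally, user3856986 writes, "All variables in b() become local variables to c() and nothing else. The function that called c() has no access to them." This is a little misleading. Given a reference to the closure c, the only thing that stops one from accessing the closed-over variables from the call to b is the type system. One could certainly access these variables from assembly (otherwise, how could c access them?). On the other hand, as for the true local variables of c, it doesn't even make sense to ask whether you can access them until some particular invocation of c has been specified (and if we do consider a particular invocation, by the time control gets back to the caller, the information stored in them may already have been destroyed).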
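
A small sketch of that point, using the closure-converted representation rather than native closures. Here b, c, bb and the plain-object env are illustrative assumptions. Once the environment is an ordinary heap object hanging off the closure, nothing physical stops the caller from reading it; only the language rules hide it when you use native closures.

```javascript
// Closure-converted form of: function b() { var bb = 41; return function c() { return ++bb } }
function b() {
  const env = { bb: 41 }; // the closed-over variable, moved to the heap
  return {
    func: function (env) {
      env.bb += 1;
      return env.bb;
    },
    env: env
  };
}

const c = b();
const viaCall = c.func(c.env); // the intended route, through c's code
const direct = c.env.bb;       // the caller reads bb without running c's code
console.log(viaCall, direct);  // 42 42
```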

This part of slebetman's answer to the question javascript can't access private properties answers your question very well.

The Stack:

A scope is related to the stack frame (in Computer Science it's called the "activation record" but most developers familiar with C or assembly know it better as stack frame). A scope is to a stack frame what a class is to an object. By that I mean that where an object is an instance of a class, a stack frame is an instance of scope.

Let's use a made-up language as an example. In this language, like in JavaScript, functions define scope. Let's take a look at some example code:

var global_var

function b {
    var bb
}

function a {
    var aa
    b();
}

When we read the code above, we say that the variable aa is in scope in function a and the variable bb is in scope in function b. Note that we don't call this thing private variables. Because the opposite of private variables are public variables and both refer to properties bound to objects. Instead we call aa and bb local variables. The opposite of local variables are global variables (not public variables).

Now, let's see what happens when we call a:

a() gets called, which creates a new stack frame. Space for its local variables is allocated on the stack:

The stack:
 ┌────────┐
 │ var aa │ <── a's stack frame
 ╞════════╡
 ┆        ┆ <── caller's stack frame

a() calls b(), which creates another stack frame. Space for its local variables is allocated on the stack:

The stack:
 ┌────────┐
 │ var bb │ <── b's stack frame
 ╞════════╡
 │ var aa │
 ╞════════╡
 ┆        ┆

In most programming languages, and this includes JavaScript, a function only has access to its own stack frame. Thus a() cannot access local variables in b(), and neither can any other function or code in global scope access variables in a(). The only exception is variables in global scope. From an implementation point of view this is achieved by allocating global variables in an area of memory that does not belong to the stack. This is generally called the heap. So to complete the picture, the memory at this point looks like this:

The stack:     The heap:
 ┌────────┐   ┌────────────┐
 │ var bb │   │ global_var │
 ╞════════╡   │            │
 │ var aa │   └────────────┘
 ╞════════╡
 ┆        ┆

(as a side note, you can also allocate variables on the heap inside functions using malloc() or new)

Now b() completes and returns, and its stack frame is removed from the stack:

The stack:     The heap:
 ┌────────┐   ┌────────────┐
 │ var aa │   │ global_var │
 ╞════════╡   │            │
 ┆        ┆   └────────────┘

and when a() completes the same happens to its stack frame. This is how local variables get allocated and freed automatically: by pushing objects onto the stack and popping them off.

Closures:

A closure is a more advanced stack frame. But whereas normal stack frames get deleted once a function returns, a language with closures will merely unlink the stack frame (or just the objects it contains) from the stack while keeping a reference to the stack frame for as long as it's required.
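
A rough model of this "unlink but keep" behaviour, written in JavaScript itself. The explicit stack array, the callB helper and the frame/code property names are all assumptions for illustration; a real engine does not expose its frames like this.

```javascript
const stack = []; // hypothetical explicit call stack

function callB() {
  const frame = { bb: 1 }; // the frame for this call to b
  stack.push(frame);       // entering b
  const closure = {
    frame: frame,          // the reference that keeps the frame alive
    code: function (fr) {
      fr.bb += 1;
      return fr.bb;
    }
  };
  stack.pop();             // b returns: the frame is unlinked, not destroyed
  return closure;
}

const c = callB();
const depthAfterReturn = stack.length; // the stack no longer holds b's frame
const value = c.code(c.frame);         // but bb is still there to be updated
console.log(depthAfterReturn, value);  // 0 2
```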

Now let's look at an example code of a language with closures:

function b {
    var bb
    return function {
        var cc
    }
}

function a {
    var aa
    return b()
}

Now let's see what happens if we do this:

var c = a()

First function a() is called which in turn calls b(). Stack frames are created and pushed onto the stack:

The stack:
 ┌────────┐
 │ var bb │
 ╞════════╡
 │ var aa │
 ╞════════╡
 │ var c  │
 ┆        ┆

Function b() returns, so its stack frame is popped off the stack. But function b() returns an anonymous function which captures bb in a closure. So we pop off the stack frame but don't delete it from memory (until all references to it have been garbage collected):

The stack:             somewhere in RAM:
 ┌────────┐           ┌╶╶╶╶╶╶╶╶╶┐
 │ var aa │           ┆ var bb  ┆
 ╞════════╡           └╶╶╶╶╶╶╶╶╶┘
 │ var c  │
 ┆        ┆

a() now returns the function to c. So the stack frame of the call to b() gets linked to the variable c. Note that it's the stack frame that gets linked, not the scope. It's a bit like creating objects from a class: it's the objects that get assigned to variables, not the class:

The stack:             somewhere in RAM:
 ┌────────┐           ┌╶╶╶╶╶╶╶╶╶┐
 │ var c╶╶├╶╶╶╶╶╶╶╶╶╶╶┆ var bb  ┆
 ╞════════╡           └╶╶╶╶╶╶╶╶╶┘
 ┆        ┆

Also note that since we haven't actually called the function c(), the variable cc is not yet allocated anywhere in memory. It's currently only a scope, not yet a stack frame until we call c().

Now what happens when we call c()? A stack frame for c() is created as normal. But this time there is a difference:

The stack:
 ┌────────┬──────────┐
 │ var cc    var bb  │  <──── attached closure
 ╞════════╤──────────┘
 │ var c  │
 ┆        ┆

The stack frame of b() is attached to the stack frame of c(). So from the point of view of function c(), its stack also contains all the variables that were created when function b() was called (Note again, not the variables in function b() but the variables created when function b() was called - in other words, not the scope of b() but the stack frame created when calling b(). The implication is that there is only one possible function b() but many calls to b() creating many stack frames).
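
The "one function, many frames" point can be checked with a short sketch in real JavaScript. The inc and read helpers and the names first and second are assumptions added for illustration: two closures returned from the same call share one frame, while a second call to b() gets a frame of its own.

```javascript
function b() {
  let bb = 0; // one bb per call to b
  return {
    inc: function () { bb += 1; return bb; },
    read: function () { return bb; }
  };
}

const first = b();  // first frame
const second = b(); // second, independent frame

first.inc();
first.inc();
second.inc();

// inc and read from the same call see the same bb;
// the two calls to b do not see each other's bb.
console.log(first.read(), second.read()); // 2 1
```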

But the rules of local and global variables still apply. All variables in b() become local variables to c() and nothing else. The function that called c() has no access to them.

What this means is that when you redefine c in the caller's scope like this:

var c = function {/* new function */}

this happens:

                     somewhere in RAM:
                           ┌╶╶╶╶╶╶╶╶╶┐
                           ┆ var bb  ┆
                           └╶╶╶╶╶╶╶╶╶┘
The stack:
 ┌────────┐           ┌╶╶╶╶╶╶╶╶╶╶╶╶╶╶╶╶╶╶╶╶┐
 │ var c╶╶├╶╶╶╶╶╶╶╶╶╶╶┆ /* new function */ ┆
 ╞════════╡           └╶╶╶╶╶╶╶╶╶╶╶╶╶╶╶╶╶╶╶╶┘
 ┆        ┆

As you can see, it's impossible to regain access to the stack frame from the call to b() since the scope that c belongs to doesn't have access to it.

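I've written an article on this topic: How do JavaScript closures work under the hood: the illustrated explanation.

To understand the subject, we need to know how scope objects (or LexicalEnvironments) are allocated, used and deleted. This understanding is the key to seeing the big picture and to knowing how closures work under the hood.

I'm not going to retype the whole article here, but as a short example, consider this script: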

"use strict";

var foo = 1;
var bar = 2;

function myFunc() {
  //-- define local-to-function variables
  var a = 1;
  var b = 2;
  var foo = 3;
}

//-- and then, call it:
myFunc();

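When executing the top-level code, we have the following arrangement of scope objects:

Notice that myFunc references both:

  • The function object (which contains the code and any other publicly available properties)
  • The scope object that was active at the time the function was defined.

And when myFunc() is called, we have the following scope chain:

When a function is called, a new scope object is created and used to augment the scope chain referenced by myFunc. This lets us achieve a very powerful effect when we define some inner function and then call it outside of the outer function.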
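
The scope chain for the script above can be approximated with ordinary objects linked through their prototypes. This is only a model under my own assumptions: globalScope, callMyFunc and callScope are hypothetical names, and real engines do not expose scope objects to user code.

```javascript
// Scope object for the top-level code.
const globalScope = { foo: 1, bar: 2 };

// Hypothetical stand-in for calling myFunc(): a new scope object is
// created whose parent is the scope that was active at definition time.
function callMyFunc(definitionScope) {
  const scope = Object.create(definitionScope);
  scope.a = 1;
  scope.b = 2;
  scope.foo = 3; // shadows the outer foo instead of overwriting it
  return scope;
}

const callScope = callMyFunc(globalScope);

console.log(callScope.foo);   // 3, found in the function's own scope object
console.log(callScope.bar);   // 2, found by walking up the chain
console.log(globalScope.foo); // 1, the outer binding is untouched
```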

See the aforementioned article; it explains things in detail.