PyTorch: Why is the memory occupied by a `tensor` variable so small?

Question

In PyTorch 1.0.0, I found that a tensor variable occupies very little memory. I wonder how it can store so much data. Here is the code.

import sys
import numpy as np
import torch

# 1 * 1 * 128 * 256 float64 values
a = np.random.randn(1, 1, 128, 256)
b = torch.tensor(a, device=torch.device('cpu'))

a_size = sys.getsizeof(a)
b_size = sys.getsizeof(b)

a_size is 262288. b_size is 72.

Answer 1

The answer has two parts. First, from the documentation of sys.getsizeof:

All built-in objects will return correct results, but this does not have to hold true for third-party extensions as it is implementation specific.

So a tensor's __sizeof__ may be undefined, or defined differently from what you expect; this function is not something you can rely on. Second:

Only the memory consumption directly attributed to the object is accounted for, not the memory consumption of objects it refers to.

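This means that if a torch.Tensor object only holds a reference to the actual memory, that memory will not show up in sys.getsizeof. That is indeed the case: if you check the size of the underlying storage instead, you will see the expected number.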

import torch, sys
b = torch.randn(1, 1, 128, 256, dtype=torch.float64)
sys.getsizeof(b)
>> 72
sys.getsizeof(b.storage())
>> 262208

Note: I explicitly set dtype to float64 because that is the default dtype in numpy, whereas torch uses float32 by default.
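If the goal is just to know how many bytes the tensor's data takes, one way that does not depend on sys.getsizeof is to multiply the number of elements by the size of one element. This is only a minimal sketch; the variable names a_bytes and b_bytes are made up for illustration. Both results come out to 1 * 1 * 128 * 256 * 8 = 262144 bytes, and the figures reported above (262288 and 262208) are that amount plus a small per-object overhead.

```python
import numpy as np
import torch

a = np.random.randn(1, 1, 128, 256)  # numpy default dtype: float64
b = torch.tensor(a)                  # dtype is inferred from a, so float64

# Size of the raw data buffer: element count times bytes per element.
a_bytes = a.nbytes
b_bytes = b.element_size() * b.nelement()

print(a_bytes, b_bytes)
```

With torch's default float32, the same shape would take half as many bytes, since element_size() would return 4 instead of 8.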
