Using only a fraction of a very large pre-allocated array

Question

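When we allocate an array in Fortran or C, my understanding is that the memory is first allocated in so-called virtual memory, while physical memory is allocated only when we write data to (some part of) the array (e.g., based on this page). Does this mean that if we allocate a very large array (say 10^9 elements) and use only a small fraction of it (say the first 10^6 elements), we need physical memory only for the latter? If so, is it practically no problem to use this feature to accommodate data of unknown (but not too large) size in a very large, pre-allocated array?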

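For example, the following Fortran code first allocates a large array of size 10^9, writes data to the first 10^6 elements, and then performs a reallocation to adjust the array size.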

program main
    implicit none
    integer, allocatable :: a(:)
    integer :: nlarge, nsmall, i

    nlarge = 1000000000  !! 10^9
    nsmall =    1000000  !! 10^6

    allocate( a( nlarge ) )   !! allocate in virtual memory

    print *, "after allocation"
    call system( "ps aux | grep a.out" )   !! "system" is a compiler extension, not standard Fortran

    do i = 1, nsmall
        a( i ) = i     !! write actual data (we assume that "nsmall" is not known a priori)
    enddo

    print *, "after assignment"
    call system( "ps aux | grep a.out" )

    a = a( 1 : nsmall )    !! adjust the array size (reallocation on assignment, Fortran 2003)

    print *, "after reallocation"
    call system( "ps aux | grep a.out" )
end program main

The output on my machine (Linux x86_64 with gfortran) is

after allocation
username    29064  0.0  0.0 3914392  780 pts/3    S+   01:15   0:00 ./a.out
after assignment
username    29064  0.0  0.0 3914392 5188 pts/3    S+   01:15   0:00 ./a.out
after reallocation
username    29064  0.0  0.0  12048  4692 pts/3    S+   01:15   0:00 ./a.out

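This shows that only about 5 MB of physical memory is used (the RSS column of ps, in KB), even though about 3.9 GB of virtual memory (the VSZ column) is reserved before the reallocation. Can this feature be used to accommodate temporary data of unknown size (but smaller than the physical memory)?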

Edit

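More specifically, the system I have in mind is a typical workstation running Linux x86_64 (e.g., CentOS) with tens of GB of RAM, and the program is written in Fortran. The motivation for this question is that when I want to store data of unknown size in an array, I usually need to determine the size somehow and allocate the array accordingly. Unless we have a built-in dynamic array, this approach is a bit tedious. Typically, this happens in two situations: (1) when reading data of unknown size from an external file; and (2) when collecting data that meet a certain condition inside multi-dimensional loops. In case 1, we usually either scan the file twice (once to get the data size, then again to read the data) or pre-allocate a sufficiently large array as a buffer. So I am interested in whether the virtual memory system can help simplify this task by allowing the allocation of a very large array (without worrying too much about its size).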
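For reference, the "scan the file twice" approach from case 1 might look roughly like the sketch below. It assumes a text file with one integer per line; the module name `two_pass_read` and the subroutine `read_ints` are hypothetical names chosen only for illustration.

```fortran
module two_pass_read
    implicit none
    private
    public :: read_ints

contains

    ! Read one integer per line from "filename" into an exactly sized array.
    ! (Hypothetical helper; assumes the file holds one integer per record.)
    subroutine read_ints(filename, a)
        character(len=*), intent(in) :: filename
        integer, allocatable, intent(out) :: a(:)
        integer :: u, ios, n, dummy, i

        open(newunit=u, file=filename, status="old", action="read")

        ! Pass 1: count the records that can be read as an integer.
        n = 0
        do
            read(u, *, iostat=ios) dummy
            if (ios /= 0) exit      ! end of file or unreadable record
            n = n + 1
        end do

        ! Pass 2: allocate exactly n elements, rewind, and read the data.
        allocate(a(n))
        rewind(u)
        do i = 1, n
            read(u, *) a(i)
        end do

        close(u)
    end subroutine read_ints

end module two_pass_read
```

A caller would write `call read_ints("data.txt", a)` with `a` declared as `integer, allocatable :: a(:)`. The price of this approach is reading the file twice; in exchange, the array never has to be larger than the data.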

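However, after more experiments, I learned that this approach is rather limited... For example, if I change the array size as follows, ifort complains of "insufficient virtual memory" above ~80 GB, which probably corresponds to the sum of the physical memory and the swap area on my system. So, although "ulimit -a" reports the virtual memory as "unlimited", in practice it does not seem to be unlimited...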

! compiled with: ifort -heap-arrays -assume realloc_lhs 
use iso_fortran_env, only: long => int64
integer, allocatable :: a(:)
integer(long) :: nlarge, nsmall, i

! nlarge = 10_long**9   !! OK: 4 GB                                              
! nlarge = 10_long**10   !! OK: 40 GB                                            
nlarge = 2 * 10_long**10   !! OK: 80 GB                                         
! nlarge = 3 * 10_long**10   !! NG: insufficient virtual memory (120 GB)         
! nlarge = 4 * 10_long**10   !! NG: insufficient virtual memory (160 GB)         
! nlarge = 10_long**11    ! NG: insufficient virtual memory (400 GB)             

nsmall = 10**6   !! 4 MB

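Conclusion: it seems better to use the conventional approaches (i.e., allocate an array of the necessary size, repeatedly reallocate an allocatable array as needed, or use a user-defined dynamic array). Sorry for this trivial conclusion...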
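As an illustration of the "user-defined dynamic array" option, here is a minimal sketch of a growable integer buffer that doubles its capacity with `move_alloc` whenever it fills up. The module name `growable_buffer`, the type `int_buffer`, its components, and the initial capacity of 16 are all assumptions made for this example, not part of any library.

```fortran
module growable_buffer
    implicit none
    private
    public :: int_buffer

    ! Hypothetical user-defined dynamic array of default integers.
    type :: int_buffer
        integer, allocatable :: items(:)   ! storage; size(items) is the capacity
        integer :: n = 0                   ! number of elements actually stored
    contains
        procedure :: append
        procedure :: values
    end type int_buffer

contains

    ! Append one value, doubling the capacity when the storage is full.
    subroutine append(self, val)
        class(int_buffer), intent(inout) :: self
        integer, intent(in) :: val
        integer, allocatable :: tmp(:)

        if (.not. allocated(self%items)) then
            allocate(self%items(16))            ! arbitrary initial capacity
        else if (self%n == size(self%items)) then
            allocate(tmp(2 * size(self%items)))
            tmp(1:self%n) = self%items(1:self%n)
            call move_alloc(tmp, self%items)    ! old storage is freed, tmp is deallocated
        end if

        self%n = self%n + 1
        self%items(self%n) = val
    end subroutine append

    ! Return a copy that holds exactly the stored elements.
    function values(self) result(res)
        class(int_buffer), intent(in) :: self
        integer, allocatable :: res(:)

        allocate(res(self%n))
        if (self%n > 0) res = self%items(1:self%n)
    end function values

end module growable_buffer
```

With this in place, a collecting loop calls `call buf%append(x)` for each value it wants to keep, and `buf%values()` returns an array of exactly the right size at the end. Because the capacity doubles, the total amount of copying stays proportional to the final number of elements, and the buffer never reserves more than about twice what is actually stored.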

Answer 1

When we allocate an array in Fortran or C, my understanding is that the memory is first allocated in the so-called virtual memory, while the physical memory is allocated only when we write data onto (some part of) the array.

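This is something your OS might choose to do. There is no guarantee that it will not actually create real page table entries for all of the reserved memory, map it, and therefore back it with physical memory.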

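In fact, neither C nor Fortran tells you anything about how memory is obtained, where it comes from, or what the OS does to give it to you. You are conflating three things: what your language specifies, how your standard library handles requests for memory, and how the underlying OS actually maps physical memory into the process address space. On systems without an MMU (memory management unit), you can run perfectly fine C code, yet all memory addresses are actual physical addresses.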

Does this mean that, if we allocate a very large array (say 10^9 elements) and use only a fraction of it (say the first 10^6 elements), do we need only the physical memory for the latter?

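Be a bit careful here. Again, this is up to the OS to implement. The OS may (and usually will) implement a "lazy mapping" feature, but it still will not give you more memory than it considers available.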

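Also, keep in mind that, at least on 32-bit OSes, there are address space limits: a 32-bit process typically cannot use more than 2 to 3 GB of address space (32-bit pointers cap it at 4 GiB in any case), so an array of 10^9 32-bit integers, which needs 4 × 10^9 bytes, simply will not fit!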

If so, is it practically no problem to utilize this feature to accommodate data of unknown (but not too large) size in a very large, pre-allocated array?

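It is a problem in practice, because the OS might not give you that much memory. Besides, there is no real downside to getting more memory later (cf. the standard C function realloc), other than the time it takes at that point: the OS will have to find free pages and map them into your process space, and possibly remap the previous pages.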

fortran

memory-management

allocation