Does Rust optimize passing temporary structures by value?

Suppose I have a vector of structs in Rust. The structs are fairly large. When I want to insert a new one, I write code like this:

my_vec.push(MyStruct {field1: value1, field2: value2, ... });

push is defined as

fn push(&mut self, value: T)

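which means the value is passed by value. Does Rust first create a temporary object and then copy it into the push function, or does it optimize the code so that no temporary is created and copied?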

Let's see. This program:

struct LotsOfBytes {
    bytes: [u8; 1024]
}

#[inline(never)]
fn consume(mut lob: LotsOfBytes) {
}

fn main() {
    let lob = LotsOfBytes { bytes: [0; 1024] };
    consume(lob);
}

compiles to the following LLVM IR:

%LotsOfBytes = type { [1024 x i8] }

; Function Attrs: noinline nounwind uwtable
define internal fastcc void @_ZN7consume20hf098deecafa4b74bkaaE(%LotsOfBytes* noalias nocapture dereferenceable(1024)) unnamed_addr #0 {
entry-block:
  %1 = getelementptr inbounds %LotsOfBytes* %0, i64 0, i32 0, i64 0
  tail call void @llvm.lifetime.end(i64 1024, i8* %1)
  ret void
}

; Function Attrs: nounwind uwtable
define internal void @_ZN4main20hf3cbebd3154c5390qaaE() unnamed_addr #2 {
entry-block:
  %lob = alloca %LotsOfBytes, align 8
  %lob1 = getelementptr inbounds %LotsOfBytes* %lob, i64 0, i32 0, i64 0
  %arg = alloca %LotsOfBytes, align 8
  %0 = getelementptr inbounds %LotsOfBytes* %lob, i64 0, i32 0, i64 0
  call void @llvm.lifetime.start(i64 1024, i8* %0)
  call void @llvm.memset.p0i8.i64(i8* %lob1, i8 0, i64 1024, i32 8, i1 false)
  %1 = getelementptr inbounds %LotsOfBytes* %arg, i64 0, i32 0, i64 0
  call void @llvm.lifetime.start(i64 1024, i8* %1)
  call void @llvm.memcpy.p0i8.p0i8.i64(i8* %1, i8* %0, i64 1024, i32 8, i1 false)
  call fastcc void @_ZN7consume20hf098deecafa4b74bkaaE(%LotsOfBytes* noalias nocapture dereferenceable(1024) %arg)
  call void @llvm.lifetime.end(i64 1024, i8* %1)
  call void @llvm.lifetime.end(i64 1024, i8* %0)
  ret void
}

This line is particularly interesting:

call fastcc void @_ZN7consume20hf098deecafa4b74bkaaE(%LotsOfBytes* noalias nocapture dereferenceable(1024) %arg)

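If I understand correctly, this means consume is called with a pointer to a LotsOfBytes rather than with the 1024 bytes themselves, so yes, rustc optimizes passing big structures by value. Note, though, that in this particular output main still copies lob into the separate %arg stack slot with llvm.memcpy before the call, so the copy of the temporary is not eliminated entirely here.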
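To check the Vec::push case from the question on your own toolchain, a minimal sketch along the same lines might look like the following. The function name push_temporary and its fill parameter are made up for illustration and are not part of the original question. Compiling with `rustc -O --emit=llvm-ir` and looking for memcpy calls inside push_temporary should show whether the temporary is built in place or copied; the result can differ between compiler versions and optimization levels.

```rust
// Same large struct as in the answer above.
pub struct LotsOfBytes {
    pub bytes: [u8; 1024],
}

// Hypothetical helper: builds a temporary and pushes it by value,
// mirroring `my_vec.push(MyStruct { ... })` from the question.
// `#[inline(never)]` keeps it as a separate function in the emitted IR.
#[inline(never)]
pub fn push_temporary(v: &mut Vec<LotsOfBytes>, fill: u8) {
    v.push(LotsOfBytes { bytes: [fill; 1024] });
}

fn main() {
    // Reserve space up front so the push itself does not reallocate.
    let mut v: Vec<LotsOfBytes> = Vec::with_capacity(4);
    push_temporary(&mut v, 7);
    println!("len = {}, first byte = {}", v.len(), v[0].bytes[0]);
}
```

Whatever the IR shows, the observable behavior is the same: the value is moved into the vector and the original temporary can no longer be used.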