Rust mutable references and `noalias`
Motivation
While reading the tokio source code recently, I came across the following doc comment:
/// `tokio/src/util/linked_list.rs`
/// We do not want the compiler to put the `noalias` attribute on mutable
/// references to this type, so the type has been made `!Unpin` with a
/// `PhantomPinned` field.
///
/// Additionally, we never access the `prev` or `next` fields directly, as any
/// such access would implicitly involve the creation of a reference to the
/// field, which we want to avoid since the fields are not `!Unpin`, and would
/// hence be given the `noalias` attribute if we were to do such an access.
/// As an alternative to accessing the fields directly, the `Pointers` type
/// provides getters and setters for the two fields, and those are implemented
/// using raw pointer casts and offsets, which is valid since the struct is
/// #[repr(C)].
///
/// See this link for more information:
/// <https://github.com/rust-lang/rust/pull/82834>
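The comment says that the type carries a `PhantomPinned` field so that the compiler does not put the `noalias` attribute on `&mut` references to it. That raised two questions: what is `noalias`, and why does the compiler attach it to `&mut` in the first place?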
noalias
In LLVM, `noalias` is a function parameter attribute.
The LLVM Language Reference describes `noalias` as follows:
This indicates that memory locations accessed via pointer values based on the argument or return value are not also accessed, during the execution of the function, via pointer values not based on the argument or return value. This guarantee only holds for memory locations that are modified, by any means, during the execution of the function. The attribute on a return value also has additional semantics described below. The caller shares the responsibility with the callee for ensuring that these requirements are met. For further details, please see the discussion of the NoAlias response in alias analysis.
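In short, while the function executes, a pointer marked `noalias` has exclusive access to the memory it points to. If two parameters (or a parameter and the return value) are both marked `noalias`, the memory they point to must not overlap; per the quote above, the guarantee applies to memory that is modified during the call. The passage also relies on the [definition](https://llvm.org/docs/LangRef.html#pointeraliasing) of the pointer "based on" relation. Notably, a pointer to a field of a struct is based on the pointer to that struct.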
Code rustc generates for `&mut`
Here is an example to get a feel for the LLVM IR that rustc generates.
use std::hint::black_box;
The `noalias` attribute is only emitted in release mode, that is, with optimizations enabled.
; a::hello
; Function Attrs: mustprogress nofree noinline norecurse nosync nounwind willreturn memory(argmem: readwrite) uwtable
define internal fastcc void @_ZN1a5hello17hda125466b804c551E(ptr noalias nocapture noundef align 4 dereferenceable(4) %x, ptr noalias nocapture noundef align 4 dereferenceable(4) %y) unnamed_addr #4 {
start:
%_4 = load i32, ptr %x, align 4, !noundef !4
%_3 = icmp ugt i32 %_4, 100
%0 = load i32, ptr %y, align 4
br i1 %_3, label %bb1, label %bb3
bb1: ; preds = %start
%1 = add i32 %0, 1
store i32 %1, ptr %y, align 4
br label %bb3
bb3: ; preds = %start, %bb1
%_6 = phi i32 [ %1, %bb1 ], [ %0, %start ]
%_5 = icmp ugt i32 %_6, 1
br i1 %_5, label %bb4, label %bb6
bb4: ; preds = %bb3
%2 = add i32 %_4, 100
store i32 %2, ptr %x, align 4
br label %bb6
bb6: ; preds = %bb3, %bb4
ret void
}
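Reading the IR above back into Rust, a function of roughly the following shape would produce it. This is a reconstruction from the IR, not the original listing; the argument values in `main` and the use of `#[inline(never)]` are assumptions made for illustration.

```rust
use std::hint::black_box;

// Two `&mut u32` parameters: in release mode rustc is expected to mark both
// as `noalias`, as in the IR above.
#[inline(never)]
fn hello(x: &mut u32, y: &mut u32) {
    if *x > 100 {
        *y += 1;
    }
    if *y > 1 {
        *x += 100;
    }
}

fn main() {
    // `black_box` keeps the optimizer from constant-folding the whole call.
    let mut a = black_box(101u32);
    let mut b = black_box(1u32);
    hello(&mut a, &mut b);
    println!("{a} {b}");
}
```

To inspect the IR yourself, `cargo rustc --release -- --emit=llvm-ir` writes a `.ll` file under `target/release/deps`. Note how the IR loads `*x` once into `%_4` and reuses it for the final store: that reuse is only valid because the store through `%y` is promised not to touch `*x`.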
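Unsafe Rust and `noalias`
Since this issue, rustc emits `noalias` by default on `&mut` function parameters. This matters when writing unsafe Rust: before casting a raw pointer to `&mut`, consider whether that `&mut` will be passed as a function argument. Creating two live `&mut` references to the same location is undefined behavior in any build mode; `noalias` is what makes the consequences visible. For example, the following snippet gives completely different results in debug mode (no `noalias`) and in release mode (`noalias` emitted).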
use std::hint::black_box;
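The divergence can also be shown without invoking undefined behavior, by spelling out by hand what the optimizer is allowed to do. The sketch below is not the original snippet: it assumes the two-step logic visible in the IR earlier, uses `Cell` so that aliasing is legal, and the names `hello_reloading`, `hello_cached` and `run` are hypothetical.

```rust
use std::cell::Cell;

/// Source-level semantics when both arguments alias: every read observes the
/// latest write. This is what an unoptimized build effectively does.
fn hello_reloading(x: &Cell<u32>, y: &Cell<u32>) {
    if x.get() > 100 {
        y.set(y.get() + 1);
    }
    if y.get() > 1 {
        x.set(x.get() + 100);
    }
}

/// What the optimized IR effectively does: both values are loaded once and
/// reused, because `noalias` promises that a store through `y` cannot change `*x`.
fn hello_cached(x: &Cell<u32>, y: &Cell<u32>) {
    let x0 = x.get(); // %_4 = load i32, ptr %x
    let mut y0 = y.get(); // %0 = load i32, ptr %y
    if x0 > 100 {
        y0 += 1;
        y.set(y0);
    }
    if y0 > 1 {
        x.set(x0 + 100); // uses the stale `x0`
    }
}

/// Calls `f` with both arguments pointing at the same cell.
fn run(f: fn(&Cell<u32>, &Cell<u32>)) -> u32 {
    let v = Cell::new(101);
    f(&v, &v);
    v.get()
}

fn main() {
    println!("reloading: {}", run(hello_reloading)); // 202
    println!("cached:    {}", run(hello_cached)); // 201
}
```

With real aliasing `&mut` references the compiler is free to pick either behavior (or anything else), which is why the debug and release results disagree.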
Avoiding `noalias`
According to PR #82834:
noalias is not emitted for types that are !Unpin, as a heuristic for self-referential structures (see #54878 and #63818).
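If a codebase uses unsafe in many places and you do not want to check every function call for the problem described above, you can embed a `PhantomPinned` marker field in the data structure involved.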
use std::marker::PhantomPinned;
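A minimal sketch of the idea, assuming a hypothetical `Node` type (it is not taken from the original listing): `hello` takes `&mut` to the `!Unpin` struct, while `world` takes `&mut` to one of its plain fields.

```rust
use std::marker::PhantomPinned;

// Hypothetical type for illustration; the marker field makes it `!Unpin`.
struct Node {
    value: u32,
    _pin: PhantomPinned,
}

// `&mut Node` points to a `!Unpin` type, so rustc is expected to omit `noalias`.
#[inline(never)]
fn hello(x: &mut Node, y: &mut Node) {
    if x.value > 100 {
        y.value += 1;
    }
    if y.value > 1 {
        x.value += 100;
    }
}

// `&mut u32` points to an `Unpin` type, so `noalias` is still expected here,
// even when the references are borrowed from fields of a `Node`.
#[inline(never)]
fn world(x: &mut u32, y: &mut u32) {
    if *x > 100 {
        *y += 1;
    }
    if *y > 1 {
        *x += 100;
    }
}

fn main() {
    let mut a = Node { value: 101, _pin: PhantomPinned };
    let mut b = Node { value: 1, _pin: PhantomPinned };
    hello(&mut a, &mut b);
    world(&mut a.value, &mut b.value);
    println!("{} {}", a.value, b.value);
}
```

The marker is zero-sized, so it changes only the auto-trait `Unpin`, not the layout of the data.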
LLVM IR generated for the `hello` function, without `noalias`:
; hello
; Function Attrs: mustprogress nofree noinline norecurse nosync nounwind willreturn memory uwtable
define internal fastcc void @_ZN1a5hello17he6f9e03264499f38E unnamed_addr #4
LLVM IR generated for the `world` function, which still has `noalias`:
; world
; Function Attrs: mustprogress nofree noinline norecurse nosync nounwind willreturn memory uwtable
define internal fastcc void @_ZN1a5world17hec9a993fc9bac6c2E unnamed_addr #4
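So in tokio's linked list, although `&mut` references to the `PointersInner` struct do not get `noalias`, defining `get_mut` functions for `prev` and `next` would still produce `noalias` references to `prev` and `next`. tokio's answer is to mark the struct `#[repr(C)]` so that it uses the C memory layout, and to define get and set functions that operate directly on raw pointers.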
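The approach can be sketched as follows. This is a simplified illustration, not tokio's exact code: the non-generic `Node` type and the `new` constructor are assumptions, and the accessors only mirror the idea of reading and writing fields through casts and offsets instead of through field references.

```rust
use std::cell::UnsafeCell;
use std::marker::PhantomPinned;
use std::ptr::{self, NonNull};

// Hypothetical node type for illustration.
struct Node {
    value: u32,
}

// `#[repr(C)]` fixes the field order, so `next` sits exactly one
// `Option<NonNull<Node>>` after `prev`.
#[allow(dead_code)] // the fields are only accessed through raw pointers
#[repr(C)]
struct PointersInner {
    prev: Option<NonNull<Node>>,
    next: Option<NonNull<Node>>,
    _pin: PhantomPinned,
}

struct Pointers {
    inner: UnsafeCell<PointersInner>,
}

impl Pointers {
    fn new() -> Self {
        Pointers {
            inner: UnsafeCell::new(PointersInner {
                prev: None,
                next: None,
                _pin: PhantomPinned,
            }),
        }
    }

    fn get_prev(&self) -> Option<NonNull<Node>> {
        // SAFETY: `prev` is the first field of the `#[repr(C)]` struct.
        unsafe { ptr::read(self.inner.get() as *const Option<NonNull<Node>>) }
    }

    fn get_next(&self) -> Option<NonNull<Node>> {
        // SAFETY: `next` is the second field and has the same type as `prev`.
        unsafe { ptr::read((self.inner.get() as *const Option<NonNull<Node>>).add(1)) }
    }

    fn set_prev(&mut self, value: Option<NonNull<Node>>) {
        // SAFETY: same layout argument as in `get_prev`.
        unsafe { ptr::write(self.inner.get() as *mut Option<NonNull<Node>>, value) }
    }

    fn set_next(&mut self, value: Option<NonNull<Node>>) {
        // SAFETY: same layout argument as in `get_next`.
        unsafe { ptr::write((self.inner.get() as *mut Option<NonNull<Node>>).add(1), value) }
    }
}

fn main() {
    let mut node = Node { value: 7 };
    let mut pointers = Pointers::new();
    pointers.set_next(Some(NonNull::from(&mut node)));
    assert!(pointers.get_prev().is_none());
    let next = pointers.get_next().expect("next was just set");
    // SAFETY: `node` is still alive and not otherwise borrowed here.
    println!("{}", unsafe { next.as_ref().value });
}
```

No `&mut Option<NonNull<Node>>` is ever created, so there is no reference to an `Unpin` field for rustc to tag with `noalias`.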