Feature gate: #![feature(strict_provenance)]
read the docs
get the stable polyfill
subtasks
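This is a tracking issue for the strict_provenance feature. This is a standard library feature that governs the following APIs:
- pointer::addr
- pointer::with_addr
- pointer::map_addr
- core::ptr::invalid
- core::ptr::invalid_mut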
IMPORTANT: This is purely a set of library APIs to make your code more clear/reliable, so that we can better understand what Rust code is actually trying to do and what it actually needs help with. It is overwhelmingly framed as a memory model because we are doing a bit of Roleplay here. We are roleplaying that this is a real memory model and seeing what code doesn't conform to it already. Then we are seeing how trivial it is to make that code "conform".
This cannot and will not "break your code" because the lang and compiler teams are wholly uninvolved with this. Your code cannot be "run under strict provenance" because there isn't a compiler flag for "enabling" it, although it would be nice to have a lint to make it easier to quickly migrate code that wants to play along.
This is an unofficial experiment to see How Bad it would be if Rust had extremely strict pointer provenance rules that require you to always dynamically preserve provenance information. Which is to say if you ever want to treat something as a Real Pointer that can be Offset and Dereferenced, there must be an unbroken chain of custody from that pointer to the original allocation you are trying to access using only pointer->pointer operations. If at any point you turn a pointer into an integer, that integer cannot be turned back into a pointer. This includes usize as ptr, transmute, type punning with raw pointer reads/writes, whatever. Just assume the memory "knows" it contains a pointer and that writing to it as a non-pointer makes it forget (because this is quite literally true on CHERI and miri, which are immediate beneficiaries of doing this).
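As a rough illustration of the "chain of custody" rule, here is a minimal sketch. The function name `read_with_custody` is made up for this example and is not part of the proposed API. The commented-out lines show the integer round trip the experiment asks you to avoid; the live code reaches the same element using only pointer->pointer operations.

```rust
/// Hypothetical helper: reads `data[index]` through a raw pointer that is
/// derived from the slice's base pointer without ever becoming an integer.
pub fn read_with_custody(data: &[u16], index: usize) -> u16 {
    assert!(index < data.len());
    let base: *const u16 = data.as_ptr();

    // Broken chain of custody (what the experiment forbids):
    //     let addr = base as usize + index * 2; // pointer -> integer
    //     let p = addr as *const u16;           // integer -> pointer
    //
    // Unbroken chain: `p` is computed from `base` by a pointer operation.
    let p = base.wrapping_add(index);

    // SAFETY: `index < data.len()`, so `p` points at a live element of `data`.
    unsafe { *p }
}
```

The address can still be looked at as an integer (for logging, hashing, comparisons); the point is only that the integer is never the thing that gets dereferenced.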
A secondary goal of this project is to try to disambiguate the many meanings of ptr as usize, in the hopes that it might make it plausible/tolerable to allow usize to be redefined to be an address-sized integer instead of a pointer-sized integer. This would allow for Rust to more natively support platforms where sizeof(size_t) < sizeof(intptr_t), and effectively redefine usize from intptr_t to size_t/ptrdiff_t/ptraddr_t (it would still generally conflate those concepts, absent a motivation to do otherwise). To the best of my knowledge this would not have a practical effect on any currently supported platforms, and just allow for more platforms to be supported (certainly true for our tier 1 platforms).
A tertiary goal of this project is to more clearly answer the question "hey what's the deal with Rust on architectures that are pretty harvard-y like AVR and WASM (platforms which treat function pointers and data pointers non-uniformly)". There is... weirdness in the language because it's difficult to talk about "some" function pointer generically/opaquely and that encourages you to turn them into data pointers and then maybe that does Wrong Things.
The mission statement of this experiment is: assume it will and must work, try to make code conform to it, smash face-first into really nasty problems that need special consideration, and try to actually figure out how to handle those situations. We want the evil shit you do with pointers to work but the current situation leads to incredibly broken results, so something has to give.
Public API
This design is roughly based on the article Rust's Unsafe Pointer Types Need An Overhaul, which is itself based on the APIs that CHERI exposes for dynamically maintaining provenance information even under Fun Bit Tricks.
The core piece that makes this at all plausible is pointer::with_addr(self, usize) -> Self which dynamically re-establishes the provenance chain of custody. Everything else introduced is sugar or alternatives to as casts that better express intent.
More APIs may be introduced as we explore the feature space.
// core::ptr
pub fn invalid<T>(addr: usize) -> *const T;
pub fn invalid_mut<T>(addr: usize) -> *mut T;
// core::pointer
pub fn addr(self) -> usize;
pub fn with_addr(self, addr: usize) -> Self;
pub fn map_addr(self, f: impl FnOnce(usize) -> usize) -> Self;
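To make the role of with_addr a bit more concrete, here is a hedged sketch of how its behaviour could be approximated with free functions on a toolchain that lacks the feature. All names (`addr_of`, `with_addr_sketch`, `map_addr_sketch`, `read_via_addr`) are assumptions made for this example, not the real API, and the `as usize` cast merely stands in for `addr()` so the sketch builds without the feature gate.

```rust
/// Hypothetical stand-in for `pointer::addr`: the address as a plain integer.
/// The result is used for arithmetic only and is never cast back to a pointer.
pub fn addr_of<T>(ptr: *const T) -> usize {
    ptr as usize
}

/// Hypothetical stand-in for `pointer::with_addr`: moves `ptr` to `addr` with
/// a pointer -> pointer offset, so the result is still derived from `ptr`.
pub fn with_addr_sketch<T>(ptr: *const T, addr: usize) -> *const T {
    let offset = (addr as isize).wrapping_sub(addr_of(ptr) as isize);
    ptr.cast::<u8>().wrapping_offset(offset).cast::<T>()
}

/// Hypothetical stand-in for `pointer::map_addr`: sugar over the two above.
pub fn map_addr_sketch<T>(ptr: *const T, f: impl FnOnce(usize) -> usize) -> *const T {
    with_addr_sketch(ptr, f(addr_of(ptr)))
}

/// Computes an element's address as an integer, then re-attaches that address
/// to the slice's base pointer instead of casting the integer to a pointer.
pub fn read_via_addr(data: &[u32], index: usize) -> u32 {
    assert!(index < data.len());
    let base = data.as_ptr();
    let target = addr_of(base) + index * std::mem::size_of::<u32>();
    let elem = with_addr_sketch(base, target);
    // SAFETY: `elem` is derived from `base` and lies inside `data`.
    unsafe { *elem }
}
```

The design point is that the integer carries only the address, while the pointer passed alongside it carries the provenance.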
Steps / History
Unresolved Questions
-
How Bad Is This?
-
How Good Is This?
-
What's Problematic (And Should Work)?
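- volatile access
- A #[repr(transparent)] OpaqueFnPtr(fn() -> ()) type in std; we need a way to talk about e.g. dlopen.
- … with something like LLVM's proposed byte type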
What's Problematic (And Might Be Impossible)?
-
APIs We Want To Add/Change?
- A lot of uses of .addr() are for alignment checks: .is_aligned(), .is_aligned_to(usize)?
- An API to make ZST alloc forging explicit: exists_zst(usize)?
- .addr() should arguably work on a DST; if you use .addr() you are ostensibly saying "I know this doesn't roundtrip".
- Explicit conveniences for low-bit tagging: .with_tag(TAG)?
- expose_addr/from_exposed_addr are slightly unfortunate names since it's not the address that gets exposed, it's the provenance. What would be better names? Please discuss on Zulip.
- It is somewhat unfortunate that addr is the short and easy name for the operation that programmers likely expect less. (Many will expect expose_addr semantics.) Maybe it should have a different name. But which name?
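For a sense of what the alignment-check and low-bit-tagging conveniences might look like, here is a minimal sketch written as free functions. The names `is_aligned_sketch`, `with_tag_sketch`, and `untag_sketch` are hypothetical, chosen only to echo the ideas listed above, and the `as usize` casts stand in for `.addr()` so the code builds without the feature gate.

```rust
use std::mem::align_of;

/// Hypothetical `is_aligned`-style check: only the address is inspected.
pub fn is_aligned_sketch<T>(ptr: *const T) -> bool {
    (ptr as usize) % align_of::<T>() == 0
}

/// Hypothetical `with_tag`-style helper: stores `tag` in the low bits that an
/// aligned pointer leaves unused, via a pointer -> pointer byte offset.
pub fn with_tag_sketch<T>(ptr: *const T, tag: usize) -> *const T {
    let mask = align_of::<T>() - 1;
    assert!(is_aligned_sketch(ptr), "pointer must be aligned before tagging");
    assert!(tag <= mask, "tag does not fit in the alignment bits");
    ptr.cast::<u8>().wrapping_add(tag).cast::<T>()
}

/// Splits a tagged pointer back into the aligned pointer and its tag.
pub fn untag_sketch<T>(tagged: *const T) -> (*const T, usize) {
    let tag = (tagged as usize) & (align_of::<T>() - 1);
    (tagged.cast::<u8>().wrapping_sub(tag).cast::<T>(), tag)
}
```

Both the tagged and untagged pointers are derived from the original pointer, so the untagged result can be dereferenced as usual; the tag is the only part that ever lives in an integer.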