8 changes: 4 additions & 4 deletions zeroize/src/aarch64.rs
@@ -1,17 +1,17 @@
//! [`Zeroize`] impls for ARM64 SIMD registers.

use crate::{Zeroize, atomic_fence, volatile_write};

use crate::{Zeroize, optimization_barrier};
use core::arch::aarch64::*;
use core::ptr;

macro_rules! impl_zeroize_for_simd_register {
($($type:ty),* $(,)?) => {
$(
impl Zeroize for $type {
#[inline]
fn zeroize(&mut self) {
volatile_write(self, unsafe { core::mem::zeroed() });
atomic_fence();
unsafe { ptr::write_bytes(self, 0, 1) };
optimization_barrier(self);
}
}
)+
179 changes: 97 additions & 82 deletions zeroize/src/lib.rs
@@ -252,14 +252,13 @@ mod x86;

use core::{
marker::{PhantomData, PhantomPinned},
mem::{MaybeUninit, size_of},
mem::MaybeUninit,
num::{
self, NonZeroI8, NonZeroI16, NonZeroI32, NonZeroI64, NonZeroI128, NonZeroIsize, NonZeroU8,
NonZeroU16, NonZeroU32, NonZeroU64, NonZeroU128, NonZeroUsize,
},
ops, ptr,
slice::IterMut,
sync::atomic,
};

#[cfg(feature = "alloc")]
@@ -299,8 +298,8 @@ where
Z: DefaultIsZeroes,
{
fn zeroize(&mut self) {
volatile_write(self, Z::default());
atomic_fence();
unsafe { ptr::write_bytes(self, 0, 1) }
optimization_barrier(self);
}
}

@@ -329,12 +328,8 @@ macro_rules! impl_zeroize_for_non_zero {
$(
impl Zeroize for $type {
fn zeroize(&mut self) {
const ONE: $type = match <$type>::new(1) {
Some(one) => one,
None => unreachable!(),
};
volatile_write(self, ONE);
atomic_fence();
*self = <$type>::MAX;
optimization_barrier(self);
}
}
)+
@@ -411,9 +406,9 @@ where
//
// The memory pointed to by `self` is valid for `size_of::<Self>()` bytes.
// It is also properly aligned, because `u8` has an alignment of `1`.
unsafe {
volatile_set((self as *mut Self).cast::<u8>(), 0, size_of::<Self>());
}
unsafe { ptr::write_bytes(self, 0, 1) };

optimization_barrier(self);
Comment on lines -414 to +411
Member

I would prefer the removal of volatile writes be done in a separate PR, so this PR does what it says on the label: replaces atomic_fence with optimization_barrier.

The removal of volatile writes removes the longstanding defense this crate has been built on, and I think that much deserves to be in the commit message (and I would also prefer it be spelled out nicely, not jammed in like "replace atomic_fence with optimization_barrier and removes volatile writes").

Member

Also I think we should probably leave the volatile writes in place unless we're using asm!. Otherwise we're relying on black_box as the only line of defense.

Member Author
@newpavlov Dec 15, 2025

> I would prefer the removal of volatile writes be done in a separate PR

Ok. Will do it tomorrow.

> Otherwise we're relying on black_box as the only line of defense.

Well. AFAIK it works without issues on the mainline compiler. Yes, we don't have any hard guarantees for it, but, as we discussed above, it applies to the asm! approach as well. IIRC black_box gets ignored on an alternative compiler (I forgot the name), but I don't think it supports the target arches in question in the first place and I am not sure how it handles volatile OPs.

We could discuss potential hardening approaches for black_box, but I would strongly prefer to concentrate on optimization_barrier doing what we want instead of piling different hacks on top of each other. In other words, I want the snippet in this comment to be practice-worthy.

Member

I think it's unacceptable for black_box to be the only line of defense

Member
@tarcieri Dec 15, 2025

> I would strongly prefer to concentrate on optimization_barrier doing what we want instead of piling different hacks on top of each other.

I also think this is a gross overstatement of the situation. This PR does not notably make the codebase smaller. What it does do is delete a lot of the SAFETY comments that describe the strategy the library uses.

Had you just changed volatile_write and volatile_set to call ptr::write_bytes, the PR itself could've included considerably fewer changes. They exist so there's a single consistent strategy used throughout the library, and as a place to document how that strategy works.

Adding a little bit of compile-time gating to implement those functions in terms of either volatile or non-volatile writes, depending on whether an asm! barrier is in place, is neither particularly complicated nor "piling different hacks on top of each other".

It's a question of "we get the guarantee from a volatile write" vs "we get the guarantee from an asm! optimization barrier" (which is, itself, perhaps overstating the situation). black_box has no guarantees. You would need to delete all the documentation that says this library has guarantees, and that's a change I don't want to make.
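
A sketch of that gating (illustrative only: `barrier_write` is a made-up name, the target list is abbreviated to two arches, and this is not code from the PR):

#[cfg(any(target_arch = "x86_64", target_arch = "aarch64"))]
#[inline(always)]
fn barrier_write<T: Copy>(dst: &mut T, src: T) {
    // Callers are expected to follow this with the asm!-based
    // optimization_barrier, so a plain write is enough here.
    *dst = src;
}

#[cfg(not(any(target_arch = "x86_64", target_arch = "aarch64")))]
#[inline(always)]
fn barrier_write<T: Copy>(dst: &mut T, src: T) {
    // No stable asm! barrier on this target: keep the volatile write so the
    // crate does not rely on black_box as the only line of defense.
    unsafe { core::ptr::write_volatile(dst, src) }
}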

Member Author

As I said, we could look into ways to harden black_box. For example, see here. We also could write:

pub fn observe<R>(val: &R) {
    // A volatile read of `*val` forces the compiler to keep earlier writes to it.
    let _ = unsafe { core::ptr::read_volatile(val) };
}

But, as expected, it results in pretty atrocious codegen. We could improve it a bit by casting the reference to [usize; N] if size allows, but the #[inline(never)] solution looks better to me. Finally, some of the leftover targets support asm! on Nightly, so we could add an unstable feature for them.
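
For reference, one possible shape of that #[inline(never)] variant (just a sketch, not the exact code being proposed):

#[inline(never)]
pub fn observe<R: ?Sized>(val: &R) {
    // Passing the pointer through black_box from a non-inlined function makes it
    // much harder for the optimizer to prove that earlier writes to `*val` are
    // dead, without the per-element volatile reads of the version above.
    core::hint::black_box(val as *const R);
}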


// Ensures self is overwritten with the `None` bit pattern. volatile_write can't be
// used because Option<Z> is not copy.
@@ -423,9 +418,7 @@ where
// self is safe to replace with `None`, which the take() call above should have
// already done semantically. Any value which needed to be dropped will have been
// done so by take().
unsafe { ptr::write_volatile(self, None) }

atomic_fence();
unsafe { ptr::write_volatile(self, None) };
}
}

@@ -438,10 +431,8 @@ impl<Z> ZeroizeOnDrop for Option<Z> where Z: ZeroizeOnDrop {}
/// [`MaybeUninit`] removes all invariants.
impl<Z> Zeroize for MaybeUninit<Z> {
fn zeroize(&mut self) {
// Safety:
// `MaybeUninit` is valid for any byte pattern, including zeros.
unsafe { ptr::write_volatile(self, MaybeUninit::zeroed()) }
atomic_fence();
*self = MaybeUninit::zeroed();
optimization_barrier(self);
}
}

@@ -456,18 +447,13 @@ impl<Z> Zeroize for MaybeUninit<Z> {
/// [`MaybeUninit`] removes all invariants.
impl<Z> Zeroize for [MaybeUninit<Z>] {
fn zeroize(&mut self) {
let ptr = self.as_mut_ptr().cast::<MaybeUninit<u8>>();
let size = self.len().checked_mul(size_of::<Z>()).unwrap();
assert!(size <= isize::MAX as usize);

// Safety:
//
// This is safe, because every valid pointer is well aligned for u8
// This is safe, because every valid pointer is well aligned for `MaybeUninit<Z>`
// and it is backed by a single allocated object for at least `self.len() * size_pf::<Z>()` bytes.
// and 0 is a valid value for `MaybeUninit<Z>`
// The memory of the slice should not wrap around the address space.
unsafe { volatile_set(ptr, MaybeUninit::zeroed(), size) }
atomic_fence();
unsafe { ptr::write_bytes(self.as_mut_ptr(), 0, self.len()) }
optimization_barrier(self);
}
}

@@ -484,16 +470,10 @@ where
Z: DefaultIsZeroes,
{
fn zeroize(&mut self) {
assert!(self.len() <= isize::MAX as usize);

// Safety:
//
// This is safe, because the slice is well aligned and is backed by a single allocated
// object for at least `self.len()` elements of type `Z`.
// `self.len()` is also not larger than an `isize`, because of the assertion above.
// The memory of the slice should not wrap around the address space.
unsafe { volatile_set(self.as_mut_ptr(), Z::default(), self.len()) };
atomic_fence();
for val in self.iter_mut() {
*val = Default::default();
}
optimization_barrier(self);
}
}

@@ -749,47 +729,6 @@ where
}
}

/// Use fences to prevent accesses from being reordered before this
/// point, which should hopefully help ensure that all accessors
/// see zeroes after this point.
#[inline(always)]
fn atomic_fence() {
atomic::compiler_fence(atomic::Ordering::SeqCst);
}

/// Perform a volatile write to the destination
#[inline(always)]
fn volatile_write<T: Copy + Sized>(dst: &mut T, src: T) {
unsafe { ptr::write_volatile(dst, src) }
}

/// Perform a volatile `memset` operation which fills a slice with a value
///
/// Safety:
/// The memory pointed to by `dst` must be a single allocated object that is valid for `count`
/// contiguous elements of `T`.
/// `count` must not be larger than an `isize`.
/// `dst` being offset by `size_of::<T> * count` bytes must not wrap around the address space.
/// Also `dst` must be properly aligned.
#[inline(always)]
unsafe fn volatile_set<T: Copy + Sized>(dst: *mut T, src: T, count: usize) {
// TODO(tarcieri): use `volatile_set_memory` when stabilized
for i in 0..count {
// Safety:
//
// This is safe because there is room for at least `count` objects of type `T` in the
// allocation pointed to by `dst`, because `count <= isize::MAX` and because
// `dst.add(count)` must not wrap around the address space.
let ptr = unsafe { dst.add(i) };

// Safety:
//
// This is safe, because the pointer is valid and because `dst` is well aligned for `T` and
// `ptr` is an offset of `dst` by a multiple of `size_of::<T>()` bytes.
unsafe { ptr::write_volatile(ptr, src) };
}
}

/// Zeroizes a flat type/struct. Only zeroizes the values that it owns, and it does not work on
/// dynamically sized values or trait objects. It would be inefficient to use this function on a
/// type that already implements `ZeroizeOnDrop`.
@@ -839,15 +778,91 @@ unsafe fn volatile_set<T: Copy + Sized>(dst: *mut T, src: T, count: usize) {
/// ```
#[inline(always)]
pub unsafe fn zeroize_flat_type<F: Sized>(data: *mut F) {
let size = size_of::<F>();
// Safety:
//
// This is safe because `size_of<T>()` returns the exact size of the object in memory, and
// `data_ptr` points directly to the first byte of the data.
unsafe {
volatile_set(data as *mut u8, 0, size);
ptr::write_bytes(data, 0, 1);
}
atomic_fence()
optimization_barrier(&data);
}

/// Observe the referenced data and prevent the compiler from removing previous writes to it.
///
/// This function acts like [`core::hint::black_box`] but takes a reference and
/// does not return the passed value.
///
/// It's implemented using the [`core::arch::asm!`] macro on target arches where `asm!` is stable,
/// i.e. `aarch64`, `arm`, `arm64ec`, `loongarch64`, `riscv32`, `riscv64`, `s390x`, `x86`, and
/// `x86_64`. On all other targets it's implemented using [`core::hint::black_box`].
///
/// # Examples
/// ```ignore
/// use core::num::NonZeroU32;
/// use zeroize::ZeroizeOnDrop;
///
/// struct DataToZeroize {
/// buf: [u8; 32],
/// pos: NonZeroU32,
/// }
///
/// struct SomeMoreFlatData(u64);
///
/// impl Drop for DataToZeroize {
/// fn drop(&mut self) {
/// self.buf = [0u8; 32];
/// self.pos = NonZeroU32::new(32).unwrap();
/// zeroize::optimization_barrier(self);
/// }
/// }
///
/// impl ZeroizeOnDrop for DataToZeroize {}
///
/// let mut data = DataToZeroize {
/// buf: [3u8; 32],
/// pos: NonZeroU32::new(32).unwrap(),
/// };
///
/// // data gets zeroized when dropped
/// ```
pub fn optimization_barrier<R: ?Sized>(val: &R) {
#[cfg(all(
not(miri),
any(
target_arch = "aarch64",
target_arch = "arm",
target_arch = "arm64ec",
target_arch = "loongarch64",
target_arch = "riscv32",
target_arch = "riscv64",
target_arch = "s390x",
target_arch = "x86",
target_arch = "x86_64",
)
))]
unsafe {
core::arch::asm!(
"# {}",
in(reg) val as *const R as *const (),
options(readonly, preserves_flags, nostack),
);
}
#[cfg(not(all(
not(miri),
any(
target_arch = "aarch64",
target_arch = "arm",
target_arch = "arm64ec",
target_arch = "loongarch64",
target_arch = "riscv32",
target_arch = "riscv64",
target_arch = "s390x",
target_arch = "x86",
target_arch = "x86_64",
)
)))]
core::hint::black_box(val);
}

/// Internal module used as support for `AssertZeroizeOnDrop`.
7 changes: 4 additions & 3 deletions zeroize/src/x86.rs
@@ -1,6 +1,7 @@
//! [`Zeroize`] impls for x86 SIMD registers

use crate::{Zeroize, atomic_fence, volatile_write};
use crate::{Zeroize, optimization_barrier};
use core::ptr;

#[cfg(target_arch = "x86")]
use core::arch::x86::*;
@@ -14,8 +15,8 @@ macro_rules! impl_zeroize_for_simd_register {
impl Zeroize for $type {
#[inline]
fn zeroize(&mut self) {
volatile_write(self, unsafe { core::mem::zeroed() });
atomic_fence();
unsafe { ptr::write_bytes(self, 0, 1) };
optimization_barrier(self);
}
}
)*
2 changes: 1 addition & 1 deletion zeroize/tests/zeroize.rs
@@ -25,7 +25,7 @@ fn non_zero() {
($($type:ty),+) => {
$(let mut value = <$type>::new(42).unwrap();
value.zeroize();
assert_eq!(value.get(), 1);)+
assert_eq!(value, <$type>::MAX);)+
};
}
