mirror of
https://github.com/danbulant/oxc
synced 2026-05-19 04:08:41 +00:00
docs(data_structures): improve docs for stack types (#8356)
Improve docs for `Stack`, `NonEmptyStack` and `SparseStack`.
parent
fb389f724a
commit
e0a09ab023
4 changed files with 83 additions and 32 deletions
@@ -1,8 +1,11 @@
//! Contains the following FILO data structures:
//! - [`Stack`]: A growable stack
//! - [`SparseStack`]: A stack that can have empty entries
//! - [`NonEmptyStack`]: A growable stack that can never be empty, allowing for more efficient
//! operations
//!
//! * [`Stack`]: A growable stack, equivalent to [`Vec`], but more efficient for stack usage (push/pop).
//! * [`NonEmptyStack`]: A growable stack that can never be empty, allowing for more efficient operations
//! (very fast `last` / `last_mut`).
//! * [`SparseStack`]: A growable stack of `Option`s, optimized for low memory usage when many entries in
//! the stack are empty (`None`).

mod capacity;
mod common;
mod non_empty;
@@ -9,44 +9,67 @@ use super::{NonNull, StackCapacity, StackCommon};

/// A stack which can never be empty.
///
/// `NonEmptyStack` is created initially with 1 entry, and `pop` does not allow removing it
/// (though that initial entry can be mutated with `last_mut`).
/// [`NonEmptyStack`] is created initially with 1 entry, and [`pop`] does not allow removing it
/// (though that initial entry can be mutated with [`last_mut`]).
///
/// The fact that the stack is never empty makes all operations except `pop` infallible.
/// `last` and `last_mut` are branchless.
/// The fact that the stack is never empty makes all operations except [`pop`] infallible.
/// [`last`] and [`last_mut`] are branchless.
///
/// The trade-off is that you cannot create a `NonEmptyStack` without allocating.
/// The trade-off is that you cannot create a [`NonEmptyStack`] without allocating,
/// and you must create an initial value for the "dummy" initial entry.
/// If that is not a good trade-off for your use case, prefer [`Stack`], which can be empty.
///
/// [`NonEmptyStack`] is usually a better choice than [`Stack`], unless either:
///
/// 1. The stack will likely never have anything pushed to it.
/// [`NonEmptyStack::new`] always allocates, whereas [`Stack::new`] does not.
/// So if stack usually starts empty and remains empty, [`Stack`] will avoid an allocation.
/// This is the same as how [`Vec`] does not allocate until you push a value into it.
///
/// 2. The type the stack holds is large or expensive to construct, so there's a high cost in having to
/// create an initial dummy value (which [`NonEmptyStack`] requires, but [`Stack`] doesn't).
///
/// [`SparseStack`] may be preferable if the type you're storing is an `Option`.
///
/// To simplify implementation, zero size types are not supported (e.g. `NonEmptyStack<()>`).
///
/// ## Design
/// Designed for maximally efficient `push`, `pop`, and reading/writing the last value on stack.
/// Designed for maximally efficient [`push`], [`pop`], and reading/writing the last value on stack
/// ([`last`] / [`last_mut`]).
///
/// The alternative would likely be to use a `Vec`. But `Vec` is optimized for indexing into at
/// The alternative would likely be to use a [`Vec`]. But `Vec` is optimized for indexing into at
/// arbitrary positions, not for `push` and `pop`. `Vec` stores `len` and `capacity` as integers,
/// so requires pointer maths on every operation: `let entry_ptr = base_ptr + index * size_of::<T>();`.
///
/// In comparison, `NonEmptyStack` contains a `cursor` pointer, which always points to last entry
/// In comparison, [`NonEmptyStack`] contains a `cursor` pointer, which always points to last entry
/// on stack, so it can be read/written with a minimum of operations.
///
/// This design is similar to `std`'s slice iterator.
/// This design is similar to [`std`'s slice iterators].
///
/// Comparison to `Vec`:
/// * `last` and `last_mut` are 1 instruction, instead of `Vec`'s 4.
/// * `pop` is 1 instruction shorter than `Vec`'s equivalent.
/// * `push` is 1 instruction shorter than `Vec`'s equivalent, and uses 1 less register.
/// Comparison to [`Vec`]:
/// * [`last`] and [`last_mut`] are 1 instruction, instead of `Vec`'s 4.
/// * [`pop`] is 1 instruction shorter than `Vec`'s equivalent.
/// * [`push`] is 1 instruction shorter than `Vec`'s equivalent, and uses 1 less register.
///
/// ### Possible alternative designs
/// 1. `cursor` could point to *after* last entry, rather than *to* it. This has advantage that `pop`
/// uses 1 less register, but disadvantage that `last` and `last_mut` are 2 instructions, not 1.
/// 1. `cursor` could point to *after* last entry, rather than *to* it. This has advantage that [`pop`]
/// uses 1 less register, but disadvantage that [`last`] and [`last_mut`] are 2 instructions, not 1.
/// <https://godbolt.org/z/xnx7YP5de>
///
/// 2. Stack could grow downwards, like `bumpalo` allocator does. This would probably make `pop` use
/// 1 less register, but at the cost that the stack can never grow in place, which would incur more
/// memory copies when the stack grows.
/// 2. Stack could grow downwards, like `bumpalo` allocator does. This would probably make [`pop`] use
/// 1 less register, but at the cost that: (a) the stack can never grow in place, which would incur
/// more memory copies when the stack grows, and (b) [`as_slice`] would have the entries in
/// reverse order.
///
/// [`push`]: NonEmptyStack::push
/// [`pop`]: NonEmptyStack::pop
/// [`last`]: NonEmptyStack::last
/// [`last_mut`]: NonEmptyStack::last_mut
/// [`as_slice`]: NonEmptyStack::as_slice
/// [`Stack`]: super::Stack
/// [`Stack::new`]: super::Stack::new
/// [`SparseStack`]: super::SparseStack
/// [`std`'s slice iterators]: std::slice::Iter
pub struct NonEmptyStack<T> {
/// Pointer to last entry on stack.
/// Points *to* last entry, not *after* last entry.
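
The cursor design described in the doc comment above can be illustrated with a small sketch. This is not oxc's implementation: `CursorStack` is a hypothetical, fixed-capacity, `u32`-only type invented for this example, and it assumes `pop` simply refuses to remove the initial entry. It only shows why keeping a pointer *to* the last entry makes `last` / `last_mut` a plain dereference with no emptiness check and no index arithmetic.

```rust
/// Hypothetical fixed-capacity stack, for illustration only (not oxc's code).
struct CursorStack {
    /// Pointer to the first (initial) entry.
    start: *mut u32,
    /// Pointer *to* the last entry, as in the doc comment above.
    cursor: *mut u32,
    /// One past the end of the buffer; used only for the capacity check.
    end: *mut u32,
    /// Owns the allocation. Never accessed directly after construction.
    _buf: Vec<u32>,
}

impl CursorStack {
    /// Always allocates, and requires an initial value.
    fn new(initial: u32, capacity: usize) -> Self {
        assert!(capacity >= 1, "needs room for the initial entry");
        let mut buf: Vec<u32> = Vec::with_capacity(capacity);
        let start = buf.as_mut_ptr();
        // SAFETY: the buffer holds at least `capacity >= 1` entries, so writing
        // the first slot and computing `start + capacity` are both in bounds.
        unsafe {
            start.write(initial);
            let end = start.add(capacity);
            Self { start, cursor: start, end, _buf: buf }
        }
    }

    /// Infallible: the stack is never empty, so `cursor` is always valid.
    fn last(&self) -> &u32 {
        // SAFETY: `cursor` always points to an initialized entry.
        unsafe { &*self.cursor }
    }

    fn last_mut(&mut self) -> &mut u32 {
        // SAFETY: as for `last`; `&mut self` guarantees exclusive access.
        unsafe { &mut *self.cursor }
    }

    fn push(&mut self, value: u32) {
        // SAFETY: `cursor + 1` is at most one past the end of the buffer,
        // and we only write to it after checking it is in bounds.
        unsafe {
            let next = self.cursor.add(1);
            assert!(next < self.end, "this sketch has a fixed capacity");
            next.write(value);
            self.cursor = next;
        }
    }

    /// Returns `None` rather than removing the initial entry.
    fn pop(&mut self) -> Option<u32> {
        if self.cursor == self.start {
            return None;
        }
        // SAFETY: `cursor > start`, so reading it and stepping back one entry
        // stays inside the buffer.
        unsafe {
            let value = self.cursor.read();
            self.cursor = self.cursor.sub(1);
            Some(value)
        }
    }
}
```

A real implementation would also need to grow the buffer and support arbitrary `T`; both are left out here to keep the pointer handling visible.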
@@ -2,12 +2,16 @@ use super::{NonEmptyStack, Stack};

/// Stack which is sparsely filled.
///
/// Functionally equivalent to a stack implemented as `Vec<Option<T>>`, but more memory-efficient
/// Functionally equivalent to [`NonEmptyStack<Option<T>>`], but more memory-efficient
/// in cases where majority of entries in the stack will be empty (`None`).
///
/// It has the same advantages as [`NonEmptyStack`] in terms of [`last`] and [`last_mut`] being
/// infallible and branchless, and with very fast lookup (without any pointer maths).
/// [`SparseStack`]'s advantage over [`NonEmptyStack`] is less memory usage for empty entries (`None`).
///
/// Stack is initialized with a single entry which can never be popped off.
/// If `Program` has a entry on the stack, can use this initial entry for it. Get value for `Program`
/// in `exit_program` visitor with `SparseStack::take_last` instead of `SparseStack::pop`.
/// in `exit_program` visitor with [`take_last`] instead of [`pop`].
///
/// The stack is stored as 2 arrays:
/// 1. `has_values` - Records whether an entry on the stack has a value or not (`Some` or `None`).
@@ -19,12 +23,19 @@ use super::{NonEmptyStack, Stack};
///
/// e.g. if `T` is 24 bytes, and 90% of stack entries have no values:
/// * `Vec<Option<T>>` is 24 bytes per entry (or 32 bytes if `T` has no niche).
/// * `NonEmptyStack<Option<T>>` is same.
/// * `SparseStack<T>` is 4 bytes per entry.
///
/// When the stack grows and reallocates, `SparseStack` has less memory to copy, which is a performance
/// win too.
///
/// To simplify implementation, zero size types are not supported (`SparseStack<()>`).
///
/// [`last`]: SparseStack::last
/// [`last_mut`]: SparseStack::last_mut
/// [`take_last`]: SparseStack::take_last
/// [`pop`]: SparseStack::pop
/// [`NonEmptyStack<Option<T>>`]: NonEmptyStack
pub struct SparseStack<T> {
has_values: NonEmptyStack<bool>,
values: Stack<T>,
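
The two-array layout named in the struct above (`has_values` plus `values`) can be sketched in safe Rust. This is a hedged illustration, not the crate's code: `SparseSketch` is a hypothetical name, plain `Vec`s stand in for `NonEmptyStack<bool>` and `Stack<T>`, and the method signatures and the choice to start with an empty (`None`) initial entry are assumptions made for the example.

```rust
/// Hypothetical sketch of the two-array layout; `Vec` stands in for both inner stacks.
struct SparseSketch<T> {
    /// One flag per stack entry. Never empty: starts with the initial entry.
    has_values: Vec<bool>,
    /// Only the payloads of `Some` entries, in push order.
    values: Vec<T>,
}

impl<T> SparseSketch<T> {
    /// Assumption: the initial entry starts out empty (`None`).
    fn new() -> Self {
        Self { has_values: vec![false], values: Vec::new() }
    }

    fn push(&mut self, value: Option<T>) {
        self.has_values.push(value.is_some());
        // A `None` entry costs only its 1-byte flag; nothing is stored in `values`.
        if let Some(v) = value {
            self.values.push(v);
        }
    }

    fn pop(&mut self) -> Option<T> {
        assert!(self.has_values.len() > 1, "initial entry cannot be popped");
        if self.has_values.pop().unwrap() { self.values.pop() } else { None }
    }

    fn last(&self) -> Option<&T> {
        // `has_values` is never empty, so this `unwrap` cannot fail.
        if *self.has_values.last().unwrap() { self.values.last() } else { None }
    }

    /// Takes the value out of the last entry, leaving the entry itself on the stack.
    fn take_last(&mut self) -> Option<T> {
        let flag = self.has_values.last_mut().unwrap();
        // `mem::take` returns the old flag and resets it to `false`.
        if std::mem::take(flag) { self.values.pop() } else { None }
    }
}
```

Under this layout, the doc comment's example (24-byte `T`, 90% of entries empty) works out to roughly 1 + 0.1 × 24 = 3.4 bytes per entry on average, which is consistent with the "4 bytes per entry" figure quoted above.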
@@ -12,24 +12,38 @@ use super::{NonNull, StackCapacity, StackCommon};
/// If a non-empty stack is viable for your use case, prefer [`NonEmptyStack`], which is cheaper for
/// all operations.
///
/// [`NonEmptyStack`] is usually the better choice, unless:
/// 1. You want `new()` not to allocate.
/// 2. Creating initial value for `NonEmptyStack::new()` is expensive.
/// [`NonEmptyStack`] is usually the better choice, unless either:
///
/// 1. The stack will likely never have anything pushed to it.
/// [`NonEmptyStack::new`] always allocates, whereas [`Stack::new`] does not.
/// So if stack usually starts empty and remains empty, [`Stack`] will avoid an allocation.
/// This is the same as how [`Vec`] does not allocate until you push a value into it.
///
/// 2. The type the stack holds is large or expensive to construct, so there's a high cost in having to
/// create an initial dummy value (which [`NonEmptyStack`] requires, but [`Stack`] doesn't).
///
/// To simplify implementation, zero size types are not supported (`Stack<()>`).
///
/// ## Design
/// Designed for maximally efficient `push`, `pop`, and reading/writing the last value on stack
/// (although, unlike [`NonEmptyStack`], `last` and `last_mut` are fallible, and not branchless).
/// Designed for maximally efficient [`push`], [`pop`], and reading/writing the last value on stack
/// ([`last`] / [`last_mut`]). Although, unlike [`NonEmptyStack`], [`last`] and [`last_mut`] are
/// fallible, and not branchless. So [`Stack::last`] and [`Stack::last_mut`] are a bit more expensive
/// than [`NonEmptyStack`]'s equivalents.
///
/// The alternative would likely be to use a `Vec`. But `Vec` is optimized for indexing into at
/// The alternative would likely be to use a [`Vec`]. But `Vec` is optimized for indexing into at
/// arbitrary positions, not for `push` and `pop`. `Vec` stores `len` and `capacity` as integers,
/// so requires pointer maths on every operation: `let entry_ptr = base_ptr + index * size_of::<T>();`.
///
/// In comparison, `Stack` uses a `cursor` pointer, so avoids these calculations.
/// This is similar to how `std`'s slice iterators work.
/// In comparison, [`Stack`] uses a `cursor` pointer, so avoids these calculations.
/// This is similar to how [`std`'s slice iterators] work.
///
/// [`push`]: Stack::push
/// [`pop`]: Stack::pop
/// [`last`]: Stack::last
/// [`last_mut`]: Stack::last_mut
/// [`NonEmptyStack`]: super::NonEmptyStack
/// [`NonEmptyStack::new`]: super::NonEmptyStack::new
/// [`std`'s slice iterators]: std::slice::Iter
pub struct Stack<T> {
// Pointer to *after* last entry on stack.
cursor: NonNull<T>,
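
The doc comment above compares `Stack::new` to how `Vec` defers allocation until the first push, and notes that `last` is fallible when a stack may be empty. Both behaviours can be observed on `std`'s `Vec` itself. The sketch below uses only `Vec`, not the crate's `Stack`, and `lazy_alloc_demo` is a hypothetical helper name.

```rust
/// Returns (capacity before any push, whether `last` was `None` while empty,
/// capacity after one push). Uses `Vec` only, as a stand-in for the behaviour
/// the doc comment attributes to `Stack`.
fn lazy_alloc_demo() -> (usize, bool, usize) {
    let mut v: Vec<u64> = Vec::new();
    // `Vec::new` does not allocate, so capacity is 0 for a non-zero-sized type.
    let capacity_before = v.capacity();
    // On a possibly-empty stack, `last` has to return an `Option`.
    let last_was_none = v.last().is_none();
    // The first push triggers the allocation.
    v.push(1);
    (capacity_before, last_was_none, v.capacity())
}
```

A never-empty stack gives up the first property (it must allocate up front to hold its initial entry) in exchange for dropping the `Option` in the second.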