Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature std compatibility #13

Draft
wants to merge 56 commits into
base: main
Choose a base branch
from

Conversation

cehteh
Copy link
Contributor

@cehteh cehteh commented Jan 23, 2025

Note this is work in progress and contains breaking changes.

This is a very early draft. method names and semantic may change.
* introduces a helper union to fix simplify alignment calculations
  no need for cmp and phantom data anymore
* add simple testcase that triggered the miri issue
* change end_ptr_atomic_mut(), using the rewritten start_ptr()
These methods have the same API/Semantic than the std::Vec methods.
They are useful when one wants to extend a HeaderVec in place in some complex way
like for example appending chars to a u8 headervec with `encode_utf8()`
Having a HeaderVec being zero length is mostly useless because it has to reallocated instanly
when anything becomes pushed. This clearly should be avoided!

Nevertheless supporting zero length takes out a corner-case and a potential panic and removes
the burden for users explicitly ensuring zero length HeaderVecs don't happen in
practice. Generally improving software reliability.
Since we always point to a valid allocation we can use NonNull here.
This will benefit from the niche optimization:
 size_of::<HeaderVec<H,T>>() == size_of::<Option<HeaderVec<H,T>>>

Also adds a test to verfiy that HeaderVec are always lean and niche optimized.
These methods have the same API/Semantic than the std::Vec methods.
They are useful when one wants to extend a HeaderVec in place in some complex way
like for example appending chars to a u8 headervec with `encode_utf8()`
Having a HeaderVec being zero length is mostly useless because it has to reallocated instanly
when anything becomes pushed. This clearly should be avoided!

Nevertheless supporting zero length takes out a corner-case and a potential panic and removes
the burden for users explicitly ensuring zero length HeaderVecs don't happen in
practice. Generally improving software reliability.
For backwards compatibility this is not (yet) enabled by default.
Will be used for all API's that require functionality provided by the rust stdlib.
This makes it comply with the std Vec's API
This is a safety measure, normally resize_cold() wouldn't been called when here is nothing to
do. But we want to ensure that if this ever happens we don't run into a panicking corner case.
This introduce a breaking change starting to modifying all methods that
indicate a realloc by returning a Option(*const()) into stdlib compatible
signatures and variants taking a closure for the fixup.

This commits starts with reserve and shrink methods, more in the following commits.

The `WeakFixupFn` is required for Drain and other iterators which do significant work
in the destructor (possibly reallocating the headervec there). The closure
can be passed along and called when required.

Compatibility note:
When it is not trivially possible to refactor fixup functionality to a closure then
old code can be migrated by:

  let maybe_realloc: Option<*const ()> = {
      let mut opt_ptr = None;
      hv.reserve_with_weakfix(&mut self, 10000, |ptr| opt_ptr = Some(ptr));
      opt_ptr
  };

  // now maybe_realloc is like the old Option<*const ()> return

  if let Some(ptr) = maybe_realloc {
        // fixup code using ptr
  }

Note 2:
do we want legacy wrapper functions that have the old behavior eg.

  #[deprecated("upgrade to the new API")]
  pub fn reserve_legacy(&mut self, additional: usize) -> Option<*const ()> {
      let mut opt_ptr = None;
      self.reserve_with_weakfix(additional, |ptr| opt_ptr = Some(ptr));
      opt_ptr
  };
@cehteh cehteh force-pushed the feature-std-compatibility branch from 57058bf to 9775e59 Compare January 23, 2025 16:48
These are mostly the std Vec compatible From impls. By default this creates `HeaderVec<(), T>`
Additionally when one wants to pass a header one can do that by a `WithHeader(H,T)` tuple
struct. The later is mostly for convenience.

I took my liberty to introduce my xmacro crate here. The macro expands to ~120 lines of code.
When this dependency is not wanted it could be replaced with the expanded code.
This simplifies the `From` impls and should generally be more useful.
We could enabled these in non-std envorinments since HeaderVec depends on alloc, but i delay
the decision about this for now.
This removes the xmacro in favor of a generic From implementation for
`H: Default` and data constructed from `AsRef<[T]>`
@cehteh cehteh force-pushed the feature-std-compatibility branch from a5cef85 to 09d6827 Compare January 24, 2025 13:06
Software that uses this crate in a no_std environment will now have to clear the `std` flag
manually.

Rationale:

Creating documentation and testing things that require `std` would require a lot conditional
compilation tricks. Things would easily go under the Radar when buildiong doc and running
tests with `no_std` being the default.

Most users likely expect the `std` feature to be enabled by default.

Still if it this is a problem we can keep `no_std` being the default with some effort.
This add the drain functionality similar to std Vec's drain to HeaderVec.
The `with_weakfix()` things are not needed for Drain (I was wrong in a earlier commit message)
but they will be required for upcoming Splice functionality.

Since vec::Drain depends on a few nightly features internally but we want to stay compatible
with stable a few things are backported from nightly in `future_slice`. OTOH we can already
stabilize Drain::keep_rest().

Most code was taken from std::vec and minimally adapted to work for HeaderVec.
@cehteh cehteh force-pushed the feature-std-compatibility branch from 89d6c86 to a421e3f Compare January 30, 2025 21:57
cehteh added 21 commits January 30, 2025 22:58
These are prerequisite for 'impl Extend' which is prerequisite for splicing, coming soon.
This should make no difference, but the upcoming Slice needs to use it twice so we can't
pass by value.
* add tail_len() method
upcoming splice needs this, otherwise incrementing the tail_start would require decrementing
tail_len as well.
HeaderVec does not support ZST's, this will give a compiletime error when one tries to do so.
Before it was a Divide by Zero runtime error in the offset calculation.

Remove the ZST code from Drain too (was some remains from std Drain)

Its unlikely that supporting ZST's in HeaderVec makes any sense. Forbiding this altogether
make this clear. When someone comes up with a real use case this may be reviewed.
This implements splicing. This allows for insert/replace operations in HeaderVecs.

The semantic is the same as the stdlib splice. But we use a slightly different algorithm
internally:
* The stdlib Splice collects excess replace_with elements in a temporary vector and later
  fills this in. in some/worst cases this may cause the tail to be moved arouind two times.
* Our implementation pushes objects from `replace_with` directly onto the source HeaderVec
  If tail elements would be overwritten they are stashed on a temporary vector.
  Only these stashed elements are moved twice, the remainder of the tail is moved only once.

Note: moving elements because of reallocation is not accounted here. In both implementations
this can happen equally bad. The stdlib can mitigate that by this by some unstable features.
We try hard to do it as best as possible.

Benchmarks will follow.
This shows that splice has to be optimized
Off-By-One error when rouding to the next offset.
I totally missed that a bug slipped into the offset calculation because
I only tested with u8 and i32 data.
My initial idea stashing the tail aside didnt worked out as well as i hoped for.
Also adds a good deal of tests generated by xmacros. Some may be removed later, but I
tried do cover some corner cases (begin/middle/end, larger/same/shorter replacements)

Since we using a a lot unsafe code eventually all code paths should be checked
(see cargo mutants). The existing tests passing under miri.
This adds a lot benchmarks that compares HeaderVec std Vec. Results show that we are in the
same ballpark than std Vec.

Also enables debug symbols for bench builds as this is quasi-required for some profilers and
flamegraphs.
These public but undocumented and unused, i added them in hindsight but probably they are not
required. Otherwise they could be re-added later.
@cehteh
Copy link
Contributor Author

cehteh commented Feb 20, 2025

The most needed (by me) things are now done: Drain and Splice iterators.

This branch fails the 'no-std' check. How shall that be resolved?

from_header_slice() clones the elements from a slice, in many cases this wont be optimal.
from_header_elements() fixes this by consuming elements for a HeaderVec from a IntoIterator
This adds tests/mutants.rs which was created by githup copilot assistance.

Note:
This file is partially is generated with github copilot assistance.
The *only* objective here is to generate coverage over all code paths to
make 'cargo mutants' pass. Then the main test suite should pass
'cargo +nightly miri test'. Unless otherwise noted the tests here are
not extensively reviewed for semantic correctness. Eventually human
reviewed tests here should be moved to other unit tests.

Hard to test side effects and functions that are trivially correct and
are marked with #[mutants::skip].
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant