Panic-safety issue with `Zip` specializations #137255

steffahn · 2025-02-19T04:48:48Z

while trying to come up with a more precise formulation of TrustedRandomAccess properties, I wondered

what should size_hint’s behavior be on subsequent next_back calls?
- answer: probably it should keep decrementing size by 1 for each call; i.e. length changes at the back via next_back are tracked in the size information, but __iterator_get_unchecked has no impact on the size
then… what should be the size_hint/size/len change when next_back is called but panics!?

That’s where I took a detour reviewing all the relevant code that uses it. And sure enough, this case wasn’t consistently handled.

E.g. this section

rust/library/core/src/iter/adapters/zip.rs

Lines 374 to 398 in f44efbf

    
           if A::MAY_HAVE_SIDE_EFFECT || B::MAY_HAVE_SIDE_EFFECT { 
        
               let sz_a = self.a.size(); 
        
               let sz_b = self.b.size(); 
        
               // Adjust a, b to equal length, make sure that only the first call 
        
               // of `next_back` does this, otherwise we will break the restriction 
        
               // on calls to `self.next_back()` after calling `get_unchecked()`. 
        
               if sz_a != sz_b { 
        
                   let sz_a = self.a.size(); 
        
                   if A::MAY_HAVE_SIDE_EFFECT && sz_a > self.len { 
        
                       for _ in 0..sz_a - self.len { 
        
                           // since next_back() may panic we increment the counters beforehand 
        
                           // to keep Zip's state in sync with the underlying iterator source 
        
                           self.a_len -= 1; 
        
                           self.a.next_back(); 
        
                       } 
        
                       debug_assert_eq!(self.a_len, self.len); 
        
                   } 
        
                   let sz_b = self.b.size(); 
        
                   if B::MAY_HAVE_SIDE_EFFECT && sz_b > self.len { 
        
                       for _ in 0..sz_b - self.len { 
        
                           self.b.next_back(); 
        
                       } 
        
                   } 
        
               } 
        
           }

containing in particular

rust/library/core/src/iter/adapters/zip.rs

Lines 386 to 387 in f44efbf

    
           self.a_len -= 1; 
        
           self.a.next_back();

– whilst being conditional’d under logic based on self.a.size() –
for sure looks like it assumes that a panicking next_back call will always at least decrement the size.

But – funnily enough – the very same code section lives in a next_back impl – at the beginning of that impl – being code that can panic but up to this point nothing happened that would touch self.index or self.len, the information in Zip that determines size_hint.

rust/library/core/src/iter/adapters/zip.rs

Lines 335 to 338 in f44efbf

    
           fn size_hint(&self) -> (usize, Option<usize>) { 
        
               let len = self.len - self.index; 
        
               (len, Some(len)) 
        
           }

So okay… this is potentially bad… well… it in fact is bad:

use std::{
    iter::zip,
    panic::{catch_unwind, resume_unwind, AssertUnwindSafe},
};

fn main() {
    let mut panic = true;
    let a = [42, 1337];
    let a2 = zip(
        a.iter().map(|i| {
            println!("reading item {i:>11} from slice");
            if panic {
                panic = false;
                resume_unwind(Box::new(()))
            }
        }),
        &[(); 1],
    ); // inner `zip` of length `1`, but `a` has length `2`:
    // so here inner.len == 1, inner.a_len == 2, inner.a.size() == 2

    let mut i = zip(a2, &[(); 0]); // outer `zip` of length `0`, but `a2` has length `1`
    // so here outer.len == 0, outer.a_len == 1, outer.a.size() == 1

    catch_unwind(AssertUnwindSafe(|| i.next_back())).ok(); // outer’s `next_back` tries to shorten outer.a by 1 element first
    // based on outer.a.size() == inner.size() == 1 > 0 == outer.len
    // which decrements `outer.a_len from 1 to 0`, then recurses into `inner`:

    // inner’s `next_back` tries to shorten inner.a by 1 element first
    // based on inner.a.size() == 2 > 1 == inner.len
    // which which decrements `inner.a_len from 2 to 1`, then calls `next_back` on the `Map<slice::Iter>`
    // shortening the slice iterator by 1 element, now `inner.a.size()` is 1
    // then panicking from the `Map`

    // panic is caught. NEW STATE:
    // ===========================
    // inner.len == 1, inner.a_len == 1, inner.a.size() == 1
    // outer.len == 0, outer.a_len == 0, outer.a.size() == … well … inner.len == 1

    // (note that all .index fields are untouched and still 0, else more accurately
    // the formula would be outer.a.size() == inner.len - inner.index)

    // the IMPORTANT effect: outer.a.size() is unchanged, still larger than outer.len.

    i.next_back(); // <- next line of code, one more `next_back`, no more `panic` necessary…
    // outer’s `next_back` tries to shorten outer.a by 1 element first
    // based on outer.a.size() == inner.size() == 1 > 0 == outer.len
    //                                         !!!!!!!!!!!!!!!!!!!!
    // which decrements `outer.a_len from 0 to …INTEGER UNDERFLOWS… to usize::MAX`, then recurses into `inner`:
    //                                         !!!!!!!!!!!!!!!!!!!!

    // inner’s `next_back` needs no further shortening, based on inner.a.size() == 1 == inner.b.size()
    // (which is the so-far untouched iterator over &[(); 1],)
    // so we reach the main logic which decrements inner.len (1 -> 0) and inner.a_len (1 -> 0),
    // and successfully reads the items at position 0

    // NEW STATE:
    // ==========
    // inner.len == 0, inner.a_len == 0, inner.a.size() == 1 [inner.b.size() == 1]
    // outer.len == 0, outer.a_len == usize::MAX, outer.a.size() == inner.len == 0

    // now … the `Zip` is empty (`outer.len == 0`) – okay really it was never not empty but anyway
    // we poll it from the front now, reaching the side-effects simulation code (linked below)

    println!("-----------------------------------");
    i.next(); // each of these now does the action of
/*          self.index += 1;
            self.len += 1;
            // match the base implementation's potential side effects
            // SAFETY: we just checked that `i` < `self.a.len()`
            unsafe {
                self.a.__iterator_get_unchecked(i); // `i` is previous value of `.index`
            }
*/  // which means for the first call, we get outer.len (0 -> 1) and outer.index(0 -> 1)
    // notably, this code is always executed since the overflow we had
    // disarmed the `self.index < self.a_len` conditional – which is always true with `outer.a_len == usize::MAX`
    // so this time we access `__iterator_get_unchecked(0)` which accesses the 42 again

    i.next(); // and this time it accesses `__iterator_get_unchecked(1)` which accesses the 1337 again
    i.next(); // and this time it accesses `__iterator_get_unchecked(2)` which is out of bounds for the slice :-/
}

(side-effect simulation code mentioned above)

rust/library/core/src/iter/adapters/zip.rs

Lines 318 to 329 in f44efbf

    
           } else if A::MAY_HAVE_SIDE_EFFECT && self.index < self.a_len { 
        
               let i = self.index; 
        
               // as above, increment before executing code that may panic 
        
               self.index += 1; 
        
               self.len += 1; 
        
               // match the base implementation's potential side effects 
        
               // SAFETY: we just checked that `i` < `self.a.len()` 
        
               unsafe { 
        
                   self.a.__iterator_get_unchecked(i); 
        
               } 
        
               None 
        
           } else {

output:

reading item        1337 from slice
reading item          42 from slice
-----------------------------------
reading item          42 from slice
reading item        1337 from slice
reading item   957756800 from slice

And here is a playground version that runs this i.next() in a longer loop up to segfault

@rustbot label T-libs, A-iterators, A-specialization, I-unsound

The text was updated successfully, but these errors were encountered:

steffahn added the C-bug Category: This is a bug. label Feb 19, 2025

m-ou-se added P-high High priority and removed I-prioritize Issue: Indicates that prioritization has been requested for this issue. labels Feb 19, 2025

the8472 self-assigned this Feb 19, 2025

jieyouxu removed the needs-triage This issue may need triage. Remove it if it has been sufficiently triaged. label Feb 19, 2025

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Panic-safety issue with `Zip` specializations #137255

Panic-safety issue with `Zip` specializations #137255

steffahn commented Feb 19, 2025 •

edited

Loading

Panic-safety issue with Zip specializations #137255

Panic-safety issue with Zip specializations #137255

Comments

steffahn commented Feb 19, 2025 • edited Loading

Panic-safety issue with `Zip` specializations #137255

Panic-safety issue with `Zip` specializations #137255

steffahn commented Feb 19, 2025 •

edited

Loading