Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Tracking Issue for SyncUnsafeCell #95439

Open
1 of 3 tasks
Tracked by #21
m-ou-se opened this issue Mar 29, 2022 · 21 comments
Open
1 of 3 tasks
Tracked by #21

Tracking Issue for SyncUnsafeCell #95439

m-ou-se opened this issue Mar 29, 2022 · 21 comments
Labels
C-tracking-issue Category: An issue tracking the progress of sth. like the implementation of an RFC T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.

Comments

@m-ou-se
Copy link
Member

m-ou-se commented Mar 29, 2022

Feature gate: #![feature(sync_unsafe_cell)]

This is a tracking issue for SyncUnsafeCell.

Public API

// core::cell

#[repr(transparent)]
pub struct SyncUnsafeCell<T: ?Sized> {
    value: UnsafeCell<T>,
}

unsafe impl<T: ?Sized + Sync> Sync for SyncUnsafeCell<T> {}

// Identical interface as UnsafeCell:

impl<T> SyncUnsafeCell<T> {
    pub const fn new(value: T) -> Self;
    pub const fn into_inner(self) -> T;
}

impl<T: ?Sized> SyncUnsafeCell<T> {
    pub const fn get(&self) -> *mut T;
    pub const fn get_mut(&mut self) -> &mut T;
    pub const fn raw_get(this: *const Self) -> *mut T;
}

impl<T> From<T> for SyncUnsafeCell<T>;
impl<T: Default> Default for SyncUnsafeCell<T>;
impl<T: ?Sized> Debug for SyncUnsafeCell<T>;

Steps / History

Unresolved Questions

  • Name: SyncUnsafeCell? RacyUnsafeCell? RacyCell? UnsafeSyncCell? ..?
  • Should it be Sync only if T is Sync, or should it unconditionally be Sync?
@m-ou-se m-ou-se added T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. C-tracking-issue Category: An issue tracking the progress of sth. like the implementation of an RFC labels Mar 29, 2022
@bstrie
Copy link
Contributor

bstrie commented Mar 29, 2022

The arguments in favor of having such a type are contained in the (very long) thread at #53639 .

TL;DR: static mut makes it way too easy to get a mutable reference to a static, which is wildly unsound in the face of concurrency, and having users write their own versions of this type is very error prone (see #53639 (comment) for one example). This is an unsafe pattern that should be encapsulated by the stdlib, and might someday lead to the ability to deprecate static mut.

@josephlr
Copy link
Contributor

This would be a great addition to the standard library. Most OS-level crates I've worked on eventually have to use static mut or implement an abstraction like this if they need to have static mutable memory (such as a bootstrap stack or GDT). One of the downsides of static mut is that doing pointer math with addresses to static mut variables has to be marked unsafe. Even addr_of!(STATIC_VAR) needs an unsafe annotation.

This abstractions makes it so raw pointer math would be safe, while use of the pointee to would be unsafe.

A concrete example is from @phil-opp's blog "Writing an OS in Rust" where static mut is used for the TSS Double Fault stack.

@Pointerbender
Copy link
Contributor

Unresolved Question: Should it be Sync only if T is Sync, or should it unconditionally be Sync?

I think the former might be the best, if it's unconditionally Sync then it would be possible to wrap !Send types (e.g. Rc or thread local storage values) in it and make it safe to share these between threads. At the very least it should be Sync if T is Send.

@Raekye
Copy link
Contributor

Raekye commented Apr 13, 2022

  • Should it be Sync only if T is Sync, or should it unconditionally be Sync?

Conceptually, I think T should be required to be Sync. And while we're at it, shouldn't it also require T: Send at the very least, as Pointerbender mentioned?

In practice, I commonly deal with types containing raw pointers (coming from bindgen), which are (very un-ergonomically) !Send and !Sync. So unfortunately, I'll have to continue using my own implementation which is unconditionally Sync, because of raw pointers being unconditionally !Send and !Sync.

@traviscross
Copy link
Contributor

There has been previous discussion that if a new UnsafeCell type were to be added, that it should be both Sync and Copy. Does that have any bearing on this issue?

In particular, it seems as though stabilizing this without a Copy impl may complicate or preclude adding one later for the reasons that have made #55207 and #25053 challenging 1. If we were to want a version of UnsafeCell that implements Copy, is it better to have both SyncUnsafeCell and CopyUnsafeCell or better to have one RawUnsafeCell (a.k.a. (farcically) VeryUnsafeCell, SuperUnsafeCell, etc.) that implements both and from which all others can be derived?

As background:

@SimonSapin proposed this primitive here: #55207 (comment) #53639 (comment)

@aturon discussed adding a separate Copy version of UnsafeCell here: #25053 (comment)

@RalfJung analyzed the situation in some detail: #25053 (comment) #55207 (comment)

cc #55207 #25053 @jamesmunns

Footnotes

  1. That is, UnsafeCell could have been Copy originally, but adding the impl now is hard, just as UnsafeCell could have been Sync originally, but at this point we're instead adding a new struct.

@jamesmunns
Copy link
Member

jamesmunns commented Jun 25, 2022

In general, in my uses of UnsafeCell-made-sync, which are typically data structures that I am managing thread safety manually, e.g. for thread safe queues or allocators or such, I very much would not want it to be Copy. I can hunt down some concrete examples if it would be helpful, but I have a lot of UnsafeCell<MaybeUninit<T>>s in my code for reasons.

I'm not certain if the proposal was something like:

impl<T: Copy> Copy for NewUnsafeCell<T> {}

Which might be more palatable, but it would probably make me nervous that I might ACCIDENTALLY make my types copy, when I really don't want them to be, for example if I happened to use some T: Copy data in my structures.

I think a lot of mine (and maybe others?) desire for a Copy version of UnsafeCell has been abated by the fact that UnsafeCell::new() is now const, and Mara's "Const Single" pattern works, e.g.

const SINGLE: UnsafeCell<NonCopyType> = UnsafeCell::new(NonCopyType::const_new());
let arr: [UnsafeCell<NonCopyType>; 64] = [SINGLE; 64];

There might be others who have that desire! But you pinged me, and I can't say I've ever wanted it, outside of initializers, where the "Const Single" pattern works reasonably well, even if it isn't completely widely known.

Edit:

As a very quick example, bbqueue uses an UnsafeCell<MaybeUninit<[u8; N]>> for a backing storage.

c.f.: https://github.com/jamesmunns/bbqueue/blob/f73423c0b1c5fe04723e5b5bd57d1a44ff106473/core/src/bbbuffer.rs#L22-L54

I feel like it would be a footgun to make this structure Copy, even if MaybeUninit<[u8; N]> could be, and I think it would, given that [u8; N] is Copy.

@SimonSapin
Copy link
Contributor

It looks like I did write a few years ago about a possible SomethingUnsafeCell that would both Sync and Copy, but now I agree with @jamesmunns that it’s probably not a good idea.

SyncUnsafeCell exists mostly to use in static to replace static mut where unintentional accesses can cause UB, whereas always manipulating raw pointers makes it easier to see where they are dereferenced. Making it also Copy would reintroduce that risk of unintentional (and not necessarily properly synchronized) access to a shared static.

@jamesmunns
Copy link
Member

Oh. I just now realized I opened #55207 in 2018, which proposed to make UnsafeCell Copy/Clone (oh how time flies!)

As I mentioned downthread: #55207 (comment), I definitely agree that my original goal was mainly around const/array initialization, which is solved by (better IMO) by #49147.

@RalfJung
Copy link
Member

Which might be more palatable, but it would probably make me nervous that I might ACCIDENTALLY make my types copy, when I really don't want them to be, for example if I happened to use some T: Copy data in my structures.

That would not happen. You still have to add derive(Copy) to your type, which of course you won't.

I really don't understand the problem here.

@jamesmunns
Copy link
Member

In the example I gave, MaybeUninit<[u8; 64]> is Copy because [u8; 64] is Copy, and MaybeUninit<T> is Copy when T is.

Therefore without adding derive(Copy), SuperUnsafeCell<MaybeUninit<[u8; 64]>> is now Copy, which was not previously the case for UnsafeCell<MaybeUninit<[u8; 64]>>, which I might never want to be copy, because it serves as an inner-mutability data store, and any unintentional copies would be a logic error.

@RalfJung
Copy link
Member

RalfJung commented Jun 26, 2022

I assumed you were talking about a type you defined yourself.

Interior mutability data stores can be perfectly fine with Copy (making Cell: Copy would be sound), so it seems like a bad idea to disallow this for everyone and make it outright impossible to have an interior mutable type that is Copy. That's just a fundamental expressiveness gap that Rust currently has, and not for any good reason IMO.

@jamesmunns
Copy link
Member

That's fair. However some of my worst bugs when working with unsafe are when I accidentally create a copy of some data that happens to be Copy (e.g. accidentally copying to the stack when trying to reborrow from a pointer), then returning a reference to a local rather than the non-local pointer. Since UnsafeCell is likely to be used in pointer form (e.g. through the .get() interface, using it with any T: Copy data makes me wary.

This is probably a niche issue (and almost certainly only a personal experience report), but that's my $0.02. If there are other compelling use cases, I'll just have to be more careful in the future, or stick with the old UnsafeCell with a manual Sync impl.

@m-ou-se
Copy link
Member Author

m-ou-se commented Jun 27, 2022

Interior mutability data stores can be perfectly fine with Copy (making Cell: Copy would be sound), so it seems like a bad idea to disallow this for everyone and make it outright impossible to have an interior mutable type that is Copy. That's just a fundamental expressiveness gap that Rust currently has, and not for any good reason IMO.

I think this is a really important point. I think this is case where Rust accidentally mixes 'unsafe' with 'just a bad idea'. The best example of that is how mem::forget is usually a "bad idea", but not unsafe. From that perspective, UnsafeCell does it wrong: it is not Sync, and requires you to write unsafe impl Sync on your type containing it if you want to share it, even though a Sync UnsafeCell is perfectly sound. Same for Copy for Cell. Same for Send and Sync for raw pointers.

At this point we can't just make these types Sync (or Copy, etc.) without breaking existing code, but it's interesting to think about how to represent a concept like "this type is technically Sync (or Copy), but it's usually not what you want" in the language.

@m-ou-se
Copy link
Member Author

m-ou-se commented Jun 27, 2022

It reminds me also of some arguments for whether to implement Copy for Range or not. It's not unsound, but might lead to mistakes. Whether that means it should or shouldn't be done is debatable.

@notgull
Copy link

notgull commented Sep 9, 2022

I support this, although I think it should be explicitly marked as a "static mut killer" in the documentation.

@thomcc
Copy link
Member

thomcc commented Sep 20, 2022

It's not really a static mut killer as is, since it doesn't relax the Send requirement. When I've wanted to use it this has been an issue (since we have a few types which are !Send & !Sync just as lints).

I do see why this limitation exists, though.

@ibraheemdev
Copy link
Member

Would it be better to have a generic AlwaysSync<T> wrapper? Constructing it in general would be unsafe but it could have a safe From<UnsafeCell<T>> implementation.

@tgross35
Copy link
Contributor

tgross35 commented Feb 8, 2023

In practice, I commonly deal with types containing raw pointers (coming from bindgen), which are (very un-ergonomically) !Send and !Sync. So unfortunately, I'll have to continue using my own implementation which is unconditionally Sync, because of raw pointers being unconditionally !Send and !Sync.

Would it be better to have a generic AlwaysSync wrapper? Constructing it in general would be unsafe but it could have a safe From<UnsafeCell> implementation.

I would appreciate having something like this. Creating static structs for things like plugins is really quite painful, especially when they contain raw pointers, because you have to wrap everything in a sync-forcing type (e.g. from one of my proc macros, my UnsafeSyncCell is always Sync). This type is just an UnsafeCell transparant wrapper where new is also unsafe.

I don't think an AlwaysSyncCell needs to be the same as SyncUnsafeCell since they have somewhat different goals (SyncUnsafeCell being more useful for existing Rust types, AlwaysSyncCell being useful more for statics/synchronization with the C interface). But it would be good to keep in mind for naming/design decisions that both concepts do need to exist, be it in std itself or just elsewhere in the ecosystem.

Question just since I haven't seen it anywhere - is unsafe impl<T: ?Sized + Sync> Sync for UnsafeCell <T> {} not possible, or would it somehow be breaking? Or is not having UnsafeCell ever be sync just a form of lint against unintended use?

@SimonSapin
Copy link
Contributor

Some existing unsafe code might rely on the auto-trait behavior of UnsafeCell<T>: !Sync (even for T: Sync) to make their own wrappers !Sync.

For example, RefCell doesn’t and has an explicit impl<T: ?Sized> !Sync for RefCell<T> {} but only to provide better error messages. It would still be sound without it in current Rust.

@thecaralice
Copy link

Another option which I don't see mentioned here is making SyncUnsafeCell always Sync but introducing a warning-level lint for using SyncUnsafeCell<T> with T: ?Sync

@WorldSEnder
Copy link

WorldSEnder commented Aug 14, 2024

That's fair. However some of my worst bugs when working with unsafe are when I accidentally create a copy of some data that happens to be Copy (e.g. accidentally copying to the stack when trying to reborrow from a pointer), then returning a reference to a local rather than the non-local pointer.

While I understand the concerns, this is mostly a matter of discipline and letting the compiler figuring out the necessary reborrowing. Since an *UnsafeCell is not Deref, you wouldn't want to *my_cell except to explicitly use the copy impl if it would exist. If you want to get to the data, you would go through my_cell.get() or get_raw().

To show a use of having a Copyable cell, I'm currently writing a datastructure Foo<T> that for ergonomical reasons I want to be Copy if T: Copy since the user should pretend as though it's just a plain old T, but actually keeps track of some "operation"s that are done to that T behind the scenes. Problematically, I wouldn't be able to observe the copy of these Foo<T>, so to fix that, I would include a counter in it that increases with every "operation" and when an implicit copy is made, I can see the old counter value in the backend to rewind to where - logically - a copy should have been performed.

#[derive(Clone, Copy)]
struct Foo<T> { ... }
impl<T> Foo<T> {
    fn operate(&self); // note *NOT* &mut self. All operations are sequenced in the backend.
}

So I would want to increase the counter in Foo::operate but that should not require a &mut self but a &self. Having that counter in a Cell<usize> would be nice, but alas Cell is not Copy for basically historical reasons. I would really want to have an internally-mutable-yet-copy type although I concede that I do not care as much about Sync issues, since you can fix those by a manual unsafe impl.


EDIT: Tl;dr Thinking and reading more about this, I believe that not having a Copy impl on the *UnsafeCell makes a lot of sense, but we need to have a way to have some consuming type opt out of the structural check of Copy and let it implement Copy despite containing an *UnsafeCell. That would open up the path towards impl<T: Copy> Copy for Cell<T> (which is safe and okay as far as I can gather) and solve my issue anway. Sorry, since it seems entirely unrelated to this issue now :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-tracking-issue Category: An issue tracking the progress of sth. like the implementation of an RFC T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.
Projects
None yet
Development

No branches or pull requests