Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add FromPyObjectBound trait for extracting &str without GIL Refs #3928

Merged
merged 8 commits into from
Mar 8, 2024

Conversation

davidhewitt
Copy link
Member

This is an attempt to solve the problem that FromPyObject for &str needs GIL Refs for versions under 3.10, and also can't be expressed until we add a second lifetime to FromPyObject because the data needs to be borrowed from the input, not from 'py.

I do this by creating a new trait FromPyObjectBound which is expected to be the future form of FromPyObject. I think for now we recommend users not to implement it unless they have the same borrow problem.

For backwards compatibility reasons I move as little code as possible onto this new trait. Just .extract() methods and also argument extraction code use this new trait internally.

On the implementation side, this trait gets a blanket for anything implementing FromPyObject.

Finally, without the gil-refs feature we remove the existing implementation of FromPyObject for &str and instead implement this new trait. This works except for abi3 on old Python versions, where we have to settle for PyFunctionArgument only and allow users of .extract() to break.

This does implement some very targeted breakage for specific considerations, but I wonder if this is necessary as the correct way to users off the GIL Ref API in a case where otherwise it's near impossible for them to realise this uses it.

I can't see a way to do this any gentler.

@davidhewitt
Copy link
Member Author

This does need changelog as it implements FromPyObject for Cow<str>.

@davidhewitt
Copy link
Member Author

As a final thought, I would have said for performance reasons alone this breakage would be unnecessary, but for #3668 safety reasons this seems worth it to avoid users unsuspectingly still having problems.

cc @alex as I think the proposed change to FromPyObject for &str will affect cryptography.

@alex
Copy link
Contributor

alex commented Mar 4, 2024

Yeah this will definitely impact us. Thanks for flagging.

@davidhewitt
Copy link
Member Author

@alex as a user do you think you'd rather have a smooth experience upgrading and know that there are some GIL Ref uses internally, or have a little pain and get cryptography completely migrated?

(The pain will only come on deactivating the gil-refs feature, and I should probably make the migration guide entry better. I ran out of time this morning...)

@alex
Copy link
Contributor

alex commented Mar 4, 2024

From my perspective, our migration process is going to be:

  1. Get upgraded with gil-refs enable
  2. Burn down all uses
  3. Disable gil-refs

I don't think it's super useful to have "well you disabled gil-refs, but actually they're secretly still on in the background".

And yes, the fact that extract::<&str>() is not going to work is annoying, and will cause some pain. But we already have the fallback option of enabling gil-refs :-)

@davidhewitt
Copy link
Member Author

davidhewitt commented Mar 4, 2024

Great, that is how I'm thinking about the migration too. I will aim to get this PR cleaned up with a better migration guide entry shortly.

Copy link

codspeed-hq bot commented Mar 5, 2024

CodSpeed Performance Report

Merging #3928 will improve performances by 50.86%

Comparing davidhewitt:frompyobject (4dad333) with main (fbd5311)

Summary

⚡ 2 improvements
✅ 65 untouched benchmarks

Benchmarks breakdown

Benchmark main davidhewitt:frompyobject Change
extract_str_extract_fail 2.8 µs 2.3 µs +22.21%
extract_str_extract_success 978.9 ns 648.9 ns +50.86%

guide/src/migration.md Outdated Show resolved Hide resolved
Copy link
Contributor

@Icxolu Icxolu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally this looks like a good approach for the migration period. I was wondering whether we should even allow downstream implementations of FromPyObjectBound. Since this is pretty much the next form of FromPyObject (except for the name), maybe it's fine to leave this escape hatch open. I guess only the names would need updating once we remove the current FromPyObject and replace it with this one.

@@ -225,7 +225,8 @@ mod tests {
let py_bytes = PyBytes::new_with(py, 10, |b: &mut [u8]| {
b.copy_from_slice(b"Hello Rust");
Ok(())
})?;
})?
.as_borrowed();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any particular reason to not switch to new_bound_with instead now? (Same below)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

None other than me being a bit sloppy! To make things more manageable at the beginning of the migration I put #[allow(deprecated)] on quite a few of the test modules, which I think we will need to clean up on the way to 0.22.

///
/// Users are advised against calling this method directly: instead, use this via
/// [`Bound<'_, PyAny>::extract`] or [`Py::extract`].
fn from_py_object_bound(ob: &'a Bound<'py, PyAny>) -> PyResult<Self>;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could also already call this from_py_object, but maybe it's better to have a clear distinction for now...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the one question which made me hesitate on encouraging users to use this trait too much (and also make this name clunky) is that I think to resolve the TODO in instance.rs:

        // TODO it might be possible to relax this bound in future, to allow
        // e.g. `.extract::<&str>(py)` where `py` is short-lived.
        'py: 'a,

we would need to make this trait accept ob: Borrowed<'a, 'py, PyAny> as the input, as that's basically &Bound with additional flexibility on the relationship between 'a and 'py.

It felt to me that this is a hurdle we can cross down the line, and working with Bound for now keeps it simple...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, using Borrowed there would certainly be an option. I agree that this is a question best answered once we actually removed the gil-refs and do the final migration. We could also seal this trait for now (but maybe that's a bit drastic). Removing a seal in a patch release, if it turns out to be necessary, would also be an option.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's seal this, and leave a note saying that "if you have a need to use this trait, please let us know and help us work out the design".

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I pushed an additional commit to seal the trait.

@davidhewitt davidhewitt marked this pull request as ready for review March 5, 2024 23:35
@davidhewitt davidhewitt changed the title try to resolve &str FromPyObject question add FromPyObjectBound adjustment for &str without GIL Refs Mar 5, 2024
@davidhewitt davidhewitt changed the title add FromPyObjectBound adjustment for &str without GIL Refs add FromPyObjectBound trait for extracting &str without GIL Refs Mar 5, 2024
@davidhewitt
Copy link
Member Author

Coverage aside (which I'm now resigned to being a bit lax on while the gil-refs API leads to horrendous duplication), I think this is now good to review.

I'm aiming to draft (probably tomorrow evening) a few notes for the documentation. After that, and with this merged, I think it's time we shipped a beta release!

@davidhewitt
Copy link
Member Author

Just thinking further on this, I think the alternative would be to make one small breaking change to FromPyObject immediately: the lifetime.

If we changed the definition of FromPyObject to be

pub trait FromPyObject<'a, 'py>: Sized {
    fn extract(ob: &'py PyAny) -> PyResult<Self> {
        Self::extract_bound(&ob.as_borrowed())
    }

    fn extract_bound(ob: &'a Bound<'py, PyAny>) -> PyResult<Self> {
        Self::extract(ob.clone().into_gil_ref())
    }

    #[cfg(feature = "experimental-inspect")]
    fn type_input() -> TypeInfo {
        TypeInfo::Any
    }
}

that is, we add the lifetime 'a to FromPyObject now, but otherwise changed nothing about it, that would remove the need for FromPyObjectBound.

We could explain in the migration guide exactly how users update their code. I think the two main cases would be:

  • in implementations of FromPyObject, which could be solved immediately by using '_ for the lifetime:

    // before
    impl<'py> FromPyObject<'py> for T {
    
    // after
    impl<'py> FromPyObject<'_, 'py> for T {
  • in trait bounds, where users could use an HRTB for now:

    // before
    fn foo<'py, T>(obj: &'py PyAny) -> T
        where T: FromPyObject<'py>
    
    // after
    fn foo<'py, T>(obj: &'py PyAny) -> T
        where T: for<'a> FromPyObject<'a, 'py>

I think this would be acceptable breakage in my eyes if it leaves PyO3 in a more consistent state. I actually suspect that many users don't name the FromPyObject trait directly much at all. In pydantic-core there are only 14 lines which would need to be changed for this out of 23.5k.


Maybe the one unresolved problem would be how the #[derive(FromPyObject)] proc macro allows users to set the additional 'a lifetime in the generated code. I think it would be quite reasonable to punt on that initially, and continue to only support the one 'py lifetime in 0.21. We could fix that either as a patch or part of 0.22.

Ungh, I really want to get that beta release out, but also I think the decision about what to do here is worth doing correctly...

😂

@davidhewitt
Copy link
Member Author

davidhewitt commented Mar 6, 2024

Ok, so I tried the above but it doesn't work because that proposed definition of FromPyObject<'a, 'py> with those default methods falls apart. The compiler starts needing 'py: 'a for the default implementation of fn extract.

So I think that we cannot achieve that proposal in 0.21 without more breakage. I think this confirms that such a change will need to wait to 0.23 when we remove fn extract GIL Ref form from the trait entirely.

Glad I explored that, as it makes me confident that proceeding with this PR is the right approach.

@Icxolu
Copy link
Contributor

Icxolu commented Mar 6, 2024

I believe I have also explored this in the past (maybe during the initial FromPyObject migration), but also ran into issues (not sure anymore whether it was that specific one). Sticking with this approach for the migration period sounds good to me.

Copy link
Contributor

@Icxolu Icxolu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@davidhewitt davidhewitt added this pull request to the merge queue Mar 8, 2024
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Mar 8, 2024
@davidhewitt davidhewitt enabled auto-merge March 8, 2024 07:23
@davidhewitt davidhewitt added this pull request to the merge queue Mar 8, 2024
Merged via the queue into PyO3:main with commit 770d9b7 Mar 8, 2024
41 of 42 checks passed
@davidhewitt davidhewitt deleted the frompyobject branch March 8, 2024 08:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants