Support &[CustomType] #520

Open
Walter-Reactor opened this issue Jul 2, 2024 · 12 comments

Comments

@Walter-Reactor

Walter-Reactor commented Jul 2, 2024

I'm working on a project where I need to expose a multi-language API, and I love Diplomat's approach. However, the one big hole in Diplomat that makes it unusable for us is exposing `&[UserDefinedStruct]`. Is this within scope, and do you have any suggestions on how it might be integrated?

@Manishearth
Contributor

How do you envision this working over FFI?

We do support output iterables, and could support input iterables, and that's probably the cleanest way to do it. But any type that has its own ownership that needs to be converted across the boundary is a lot of extra cost. So it really depends on what you're trying to do.

It would be helpful if we could see the shape of the APIs you're trying to expose. Vec<CustomType> can mean a lot of things: despite being a single concept in Rust, it starts meaning several different things once we talk about FFI.

It is possible to implement your own wrapping type that is internally a Vec<CustomType> and expose the APIs you need.
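As a rough illustration of the wrapper approach, here is a minimal sketch in plain Rust. `Point` and `PointList` are hypothetical names, and the Diplomat bridge and opaque attributes are deliberately left out so the snippet compiles on its own; in a real bridge module `PointList` would be the type marked opaque.

```rust
// Hypothetical user-defined type; not part of Diplomat.
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Point {
    pub x: f64,
    pub y: f64,
}

// Wrapper that owns the Vec. Only a pointer to this type would cross the
// FFI boundary; the Vec itself stays on the Rust side.
pub struct PointList(Vec<Point>);

impl PointList {
    /// Creates an empty list, boxed because opaque types are passed by pointer.
    pub fn new() -> Box<PointList> {
        Box::new(PointList(Vec::new()))
    }

    /// Appends one element built from plain scalar arguments.
    pub fn push(&mut self, x: f64, y: f64) {
        self.0.push(Point { x, y });
    }

    /// Number of elements currently stored.
    pub fn len(&self) -> usize {
        self.0.len()
    }

    /// Returns a copy of the element at `idx`, or `None` when out of bounds.
    pub fn get(&self, idx: usize) -> Option<Point> {
        self.0.get(idx).copied()
    }
}
```

The other language then indexes the list through `len` and `get` instead of seeing the slice directly, which is the extra cost mentioned above.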

@Walter-Reactor
Author

Walter-Reactor commented Jul 3, 2024

Edited the topic and my post to better express intent. Sorry for any confusion, I'm still coming up to speed on Rust.

I'm trying to construct an FFI wrapper around a custom serialization format, a little like FlexBuffers. We have one layer that operates on a pointer to data and a runtime-loaded schema, and another that uses generated code to move a lot of the offsets to compile time. One thing we allow is that structs may contain one of two custom containers (Map/Vec equivalents with our own stable ABI and in-memory layout), but there's no way to express this in Diplomat.

Considering it, I think what we really need is the ability to support custom container types, not just slices of user defined types.

Walter-Reactor changed the title from "Support Vec<CustomType>" to "Support &[CustomType]" on Jul 3, 2024
@Manishearth
Contributor

Manishearth commented Jul 4, 2024

I'll write more later, but to highlight: &[CustomType] in the context of Diplomat still means two things: exposing a Rust slice to other languages, or vice versa. The former is somewhat supported with indexing/etc. methods on a custom wrapper type. It appears that you're looking for the other way: using a custom, say, Java container from Rust. (please lmk if that's not the case)

Are you specifically looking for common stdlib types to work with this, or the ability for a custom collection to implement something that allows it to be passed to Rust? The former is not that hard to design, the latter needs callback support which we might get at some point but don't currently have.

@Walter-Reactor
Author

The more I think about it, the more I think we'll probably need custom generic collection support, which is gonna be a heck of a thing to generate automatically no matter what. I may wind up just using cbindgen and writing custom language-specific stuff, since we mostly only care about Rust, C++, and Python as potential target languages.

@Manishearth
Contributor

To be clear, we're already able to get some custom collection support with how we handle outputting strings: we're able to support different language string types, and our older C++ backend even supported custom types that fit a template trait.

So it may not be that hard but I'd really need to know what the precise functionality is.

@Walter-Reactor
Author

I'm starting to dig into this and hack together support in a private branch. The use case is being able to return something equivalent to a Vec<Box>, or to accept a &[StructType].

Over FFI, the former would wind up working a lot like the DiplomatWritable type, except without the terminating 0 byte and with an added alignment field. Then in any language with template support, the alignment field gets set to be sizeof(T) and the underlying DiplomatVecRaw type gets coerced into DiplomatVec. DiplomatWrite then becomes almost a special case of DiplomatVec.
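To make the type-erased idea concrete, here is a minimal sketch of what such a raw vector header could look like. The struct name `RawVec`, its field names, and its layout are assumptions for illustration only; they are not Diplomat's actual types, and `elem_size` stands in for the field described above as being set to sizeof(T).

```rust
use std::mem;

/// Hypothetical type-erased vector header (illustrative layout only).
#[repr(C)]
pub struct RawVec {
    pub ptr: *mut u8,     // start of the element storage
    pub len: usize,       // number of elements
    pub cap: usize,       // capacity, in elements
    pub elem_size: usize, // size of one element, filled in by the typed layer
}

impl RawVec {
    /// Takes ownership of `v` and erases its element type.
    pub fn from_vec<T>(v: Vec<T>) -> RawVec {
        // ManuallyDrop keeps the allocation alive after this function returns.
        let mut v = mem::ManuallyDrop::new(v);
        RawVec {
            ptr: v.as_mut_ptr() as *mut u8,
            len: v.len(),
            cap: v.capacity(),
            elem_size: mem::size_of::<T>(),
        }
    }

    /// Rebuilds the typed Vec so it can be used or dropped normally.
    ///
    /// Safety: `self` must have been produced by `from_vec::<T>` with the same `T`.
    pub unsafe fn into_vec<T>(self) -> Vec<T> {
        assert_eq!(self.elem_size, mem::size_of::<T>());
        unsafe { Vec::from_raw_parts(self.ptr as *mut T, self.len, self.cap) }
    }
}
```

A typed wrapper in the target language would check `elem_size` against its own element size before reinterpreting `ptr`, much as `into_vec` does here.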

For the latter, the requirement should be to accept &[StructType] or &[Box]. The function I'm interested in takes a slice of tuples of f64s and returns a dynamically sized array of OpaqueTypes (it's a 2D geometric lookup table).
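The shape of that function can be sketched as follows. Everything here is hypothetical: `Point2` stands in for the tuple of f64s, `Cell` for the opaque type, and the filter is a placeholder for the real lookup. The C ABI shim shows the slice crossing the boundary as a pointer plus a length; a real export would also need a no-mangle attribute, omitted here.

```rust
use std::slice;

/// Hypothetical `#[repr(C)]` stand-in for a tuple of two f64s.
#[repr(C)]
#[derive(Clone, Copy)]
pub struct Point2 {
    pub x: f64,
    pub y: f64,
}

/// Hypothetical opaque result type.
pub struct Cell {
    pub id: usize,
}

/// Safe Rust API: a slice in, a dynamically sized collection of boxed opaques out.
/// The filter below is a placeholder, not a real geometric lookup.
pub fn lookup(points: &[Point2]) -> Vec<Box<Cell>> {
    points
        .iter()
        .enumerate()
        .filter(|(_, p)| p.x >= 0.0 && p.y >= 0.0)
        .map(|(id, _)| Box::new(Cell { id }))
        .collect()
}

/// C ABI shim that only reports how many results there would be.
///
/// Safety: `ptr` must point to `len` valid `Point2` values, unless `len` is 0.
pub unsafe extern "C" fn lookup_count(ptr: *const Point2, len: usize) -> usize {
    let points: &[Point2] = if len == 0 {
        &[]
    } else {
        unsafe { slice::from_raw_parts(ptr, len) }
    };
    lookup(points).len()
}
```

Returning the `Vec<Box<Cell>>` itself across the boundary is the harder half, which is where the iterator or raw-vector approaches discussed in this thread come in.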

@Manishearth
Contributor

Manishearth commented Jan 14, 2025

For returning a vector, have you considered using the now-existing iterator support? This would involve returning an iterable, which the caller can collect into a vector. I'd be in favor of additional annotations that can be applied to iterator methods to add a companion "make a vector" method.
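A minimal sketch of the "return an iterable, let the caller collect" pattern, in plain Rust. `Entry`, `EntryIterator`, and `Table` are hypothetical names, and the Diplomat attributes that would mark the iterator for each backend are omitted so the snippet compiles on its own.

```rust
/// Hypothetical opaque element type.
#[derive(Debug, PartialEq)]
pub struct Entry(pub u32);

/// Opaque iterator handed to the caller. The caller only ever sees one
/// element at a time, so the storage behind it is never exposed.
pub struct EntryIterator(std::vec::IntoIter<Entry>);

impl EntryIterator {
    /// One step of iteration; `None` signals that the iterator is exhausted.
    pub fn next(&mut self) -> Option<Box<Entry>> {
        self.0.next().map(Box::new)
    }
}

/// Hypothetical type whose contents are exposed through the iterator.
pub struct Table {
    pub values: Vec<u32>,
}

impl Table {
    /// Returns a boxed iterator over freshly built entries.
    pub fn entries(&self) -> Box<EntryIterator> {
        let items: Vec<Entry> = self.values.iter().map(|v| Entry(*v)).collect();
        Box::new(EntryIterator(items.into_iter()))
    }
}
```

On the calling side, a loop over `next` that pushes each element into the language's native list type plays the role of the companion "make a vector" method.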

Which language are you primarily targeting?

Before you spend too much time on this: we may or may not wish to accept patches for this depending on the complexity introduced and the nature of this design. Maintaining a fork is, of course, fine.

@Walter-Reactor
Author

Oh! I hadn't found the iterator support. That's probably a viable option, I'll give that a shot. We're primarily targeting C, C++ and Python (I've written a kinda hacky backend that spits out a nanobind.cpp file). We're also interested in a Rust binding, odd as that may sound, so that we can ship a .dll with a .rs file.

@Walter-Reactor
Author

Looking at it, iterator support could likely work for this case, but I'd need to add support to the C/C++ generators. It's also a little clunky because I'd need to write a custom iterable type for every T I wanted to return, which is a bit cumbersome.
Would you be interested in some more detailed coordination/collaboration? I'd be happy to try and upstream some of the feature development I'm doing, but I could use some pointers. Our company is going to need to rely on Diplomat or something similar for the foreseeable future, so I'm likely to need to work with it a lot.

@Manishearth
Contributor

Yep, std::iterator is incompatible with what we do since it requires contiguous memory, but generator support would be welcome.

> It's also a little clunky because I'd need to write a custom iterable type for every T I wanted to return, which is a bit cumbersome.

It would be acceptable to add Diplomat support for `#[diplomat::make_iterator(FooIterator)] fn foo(&self) -> impl Iterator<Item = Foo>`. It would unfortunately need impl trait in type aliases to work with that syntax, if you're comfortable with nightly. Otherwise we could temporarily make it work by boxing and deal with optimizing that later.
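For illustration, here is roughly what the boxing fallback could look like on stable Rust. This is a hand-written guess at what such an attribute might generate, not existing Diplomat output: `Source` and `foo_boxed` are hypothetical, and the `+ '_` bound is added so the returned iterator may borrow from `self`.

```rust
/// Hypothetical element type from the proposed signature.
pub struct Foo(pub u32);

/// Hypothetical type that owns the data being iterated.
pub struct Source {
    pub values: Vec<u32>,
}

impl Source {
    /// The method a user would write. Its return type cannot be named on stable.
    pub fn foo(&self) -> impl Iterator<Item = Foo> + '_ {
        self.values.iter().map(|v| Foo(*v))
    }

    /// What the attribute might generate alongside it: the same iterator,
    /// erased behind `dyn` so that it fits in a nameable struct.
    pub fn foo_boxed(&self) -> Box<FooIterator<'_>> {
        Box::new(FooIterator(Box::new(self.foo())))
    }
}

/// Generated opaque iterator type named in the attribute.
pub struct FooIterator<'a>(Box<dyn Iterator<Item = Foo> + 'a>);

impl<'a> FooIterator<'a> {
    /// Advances the boxed iterator, handing each element out behind a pointer.
    pub fn next(&mut self) -> Option<Box<Foo>> {
        self.0.next().map(Box::new)
    }
}
```

The inner `Box<dyn Iterator>` costs one allocation and a virtual call per step, which is the part that could be optimized away later once the concrete iterator type can be named.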

@Manishearth
Contributor

> Would you be interested in some more detailed coordination/collaboration? I'd be happy to try and upstream some of the feature development I'm doing

Yes please! I don't have time to write code for features you need but I can advise on implementation strategies that will end up with something we can upstream.

We'd also love to upstream Python and Rust support. The DLL case is one we do care about, and have thought about doing something similar in the past.

@Walter-Reactor
Author

Unfortunately nightly is a no-go for us, so we're stuck with stable only for the time being. I'm making good headway on replacing DiplomatWrite with DiplomatVec, which I think will solve my use case a little better here.

It's entirely possible to create STL-compliant iterators that don't require contiguous memory and work basically like generators, so I think adding that support should be possible, though it'll need some runtime library support.
