-
Notifications
You must be signed in to change notification settings - Fork 59
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Rust2Metta gnd conversion #544
Conversation
@vsbogd , @luketpeterson , this is the first step of automatic conversion btw Rust and foreign (Python in this case) grounded types. This should break nothing but only extend interoperability. Some implementation details might be improved. Feel free to commit to this branch. While it is preferable to make start with an overall design, I'd prefer to make incremental conservative changes in this case. The next steps I consider is
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Superficial suggestions to align with the conventions across the rest of the C API.
c/src/metta.rs
Outdated
@@ -1412,6 +1412,24 @@ pub extern "C" fn grounded_number_to_longlong(n: *const atom_t, res: *mut c_long | |||
} | |||
} | |||
|
|||
#[no_mangle] | |||
pub extern "C" fn grounded_bool_to_int(n: *const atom_t, res: *mut c_int) -> bool { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
atom_t
and atom_ref_t
are designed to be interchangeable, but across the C API, we always use atom_ref_t
when taking a pointer to a constant atom so the user can call an accessor function with a reference to an atom that lives elsewhere.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One of the reasons @Necr0x0Der implemented it in this way is that I implemented it in this way earlier. And it is my fault, that I didn't add you to the review @luketpeterson. I agree with the comment and I think we need to make all three functions (two I introduced in #543 and one from this PR) look uniformly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed in 3a49aff
c/src/metta.rs
Outdated
@@ -1412,6 +1412,24 @@ pub extern "C" fn grounded_number_to_longlong(n: *const atom_t, res: *mut c_long | |||
} | |||
} | |||
|
|||
#[no_mangle] | |||
pub extern "C" fn grounded_bool_to_int(n: *const atom_t, res: *mut c_int) -> bool { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
to_int
is no longer accurate in light of above change/ suggestion, changing to to_bool
Across the rest of the API, into
implies consuming the source and converting, where get
implies access. to
is ambiguous and could be read either way. My personal preference is for grounded_bool_get_bool
, but it's just a style thing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On C API side it is possible to implement both consuming or getting API. I was going to implement consuming API, thus I would use signature grounded_bool_into_bool(b: *mut atom_t, res: *mut bool) -> bool {
.
Unfortunately we cannot consume on a Python side. Thus the only viable option to use consuming API in Python is cloning atom before calling a function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
grounded_bool_into_bool(b: *mut atom_t, res: *mut bool) -> bool
The convention we have for consuming is to pass by value. ie. grounded_bool_into_bool(b: atom_t, res: *mut bool) -> bool
However, I don't see the benefit of a consuming converter over a non-consuming accessor, followed by an explicit free
if the user is done with the atom. Efficiency-wise they're basically identical.
It's really a style thing, but my philosophy with the C api is to keep the number of functions that consume their arguments to a minimum - if all else is equal.
The other reason for a non-consuming accessor is that it allows you to pass an atom_ref_t
, like you might get from atom_vec_get
, the space_t iterator's next function, etc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The reason I would prefer consuming accessor is that if into
function need to return a more complex object then getter may require copying when consumer can reuse the passed value. It allows caller to decide whether we he needs to copy object additionally or not. With non-consuming accessor one would need to copy value inside accessor.
On the other hand as we anyway has different formats for Python and Rust strings and for other more complex objects then even consuming accessor will need to copy values to convert it into another format.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The reason I would prefer consuming accessor is that if into function...
That reasoning makes sense in the general case. Although we don't have a specific internal object we're accessing aside from primitives that fit in a single CPU word and stings*.
Perhaps we could make a converting API if we find ourselves wrapping a structure that is larger and can be unwrapped without any transformation.
On the other hand as we anyway has different formats for Python and Rust strings and for other more complex objects then even consuming accessor will need to copy values to convert it into another format.
Exactly. Strings would be a place where a consuming accessor could be more efficient except for the fact that there is a potential allocator mismatch and the fact that Rust strings aren't NULL-terminated. So unfortunately I think a single-copy accessor for wrapped Rust Strings (e.g. write into buffer
) is probably the best we can do.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed in 07af9a3
As discussed on the call, (a) is the simplest approach although it places a burden on the core (Rust) atom implementations to losslessly represent all data that the foreign type is capable of representing. I can see how we might not want to maintain compatibility to that level of accuracy. If we want to implement (b), ie. store the native Python types inside the grounded atoms that originate from Python, and then convert to Rust types lazily, on-demand, then the implementation gets a little bit tricky. (still doable) The main complicating factor is the C API that sits between Python and Rust. I think one implementation could roughly look like:
As up-side to that design is that it places the responsibility for convertibility / promotion logic on the foreign and native atom implementations, which is right, IMO. The downside is that manufacturing a new atom means a new allocation. So an operation that would normally be a simple memory access becomes much more expensive. An alternative implementation could look like:
This approach avoids the unnecessary allocation, at the expense of requiring more API surface area. |
@luketpeterson, do you mean unnecessary allocation is avoided when Python atom's internals can be reused to represent the value in Rust? Or do you mean that unnecessary |
I mean a C callback function that takes the form:
involves an unnecessary allocation because the returned atom_t is backed by a dynamic structure (currently in every case. Although I have a vision for a static fast-path someday, that isn't the reality today) On the other hand:
doesn't involve any allocation. The downside is that we need to make an entry point for every native interchange type, and each foreign (Python) type needs a table of accessor function pointers. The same table can be shared among all Python objects of a given type, so it's not actually that bad. I think this (second option) is the way to go. |
Co-authored-by: luketpeterson <[email protected]>
Co-authored-by: luketpeterson <[email protected]>
Co-authored-by: luketpeterson <[email protected]>
Co-authored-by: luketpeterson <[email protected]>
Replace *const atom_t by *const atom_ref_t as functions input references to the atoms. Add docstrings.
_into_ infix is used for consuming functions, _get_ for non-consuming
@luketpeterson , please approve if you agree to merge it. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These look good.
No description provided.