Rust2Metta gnd conversion #544

Necr0x0Der · 2024-01-23T16:26:26Z

No description provided.

Necr0x0Der · 2024-01-23T16:51:28Z

@vsbogd , @luketpeterson , this is the first step of automatic conversion btw Rust and foreign (Python in this case) grounded types. This should break nothing but only extend interoperability. Some implementation details might be improved. Feel free to commit to this branch. While it is preferable to make start with an overall design, I'd prefer to make incremental conservative changes in this case. The next steps I consider is

to perform backward (from foreign, e.g. Python, atoms to Rust) conversion. There are multiple ways to do this. (a) do conversion to Rust every time when Python grounded op produces the result (I'm not ok with it); (b) have a generic ForeignGroundedObject type/trait with e.g. to_rust API function, which should be implemented on the foreign language side (like match_ for matchable objects, it will allow automatic conversion from any foreign object to Rust if it is supported) (I'm ok slightly in favour of it); (c) have specific type constructors (e.g. ForeignNumber, ForeignBool, or whatever they'll be called) so one can add them into IntoNumber / Number implementation.
move Bool / Number library from Python stdlib to optional extension if one wants to use native Python arithmetics, etc. Enabling this extension should work seamlessly with custom Python code and Rust functions not implemented in this extension.

luketpeterson

Superficial suggestions to align with the conventions across the rest of the C API.

c/src/metta.rs

luketpeterson · 2024-01-24T01:19:22Z

c/src/metta.rs

@@ -1412,6 +1412,24 @@ pub extern "C" fn grounded_number_to_longlong(n: *const atom_t, res: *mut c_long
    }
 }

+#[no_mangle]
+pub extern "C" fn grounded_bool_to_int(n: *const atom_t, res: *mut c_int) -> bool {


atom_t and atom_ref_t are designed to be interchangeable, but across the C API, we always use atom_ref_t when taking a pointer to a constant atom so the user can call an accessor function with a reference to an atom that lives elsewhere.

One of the reasons @Necr0x0Der implemented it in this way is that I implemented it in this way earlier. And it is my fault, that I didn't add you to the review @luketpeterson. I agree with the comment and I think we need to make all three functions (two I introduced in #543 and one from this PR) look uniformly.

Fixed in 3a49aff

luketpeterson · 2024-01-24T01:23:18Z

c/src/metta.rs

@@ -1412,6 +1412,24 @@ pub extern "C" fn grounded_number_to_longlong(n: *const atom_t, res: *mut c_long
    }
 }

+#[no_mangle]
+pub extern "C" fn grounded_bool_to_int(n: *const atom_t, res: *mut c_int) -> bool {


to_int is no longer accurate in light of above change/ suggestion, changing to to_bool

Across the rest of the API, into implies consuming the source and converting, where get implies access. to is ambiguous and could be read either way. My personal preference is for grounded_bool_get_bool, but it's just a style thing.

On C API side it is possible to implement both consuming or getting API. I was going to implement consuming API, thus I would use signature grounded_bool_into_bool(b: *mut atom_t, res: *mut bool) -> bool {.

Unfortunately we cannot consume on a Python side. Thus the only viable option to use consuming API in Python is cloning atom before calling a function.

grounded_bool_into_bool(b: *mut atom_t, res: *mut bool) -> bool

The convention we have for consuming is to pass by value. ie. grounded_bool_into_bool(b: atom_t, res: *mut bool) -> bool

However, I don't see the benefit of a consuming converter over a non-consuming accessor, followed by an explicit free if the user is done with the atom. Efficiency-wise they're basically identical.

It's really a style thing, but my philosophy with the C api is to keep the number of functions that consume their arguments to a minimum - if all else is equal.

The other reason for a non-consuming accessor is that it allows you to pass an atom_ref_t, like you might get from atom_vec_get, the space_t iterator's next function, etc.

The reason I would prefer consuming accessor is that if into function need to return a more complex object then getter may require copying when consumer can reuse the passed value. It allows caller to decide whether we he needs to copy object additionally or not. With non-consuming accessor one would need to copy value inside accessor.

On the other hand as we anyway has different formats for Python and Rust strings and for other more complex objects then even consuming accessor will need to copy values to convert it into another format.

The reason I would prefer consuming accessor is that if into function...

That reasoning makes sense in the general case. Although we don't have a specific internal object we're accessing aside from primitives that fit in a single CPU word and stings*.

Perhaps we could make a converting API if we find ourselves wrapping a structure that is larger and can be unwrapped without any transformation.

On the other hand as we anyway has different formats for Python and Rust strings and for other more complex objects then even consuming accessor will need to copy values to convert it into another format.

Exactly. Strings would be a place where a consuming accessor could be more efficient except for the fact that there is a potential allocator mismatch and the fact that Rust strings aren't NULL-terminated. So unfortunately I think a single-copy accessor for wrapped Rust Strings (e.g. write into buffer) is probably the best we can do.

fixed in 07af9a3

c/src/metta.rs

python/hyperonpy.cpp

luketpeterson · 2024-01-24T02:06:10Z

There are multiple ways to do this.
(a) do conversion to Rust every time when Python grounded op produces the result (I'm not ok with it);
(b) have a generic ForeignGroundedObject type/trait with e.g. to_rust API function, which should be implemented on the foreign language side (like match_ for matchable objects, it will allow automatic conversion from any foreign object to Rust if it is supported) (I'm ok slightly in favour of it);
(c) have specific type constructors (e.g. ForeignNumber, ForeignBool, or whatever they'll be called) so one can add them into IntoNumber / Number implementation.

As discussed on the call, (a) is the simplest approach although it places a burden on the core (Rust) atom implementations to losslessly represent all data that the foreign type is capable of representing. I can see how we might not want to maintain compatibility to that level of accuracy.

If we want to implement (b), ie. store the native Python types inside the grounded atoms that originate from Python, and then convert to Rust types lazily, on-demand, then the implementation gets a little bit tricky. (still doable) The main complicating factor is the C API that sits between Python and Rust.

I think one implementation could roughly look like:

Each CGrounded atom (which sits behind a native Python atom) would contain a function pointer to a conversion function. This could be NULL for non-convertible atoms
The function would have take an "object" ptr (the pointer returned from atom_get_object) as input, and likely return a new Rust-backed atom as output. For example, an atom that was returned from longlong_to_grounded_number

As up-side to that design is that it places the responsibility for convertibility / promotion logic on the foreign and native atom implementations, which is right, IMO.

The downside is that manufacturing a new atom means a new allocation. So an operation that would normally be a simple memory access becomes much more expensive.

An alternative implementation could look like:

Each CGrounded atom would contain a reference to a table of specific accessor functions. For example get_value_as_longlong, etc.
Each foreign atom implementation could implement the relevant accessors, and leave the others as NULL
The Rust client could call the relevant accessors(s), either directly or behind an abstracted Rust conversion trait.

This approach avoids the unnecessary allocation, at the expense of requiring more API surface area.

vsbogd · 2024-01-24T08:05:29Z

This approach avoids the unnecessary allocation, at the expense of requiring more API surface area.

@luketpeterson, do you mean unnecessary allocation is avoided when Python atom's internals can be reused to represent the value in Rust? Or do you mean that unnecessary GroundedAtom wrapper allocation is avoided?

luketpeterson · 2024-01-24T08:14:21Z

@luketpeterson, do you mean unnecessary allocation is avoided when Python atom's internals can be reused to represent the value in Rust? Or do you mean that unnecessary GroundedAtom wrapper allocation is avoided?

I mean a C callback function that takes the form:

convert_atom: extern "C" fn(src: *const gnd_t) -> atom_t,

involves an unnecessary allocation because the returned atom_t is backed by a dynamic structure (currently in every case. Although I have a vision for a static fast-path someday, that isn't the reality today)

On the other hand:

get_value_as_f64: extern "C" fn(src: *const gnd_t, dst: *mut f64) -> bool,

doesn't involve any allocation. The downside is that we need to make an entry point for every native interchange type, and each foreign (Python) type needs a table of accessor function pointers. The same table can be shared among all Python objects of a given type, so it's not actually that bad.

I think this (second option) is the way to go.

Co-authored-by: luketpeterson <[email protected]>

Replace *const atom_t by *const atom_ref_t as functions input references to the atoms. Add docstrings.

_into_ infix is used for consuming functions, _get_ for non-consuming

vsbogd · 2024-01-26T16:53:32Z

@luketpeterson , please approve if you agree to merge it.

luketpeterson

These look good.

Rust2Metta gnd conversion

d2d6ba0

Necr0x0Der marked this pull request as draft January 23, 2024 16:26

luketpeterson requested changes Jan 24, 2024

View reviewed changes

Necr0x0Der and others added 7 commits January 24, 2024 12:38

Update c/src/metta.rs

d4e5283

Co-authored-by: luketpeterson <[email protected]>

Update c/src/metta.rs

48022a4

Co-authored-by: luketpeterson <[email protected]>

Update c/src/metta.rs

5888eee

Co-authored-by: luketpeterson <[email protected]>

Update python/hyperonpy.cpp

4a63205

Co-authored-by: luketpeterson <[email protected]>

int -> bool

5355307

Fix grounded_number... function signatures

3a49aff

Replace *const atom_t by *const atom_ref_t as functions input references to the atoms. Add docstrings.

Rename _to_ functions to _get_ and _into_

07af9a3

_into_ infix is used for consuming functions, _get_ for non-consuming

Necr0x0Der marked this pull request as ready for review January 25, 2024 11:10

Merge branch 'main' into autoconv

08fe8d2

vsbogd requested a review from luketpeterson January 26, 2024 16:40

vsbogd approved these changes Jan 26, 2024

View reviewed changes

luketpeterson approved these changes Jan 26, 2024

View reviewed changes

luketpeterson merged commit 169da1d into main Jan 26, 2024
1 of 2 checks passed

vsbogd deleted the autoconv branch January 26, 2024 17:24

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Rust2Metta gnd conversion #544

Rust2Metta gnd conversion #544

Necr0x0Der commented Jan 23, 2024

Necr0x0Der commented Jan 23, 2024

luketpeterson left a comment

luketpeterson Jan 24, 2024 •

edited

Loading

vsbogd Jan 24, 2024

vsbogd Jan 25, 2024

luketpeterson Jan 24, 2024

vsbogd Jan 24, 2024

luketpeterson Jan 24, 2024

vsbogd Jan 24, 2024

luketpeterson Jan 24, 2024

vsbogd Jan 25, 2024 •

edited

Loading

luketpeterson commented Jan 24, 2024 •

edited

Loading

vsbogd commented Jan 24, 2024

luketpeterson commented Jan 24, 2024 •

edited

Loading

vsbogd commented Jan 26, 2024

luketpeterson left a comment

Rust2Metta gnd conversion #544

Rust2Metta gnd conversion #544

Conversation

Necr0x0Der commented Jan 23, 2024

Necr0x0Der commented Jan 23, 2024

luketpeterson left a comment

Choose a reason for hiding this comment

luketpeterson Jan 24, 2024 • edited Loading

Choose a reason for hiding this comment

vsbogd Jan 24, 2024

Choose a reason for hiding this comment

vsbogd Jan 25, 2024

Choose a reason for hiding this comment

luketpeterson Jan 24, 2024

Choose a reason for hiding this comment

vsbogd Jan 24, 2024

Choose a reason for hiding this comment

luketpeterson Jan 24, 2024

Choose a reason for hiding this comment

vsbogd Jan 24, 2024

Choose a reason for hiding this comment

luketpeterson Jan 24, 2024

Choose a reason for hiding this comment

vsbogd Jan 25, 2024 • edited Loading

Choose a reason for hiding this comment

luketpeterson commented Jan 24, 2024 • edited Loading

I think one implementation could roughly look like:

An alternative implementation could look like:

vsbogd commented Jan 24, 2024

luketpeterson commented Jan 24, 2024 • edited Loading

vsbogd commented Jan 26, 2024

luketpeterson left a comment

Choose a reason for hiding this comment

luketpeterson Jan 24, 2024 •

edited

Loading

vsbogd Jan 25, 2024 •

edited

Loading

luketpeterson commented Jan 24, 2024 •

edited

Loading

luketpeterson commented Jan 24, 2024 •

edited

Loading