Fix issue 530: rename variables to the same name when returning result from match #545

vsbogd · 2024-01-24T06:18:58Z

Fixes #530. PR is a huge rework of Bindings implementation. New implementation is about 15% faster than previous one and logic is simpler. There are some minimal MeTTa tests which fails, the reason is the same as in #532 (comment), so I am going to work on this in separate PR.

Fix Bindings::retain logic, don't remove variables without value because it can lead to loosing information about variable equalities.

New mapping allows setting resulting variable name, decreases the complexity of removing variable from mapping. Thus massive operations now removes variables and bindings instead of keeping them as a garbage in data.

…ogic

luketpeterson · 2024-01-24T07:55:05Z

lib/src/common/holeyvec.rs

This is cool. It's a nice structure with the memory access patterns of a vec, but doesn't need to do a memmove to remove elements from the center.

It has the advantages over a linked list of providing stable indices and constant-time index-based access for items already added.

I wonder if it makes sense to publish this to a stand-alone crate? In any case that's easily done in the future.

Thanks Luke. I was thinking about publishing it as a standalone crate (at least I didn't find anything similar at crates.io, may be I don't know a proper name for this algorithm). It should be effective when capacity is known in advance. But I am not sure about its effectiveness comparing to using Rc<RefCell<T>> when the capacity is not known.

For example in case of Bindings I want to compare performance with using Rc<RefCell<T>> but I decided not spending additional time in this PR. I will try doing it as a part of work on unstable tests in minimal MeTTa.

effectiveness comparing to using Rc<RefCell>

I don't follow... You mean compared with HashSet<T>? Rc<RefCell<T>> isn't a collection, but HoleyVec is.

I believe Vec does some kind of speculative over-allocation internally, but I'm not totally clear on the details. I considered suggesting a with_capacity() flavor of constructor function, but I didn't want to ask you to over-engineer it if you didn't think it was necessary.

Also, the "capacity" function probably should take into account the inner Vec's unused capacity. (https://doc.rust-lang.org/std/vec/struct.Vec.html#method.capacity) For example:

pub fn capacity(&self) -> usize { self.vec.capacity() }

You mean compared with HashSet? Rc<RefCell> isn't a collection, but HoleyVec is.

I mean we could replace pair HashMap<VariableAtom, usize> + HoleyVec<Binding> by HashMap<VariableAtom, Rc<RefCell<Binding>> and then it will do memory management in a more atomic way instead of allocating a big enough vector. After I implemented HoleyVec I thought that it basically does memory management and only advantage before alloc is that size of the components is known. If the size of the vector is known in advance HoleyVec should be more effective as memory is preallocated but otherwise may be alloc should do it even faster.

Also, the "capacity" function probably should take into account the inner Vec's unused capacity.

Probably capacity is misleading name. I named it capacity but I introduced it specifically to calculate the max index which were ever used to allocate bitmaps with flags. I will rename it.

I see. Thanks for that explanation.

Renamed in e399189

luketpeterson · 2024-01-24T08:03:02Z

lib/Cargo.toml

@@ -12,6 +12,7 @@ env_logger = "0.8.4"
 directories = "5.0.1" # For Environment to find platform-specific config location
 smallvec = "1.10.0"
 im = "15.1.0"
+bitset = "0.1.2"


bitset's implementation isn't very optimal, but it'll do for now. If it ever shows up as a bottleneck we have a lot of experience writing crazy-fast bit-set operations ;-)

ngeiswei

It's largely going over my head, but it looks cool.

ngeiswei · 2024-01-24T11:33:34Z

lib/src/atom/matcher.rs

@@ -78,7 +78,7 @@ enum VarResolutionResult<T> {
 /// Abstraction of the variable set. It is used to allow passing both
 /// HashSet<&VariableAtom> and HashSet<VariableAtom> to the
 /// [Bindings::narrow_vars] method.
-pub trait VariableSet {
+pub trait VariableSet : Debug {


I'm no rust expert but is it normal to keep that?

The style-police could come down either way on this.

The Rust stdlib tends to avoid traits bounding other traits when possible and instead puts the bounds on the impls / generics. But that can get a lot more verbose.

My personal feeling is that the compiler will do dead-code stripping on the whole binary for release, so if the code path isn't needed then it shouldn't bloat the binary. And knowing trait objects of that type also implement Debug is very convenient and doesn't hurt anything in practice.

ngeiswei · 2024-01-25T12:25:44Z

Thanks Vitaly!

vsbogd added 4 commits January 23, 2024 12:20

Add unit test for incorrect match

4842d10

Rename Bindings::cleanup to Bindings::retain

39016a8

Fix Bindings::retain logic, don't remove variables without value because it can lead to loosing information about variable equalities.

Rewrite Bindings using another way of mapping vars on values

044ca89

New mapping allows setting resulting variable name, decreases the complexity of removing variable from mapping. Thus massive operations now removes variables and bindings instead of keeping them as a garbage in data.

Remove IntoIter impl for Bindings to remove difference in iterators l…

feec8ca

…ogic

vsbogd requested review from ngeiswei and luketpeterson January 24, 2024 06:18

vsbogd changed the title ~~Fix issue 530~~ Fix issue 530: rename variables to the same name when returning result from match Jan 24, 2024

luketpeterson previously approved these changes Jan 24, 2024

View reviewed changes

ngeiswei previously approved these changes Jan 24, 2024

View reviewed changes

Rename HoleyVec::capacity to index_upper_bound

e399189

vsbogd dismissed stale reviews from ngeiswei and luketpeterson via e399189 January 25, 2024 05:45

vsbogd requested a review from luketpeterson January 25, 2024 06:51

luketpeterson approved these changes Jan 25, 2024

View reviewed changes

vsbogd merged commit edd23aa into trueagi-io:main Jan 25, 2024
1 of 2 checks passed

vsbogd deleted the fix-issu-530 branch January 25, 2024 15:32

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix issue 530: rename variables to the same name when returning result from match #545

Fix issue 530: rename variables to the same name when returning result from match #545

vsbogd commented Jan 24, 2024

luketpeterson Jan 24, 2024 •

edited

Loading

luketpeterson Jan 24, 2024

vsbogd Jan 24, 2024

luketpeterson Jan 24, 2024

vsbogd Jan 24, 2024

luketpeterson Jan 25, 2024

vsbogd Jan 25, 2024

luketpeterson Jan 24, 2024

ngeiswei left a comment

ngeiswei Jan 24, 2024

luketpeterson Jan 24, 2024

ngeiswei commented Jan 25, 2024

Fix issue 530: rename variables to the same name when returning result from match #545

Fix issue 530: rename variables to the same name when returning result from match #545

Conversation

vsbogd commented Jan 24, 2024

luketpeterson Jan 24, 2024 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

ngeiswei left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

ngeiswei commented Jan 25, 2024

luketpeterson Jan 24, 2024 •

edited

Loading