Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix issue 530: rename variables to the same name when returning result from match #545

Merged
merged 5 commits into from
Jan 25, 2024

Conversation

vsbogd
Copy link
Collaborator

@vsbogd vsbogd commented Jan 24, 2024

Fixes #530. PR is a huge rework of Bindings implementation. New implementation is about 15% faster than previous one and logic is simpler. There are some minimal MeTTa tests which fails, the reason is the same as in #532 (comment), so I am going to work on this in separate PR.

Fix Bindings::retain logic, don't remove variables without value because
it can lead to loosing information about variable equalities.
New mapping allows setting resulting variable name, decreases the
complexity of removing variable from mapping. Thus massive operations
now removes variables and bindings instead of keeping them as a garbage
in data.
@vsbogd vsbogd changed the title Fix issue 530 Fix issue 530: rename variables to the same name when returning result from match Jan 24, 2024
luketpeterson
luketpeterson previously approved these changes Jan 24, 2024
Copy link
Contributor

@luketpeterson luketpeterson Jan 24, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is cool. It's a nice structure with the memory access patterns of a vec, but doesn't need to do a memmove to remove elements from the center.

It has the advantages over a linked list of providing stable indices and constant-time index-based access for items already added.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if it makes sense to publish this to a stand-alone crate? In any case that's easily done in the future.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Luke. I was thinking about publishing it as a standalone crate (at least I didn't find anything similar at crates.io, may be I don't know a proper name for this algorithm). It should be effective when capacity is known in advance. But I am not sure about its effectiveness comparing to using Rc<RefCell<T>> when the capacity is not known.

For example in case of Bindings I want to compare performance with using Rc<RefCell<T>> but I decided not spending additional time in this PR. I will try doing it as a part of work on unstable tests in minimal MeTTa.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

effectiveness comparing to using Rc<RefCell>

I don't follow... You mean compared with HashSet<T>? Rc<RefCell<T>> isn't a collection, but HoleyVec is.

I believe Vec does some kind of speculative over-allocation internally, but I'm not totally clear on the details. I considered suggesting a with_capacity() flavor of constructor function, but I didn't want to ask you to over-engineer it if you didn't think it was necessary.

Also, the "capacity" function probably should take into account the inner Vec's unused capacity. (https://doc.rust-lang.org/std/vec/struct.Vec.html#method.capacity) For example:

    pub fn capacity(&self) -> usize {
        self.vec.capacity()
    }

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You mean compared with HashSet? Rc<RefCell> isn't a collection, but HoleyVec is.

I mean we could replace pair HashMap<VariableAtom, usize> + HoleyVec<Binding> by HashMap<VariableAtom, Rc<RefCell<Binding>> and then it will do memory management in a more atomic way instead of allocating a big enough vector. After I implemented HoleyVec I thought that it basically does memory management and only advantage before alloc is that size of the components is known. If the size of the vector is known in advance HoleyVec should be more effective as memory is preallocated but otherwise may be alloc should do it even faster.

Also, the "capacity" function probably should take into account the inner Vec's unused capacity.

Probably capacity is misleading name. I named it capacity but I introduced it specifically to calculate the max index which were ever used to allocate bitmaps with flags. I will rename it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. Thanks for that explanation.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Renamed in e399189

@@ -12,6 +12,7 @@ env_logger = "0.8.4"
directories = "5.0.1" # For Environment to find platform-specific config location
smallvec = "1.10.0"
im = "15.1.0"
bitset = "0.1.2"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

bitset's implementation isn't very optimal, but it'll do for now. If it ever shows up as a bottleneck we have a lot of experience writing crazy-fast bit-set operations ;-)

ngeiswei
ngeiswei previously approved these changes Jan 24, 2024
Copy link
Contributor

@ngeiswei ngeiswei left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's largely going over my head, but it looks cool.

@@ -78,7 +78,7 @@ enum VarResolutionResult<T> {
/// Abstraction of the variable set. It is used to allow passing both
/// HashSet<&VariableAtom> and HashSet<VariableAtom> to the
/// [Bindings::narrow_vars] method.
pub trait VariableSet {
pub trait VariableSet : Debug {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm no rust expert but is it normal to keep that?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The style-police could come down either way on this.

The Rust stdlib tends to avoid traits bounding other traits when possible and instead puts the bounds on the impls / generics. But that can get a lot more verbose.

My personal feeling is that the compiler will do dead-code stripping on the whole binary for release, so if the code path isn't needed then it shouldn't bloat the binary. And knowing trait objects of that type also implement Debug is very convenient and doesn't hurt anything in practice.

@vsbogd vsbogd dismissed stale reviews from ngeiswei and luketpeterson via e399189 January 25, 2024 05:45
@vsbogd vsbogd requested a review from luketpeterson January 25, 2024 06:51
@vsbogd vsbogd merged commit edd23aa into trueagi-io:main Jan 25, 2024
1 of 2 checks passed
@ngeiswei
Copy link
Contributor

Thanks Vitaly!

@vsbogd vsbogd deleted the fix-issu-530 branch January 25, 2024 15:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Unification fails to substitute via most specific unifier when variables are bound to variables
3 participants