-
Notifications
You must be signed in to change notification settings - Fork 59
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix issue 530: rename variables to the same name when returning result from match #545
Conversation
Fix Bindings::retain logic, don't remove variables without value because it can lead to loosing information about variable equalities.
New mapping allows setting resulting variable name, decreases the complexity of removing variable from mapping. Thus massive operations now removes variables and bindings instead of keeping them as a garbage in data.
lib/src/common/holeyvec.rs
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is cool. It's a nice structure with the memory access patterns of a vec, but doesn't need to do a memmove
to remove elements from the center.
It has the advantages over a linked list of providing stable indices and constant-time index-based access for items already added.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if it makes sense to publish this to a stand-alone crate? In any case that's easily done in the future.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks Luke. I was thinking about publishing it as a standalone crate (at least I didn't find anything similar at crates.io, may be I don't know a proper name for this algorithm). It should be effective when capacity
is known in advance. But I am not sure about its effectiveness comparing to using Rc<RefCell<T>>
when the capacity is not known.
For example in case of Bindings
I want to compare performance with using Rc<RefCell<T>>
but I decided not spending additional time in this PR. I will try doing it as a part of work on unstable tests in minimal MeTTa.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
effectiveness comparing to using Rc<RefCell>
I don't follow... You mean compared with HashSet<T>
? Rc<RefCell<T>>
isn't a collection, but HoleyVec
is.
I believe Vec does some kind of speculative over-allocation internally, but I'm not totally clear on the details. I considered suggesting a with_capacity()
flavor of constructor function, but I didn't want to ask you to over-engineer it if you didn't think it was necessary.
Also, the "capacity" function probably should take into account the inner Vec's unused capacity. (https://doc.rust-lang.org/std/vec/struct.Vec.html#method.capacity) For example:
pub fn capacity(&self) -> usize {
self.vec.capacity()
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You mean compared with HashSet? Rc<RefCell> isn't a collection, but HoleyVec is.
I mean we could replace pair HashMap<VariableAtom, usize> + HoleyVec<Binding>
by HashMap<VariableAtom, Rc<RefCell<Binding>>
and then it will do memory management in a more atomic way instead of allocating a big enough vector. After I implemented HoleyVec
I thought that it basically does memory management and only advantage before alloc is that size of the components is known. If the size of the vector is known in advance HoleyVec
should be more effective as memory is preallocated but otherwise may be alloc should do it even faster.
Also, the "capacity" function probably should take into account the inner Vec's unused capacity.
Probably capacity
is misleading name. I named it capacity
but I introduced it specifically to calculate the max index which were ever used to allocate bitmaps with flags. I will rename it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see. Thanks for that explanation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Renamed in e399189
@@ -12,6 +12,7 @@ env_logger = "0.8.4" | |||
directories = "5.0.1" # For Environment to find platform-specific config location | |||
smallvec = "1.10.0" | |||
im = "15.1.0" | |||
bitset = "0.1.2" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
bitset's implementation isn't very optimal, but it'll do for now. If it ever shows up as a bottleneck we have a lot of experience writing crazy-fast bit-set operations ;-)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's largely going over my head, but it looks cool.
@@ -78,7 +78,7 @@ enum VarResolutionResult<T> { | |||
/// Abstraction of the variable set. It is used to allow passing both | |||
/// HashSet<&VariableAtom> and HashSet<VariableAtom> to the | |||
/// [Bindings::narrow_vars] method. | |||
pub trait VariableSet { | |||
pub trait VariableSet : Debug { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm no rust expert but is it normal to keep that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The style-police could come down either way on this.
The Rust stdlib tends to avoid traits bounding other traits when possible and instead puts the bounds on the impls / generics. But that can get a lot more verbose.
My personal feeling is that the compiler will do dead-code stripping on the whole binary for release, so if the code path isn't needed then it shouldn't bloat the binary. And knowing trait objects of that type also implement Debug is very convenient and doesn't hurt anything in practice.
Thanks Vitaly! |
Fixes #530. PR is a huge rework of
Bindings
implementation. New implementation is about 15% faster than previous one and logic is simpler. There are some minimal MeTTa tests which fails, the reason is the same as in #532 (comment), so I am going to work on this in separate PR.