RRB Trees: Efficient Immutable Vectors #72
Comments
I've been working on an implementation of RRB Trees (I call them RRBVectors in my implementation) in a private Bitbucket repo for a while now, and it's getting close to completion. I've started cleaning it up, removing duplicate code, and fleshing out the API to get it ready to add to the FSharpx.Collections. Once it's fully ready (which might take another couple months since I only have sporadic free time to work on it), I'll submit it as a PR. There are a few implementation details, though, that are worth discussing before the code is ready because they would require a few changes to PersistentVector:
|
I want to clean up the code some more before I submit a PR, but anyone interested in seeing my first pass at an implementation can start looking at https://github.com/rmunn/FSharpx.Collections/tree/rrb-vector. A couple of concerns I need to discuss with other people before merging this in:
But I'm ready for other people to look through my code so far and review it. The main code is found in https://github.com/rmunn/FSharpx.Collections/blob/rrb-vector/src/FSharpx.Collections.Experimental/RRBVector.fs and the main tests are found in https://github.com/rmunn/FSharpx.Collections/blob/rrb-vector/tests/ExpectoRRB/RRBVectorExpectoTest.fs. |
@rmunn Is there any way I can help you? Would be great to merge it someday. :) |
merged #87 |
Thanks @forki! I'll see what I can do about getting the RRB vector code into a mergeable state now. |
Just a friendly reminder. ;) |
@rmunn are you still interested in submitting a PR to Collections.Experimental? I'm hoping to get a 2.0 release out in the next few weeks. |
@jackfoxy I am still interested, but I've had practically zero time to work on coding the past couple of months. I do have the code in a better state than it was back in March, but I think it needs more work before it's mergeable. OTOH, the right answer to "I have zero time" is probably to submit a PR in its current state, and let other people do any remaining work that's lacking. The two biggest issues with the code as I have it right now are:
If an FParsec build-time dependency is acceptable, then my code is already mergeable and I can get a PR in the next few days. |
@rmunn I think this is all fine for submission to Collections.Experimental, which is after all "experimental". |
@jackfoxy I've had a little bit more time to work on RRBVector, and discovered that my code isn't quite ready to release yet: my most recent change (implementing transients so that things like
Please don't hold up the FSharpx.Collections release on my account, since it might be another month or two before I truly have the code ready to go. (Without going into details of my personal life, free time is a vanishingly rare commodity for me right now, and my estimate of "a PR in the next few days" was wildly optimistic). But for the sake of @gsomix and anyone else waiting to see the code, I'll definitely get it up in its own repo and start releasing NuGet packages of the work-in-progress. |
@rmunn No worries. At a minimum I want to document all the main Collections data structures and fix known issues in the main Collection before releasing, and I haven't touched it in a couple of weeks. |
Friendly ping here :) |
I'm getting rather close to having something releasable. Right now I'm polishing up my tests for 100% code coverage, to make sure I don't have any lurking bugs in this rather complex piece of code. (I thought I had that licked last month, but just last week I found and fixed more bugs). I'll post another comment in this thread when it's ready for alpha testing. |
After far more bugfixing than I had anticipated, I've finally got the last bug (that I know of) licked, so I'm ready to make my RRBVector implementation public! So that I can respond more quickly to bug reports and feature requests, I've created a separate GitHub project named Ficus and created a v0.0.1 NuGet package for my implementation. Anyone interested in alpha-testing my 0.0.1 release, please submit bug reports and/or suggestions for API improvements in the Ficus issue tracker. Once I'm confident that the code is not only bug-free, but also has an API that's useful to people, I intend to submit it as a PR to FSharpx.Collections. My plan moving forward is to use Ficus as a sort of "staging" repo for my work on the RRBVector implementation, and submit any major improvements to FSharpx.Collections. Basically, SemVer patch releases will stay in the Ficus project, but any releases where I bump the SemVer minor version number (and certainly any releases where I bump the major version number) I'll also submit as a PR to FSharpx.Collections. |
I propose to add an implementation of Bagwell and Rompf's RRB-Tree Vectors (Relaxed Radix-Balanced Trees) to FSharpx.Collections. They are similar to PersistentVector, but allow for efficient slicing and concatenation. Slicing an RRB vector is effectively constant-time since it takes O(log32 N) time. Concatenating two RRB Vectors together is also O(log32 N) but with a large constant multiplier, as much as 1024 in the worst case, so it should really be considered O(log N) for all practical purposes.
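To make "effectively constant-time" concrete, here is a rough sketch of how slowly the tree height grows with a branching factor of 32. The function name `rrb_tree_height` is made up for illustration and is not part of any RRB vector implementation. It also assumes a fully dense tree; a relaxed tree may end up somewhat taller than this estimate.

```python
def rrb_tree_height(n: int, branching: int = 32) -> int:
    """Levels needed by a fully dense radix tree to hold n items.

    Hypothetical helper for illustration only; it assumes every node is
    full, which a relaxed (RRB) tree does not guarantee.
    """
    height = 1
    capacity = branching  # items a tree of the current height can hold
    while capacity < n:
        capacity *= branching
        height += 1
    return height


if __name__ == "__main__":
    for n in (1_000, 1_000_000, 1_000_000_000):
        print(f"{n:>13,} items -> height {rrb_tree_height(n)}")
```

Under that assumption, even a billion items needs only 6 levels, which is why an O(log32 N) operation behaves like a constant-time one in practice.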
Clojure implementation of RRB vectors: https://github.com/clojure/core.rrb-vector
Scala implementation: https://github.com/nicolasstucki/scala-rrb-vector
Papers about RRB vectors: