Two tree copies for parallel reading & writing #101

Open
bitbanger opened this issue Jun 23, 2023 · 0 comments


Lane Lawley
4:46 PM
Prediction is separate from training, right?

Chris MacLellan
4:46 PM
and they intersect and deadlock
4:46
well probably
4:46
but not necessarily
4:47
you might queue up a bunch of training and then do some categorization while training is happening

Lane Lawley
4:47 PM
Right, but what I mean is
4:47
What if the server always had a "last post-insertion snapshot"
4:47
That could not be written to
4:47
But was free to use for infinitely parallel reads
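A minimal sketch of the "last post-insertion snapshot" idea, assuming a plain dict as a stand-in for the tree. `SnapshotServer`, `insert`, and `predict` are hypothetical names, not part of any existing Cobweb API. Writers are serialized by a lock; readers never take one, because they only ever see a snapshot that is no longer mutated.

```python
import threading


class SnapshotServer:
    """Hypothetical wrapper: one writer-side tree plus a frozen snapshot for readers."""

    def __init__(self):
        self._write_lock = threading.Lock()  # serializes insertions only
        self._working = {}                   # stand-in for the mutable tree
        self._snapshot = {}                  # last post-insertion snapshot; never mutated

    def insert(self, key, value):
        with self._write_lock:
            self._working[key] = value
            # Publish a fresh copy. Rebinding the attribute is a single
            # reference swap, so a reader sees either the old snapshot or
            # the new one, never a half-updated structure.
            self._snapshot = dict(self._working)

    def predict(self, key):
        snap = self._snapshot  # grab the reference once; no lock needed
        return snap.get(key)
```

Copying the whole structure on every insert is only for illustration; the point is that reads run against whatever snapshot was current when they started.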

Chris MacLellan
4:48 PM
We should talk about some kind of time based reading like that
4:48
I was thinking about multi-layer cobweb where later layers are dynamically computing the basic level of nodes as categorization is happening
4:48
we probably do want to be able to read/write at the same time
4:48
or insert and predict simultaneously

Lane Lawley
4:48 PM
Our goal isn't to have a single Cobweb tree that's parallelized and locked for all its use cases
4:49
Our goal is to build Cobweb trees faster
4:49
So I don't think we necessarily need to try and fit all of our operations into the concurrency model
4:49
Just writes

Chris MacLellan
4:49 PM
Fair enough, I think that makes sense for now. For prediction we can do something like a read lock on the whole tree, so you'd have to wait for insertions to finish before being able to make a prediction

Lane Lawley
4:50 PM
Yeah, although that would prevent more writes from happening in the meantime
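The tradeoff in the last two messages can be sketched with a whole-tree readers-writer lock. Python's standard library does not ship one, so `TreeRWLock` below is a hypothetical helper built on `threading.Condition`. It is reader-preferring and makes no attempt at fairness.

```python
import threading
from contextlib import contextmanager


class TreeRWLock:
    """Minimal readers-writer lock (hypothetical helper, not in the stdlib)."""

    def __init__(self):
        self._cond = threading.Condition()
        self._readers = 0      # number of predictions currently in progress
        self._writing = False  # True while an insertion holds the tree

    @contextmanager
    def read(self):
        with self._cond:
            while self._writing:  # predictions wait for an insertion to finish
                self._cond.wait()
            self._readers += 1
        try:
            yield
        finally:
            with self._cond:
                self._readers -= 1
                self._cond.notify_all()

    @contextmanager
    def write(self):
        with self._cond:
            while self._writing or self._readers:  # insertions wait for all reads
                self._cond.wait()
            self._writing = True
        try:
            yield
        finally:
            with self._cond:
                self._writing = False
                self._cond.notify_all()


lock = TreeRWLock()
events = []


def writer():
    with lock.write():
        events.append("write")


with lock.read():
    t = threading.Thread(target=writer)
    t.start()
    t.join(timeout=0.2)  # the writer is still blocked while the read lock is held
    events.append("read done")
t.join()
```

The usage at the bottom shows the cost being pointed out here: an insertion that arrives during a prediction cannot start until the prediction releases the lock.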

Chris MacLellan
4:50 PM
or something like that, where we basically assume you won't be training and testing at the same time
4:50
yes, the solution we probably want is something like what you described

Lane Lawley
4:50 PM
You can either read from the tree C_n and block writes, or copy C_n and read while new trees are being built

Chris MacLellan
4:50 PM
time based read only

Lane Lawley
4:51 PM
It's not a concern that your predictions won't be on the new tree because they wouldn't have been anyway if you'd locked writes

Chris MacLellan
4:51 PM
you can categorize and insert at the same time
4:51
no problem there, but you can't predict from a node
4:51
the issue is when you walk up the tree and intersect with something coming down the tree
4:51
I was thinking there might be a simple solution to do locks from the top down for the prediction

Lane Lawley
4:51 PM
That wouldn't happen

Chris MacLellan
4:51 PM
but 🤷

Lane Lawley
4:52 PM
Think of it in the reverse way. You do lock the entire tree for reads, but then as writes queue up and wait for the read to finish, the writes are actually given express tickets to be written to a new copy
4:52
And when the reads are over, that new copy becomes the main tree
4:52
It's functionally similar to that

Chris MacLellan
4:53 PM
ah, but wouldn't that require us to copy the tree?

Lane Lawley
4:53 PM
Yes, but it's incremental; you only have to copy deltas

Chris MacLellan
4:53 PM
a tree delta

Lane Lawley
4:53 PM
Yes

Chris MacLellan
4:53 PM
I like this idea
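One way to read "you only have to copy deltas" is path copying over immutable nodes: an insertion builds new nodes only along the route it touches and shares every other subtree with the previous tree. The sketch below is a toy under that assumption. `Node` and `insert_along` are hypothetical, the node only tracks a count, and real Cobweb operations such as merge and split are not modeled.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Node:
    count: int = 0
    children: Tuple["Node", ...] = ()


def insert_along(node, path):
    """Return a new root with counts bumped along `path`; `node` is untouched.

    `path` is a sequence of child indices standing in for whatever route an
    insertion would take. An index equal to len(children) appends a new child;
    larger indices are not supported by this toy.
    """
    if not path:
        return Node(node.count + 1, node.children)
    i, rest = path[0], path[1:]
    old_child = node.children[i] if i < len(node.children) else Node()
    new_child = insert_along(old_child, rest)
    # Rebuild only this node's child tuple; sibling subtrees are reused as-is.
    children = node.children[:i] + (new_child,) + node.children[i + 1:]
    return Node(node.count + 1, children)


c0 = Node()
c1 = insert_along(c0, [0])  # tree C_1: root with one child
c2 = insert_along(c1, [1])  # tree C_2: adds a second child, shares the first
```

Here `c1` stays valid for readers while `c2` is built, and `c2.children[0]` is the same object as `c1.children[0]`, so the cost of the new tree is proportional to the path length rather than the tree size.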
