Store the resources in S3 buckets (data file and revisions offset files) #582
Comments
Hi @JohannesLichtenberger I would like to work on this issue. |
@sudip-unb did you make any advances or do you need help? |
Hi @JohannesLichtenberger, please give me a little bit of time. I am currently busy with my exam. I have already explored the link you provided. I can start development this weekend. |
Sure, I'm just pinging assigned people because, in my experience, they often turn out not to have time at all ;-) But I'm glad that you'll start working on it afterward. Take your time -- as I said, I just wanted to make sure that you're still up for the task :-) |
@sudip-unb any news? :) |
Hi, can I help out with this issue? |
@Yashendr you can have a look at Treetank (https://github.com/sebastiangraf/treetank) where Sebastian already implemented such a backend (also a combined storage)... |
I went through the org.sirix.io package to understand the storage types that SirixDB has currently.
In approach 2 I could refer to/use the jclouds package you mentioned. Let me know what you think about these approaches. |
I'd probably see it as a kind of automatic backup: a local file-based store plus an async store via JClouds. IIRC the pure S3 storage was way too slow. So, in short, I prefer your second option. To combine the storage approaches we can implement something as simple as this combined storage: https://github.com/sebastiangraf/treetank/tree/master/coremodules/core/src/main/java/org/treetank/io/combined BTW: If you dig a bit deeper into the storage mechanism (simply storing word-aligned page fragments instead of same-sized full pages), it would also be interesting to find out why the io_uring-based backend is currently slower than the simple FileChannel-based solution and the memory-mapped backend (I think it's somehow because of the event loop)... |
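The combined idea from the comment above (a synchronous local store plus an asynchronous bucket store acting as a backup) can be sketched roughly as follows. This is only a minimal illustration under stated assumptions: `PageStore`, `InMemoryPageStore` and `CombinedPageStore` are hypothetical names, not SirixDB's or Treetank's actual classes, and the in-memory map merely stands in for both the local file and the S3/JClouds bucket so that the sketch runs without external dependencies.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

/** Hypothetical minimal storage contract; NOT SirixDB's actual io interface. */
interface PageStore {
  void write(long pageKey, byte[] fragment);

  /** Returns a copy of the stored fragment, or null if the key is unknown. */
  byte[] read(long pageKey);
}

/** In-memory stand-in (assumption) for a local file store or a bucket store. */
final class InMemoryPageStore implements PageStore {
  private final Map<Long, byte[]> pages = new ConcurrentHashMap<>();

  @Override
  public void write(long pageKey, byte[] fragment) {
    pages.put(pageKey, fragment.clone());
  }

  @Override
  public byte[] read(long pageKey) {
    byte[] data = pages.get(pageKey);
    return data == null ? null : data.clone();
  }
}

/** Writes to the local store synchronously and mirrors to the remote store in the background. */
final class CombinedPageStore implements PageStore, AutoCloseable {
  private final PageStore local;
  private final PageStore remote;
  // A single thread keeps remote writes in the same order as local writes.
  private final ExecutorService uploader = Executors.newSingleThreadExecutor();
  private CompletableFuture<Void> lastUpload = CompletableFuture.completedFuture(null);

  CombinedPageStore(PageStore local, PageStore remote) {
    this.local = local;
    this.remote = remote;
  }

  @Override
  public synchronized void write(long pageKey, byte[] fragment) {
    local.write(pageKey, fragment);   // fast path: the caller only waits for this
    byte[] copy = fragment.clone();   // protect the upload from later caller mutations
    lastUpload = CompletableFuture.runAsync(() -> remote.write(pageKey, copy), uploader);
  }

  @Override
  public byte[] read(long pageKey) {
    byte[] data = local.read(pageKey);
    if (data == null) {               // local miss: fall back to the backup
      data = remote.read(pageKey);
      if (data != null) {
        local.write(pageKey, data);   // re-populate the local store
      }
    }
    return data;
  }

  /** Blocks until the most recently queued upload is done; retries and error handling are omitted. */
  void awaitUploads() {
    CompletableFuture<Void> pending;
    synchronized (this) {
      pending = lastUpload;
    }
    pending.join();
  }

  @Override
  public void close() {
    awaitUploads();
    uploader.shutdown();
  }
}
```

In a real backend the remote `PageStore` would wrap a blob-store client instead of a map, and a failed upload would need to be retried or surfaced; both are left out here on purpose.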
Ok looking at the above for combined, here is my understanding of the requirement:
Thanks for working on this :-) and probably the upcoming task |
@sband Hey, do you need help with this? Do you have anything in particular you would like me to do? |
@Yashendr sure I will let you know if I need any help around this. |
This is in progress. I am hoping to complete this by this coming Friday... |
quick question:
Hi @sband, I'd simply read and write the page(-fragments) from/to S3 buckets. If we want a local cache and/or use S3 as a backup more or less I'd use the AWS is okay :-) In the future, we could also support, for instance, writing to/reading from Kafka or Pulsar/BookKeeper... However, what I'm even more interested in is making the local storage as fast as possible first, before even using horizontal scaling/sharding... So I'd rather be interested in why the io_uring storage is currently slower (on my notebook at least) than the FileChannel-based approach. |
Furthermore, it's kind of sad that Intel Optane Non-Volatile Memory isn't produced anymore, as the page(-fragments) are not aligned to a predefined size: if only a few nodes change, then due to the sliding snapshot algorithm mainly just these nodes are written to a new location instead of the full page (thus generating a page-fragment). However, for io_uring, I guess it would be great to have classes of page sizes and to use predefined buffers (as in Umbra from Thomas Neumann...). |
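To make the "classes of page sizes with predefined buffers" idea a bit more concrete, here is a rough sketch of a size-class buffer pool. It is an assumption-laden illustration only: the class name `SizeClassBufferPool`, the power-of-two classes and the direct `ByteBuffer`s are choices made for this example, not something taken from SirixDB or from Umbra's implementation, and the pool is deliberately not thread-safe.

```java
import java.nio.ByteBuffer;
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical pool that rounds every request up to a power-of-two size class and reuses buffers. */
final class SizeClassBufferPool {
  private final int minSize;                 // capacity of class 0, a power of two
  private final Deque<ByteBuffer>[] free;    // one free list per size class

  @SuppressWarnings("unchecked")
  SizeClassBufferPool(int minSize, int classCount) {
    if (minSize <= 0 || Integer.bitCount(minSize) != 1) {
      throw new IllegalArgumentException("minSize must be a positive power of two");
    }
    if (classCount <= 0 || Integer.numberOfTrailingZeros(minSize) + classCount - 1 > 30) {
      throw new IllegalArgumentException("classCount out of range");
    }
    this.minSize = minSize;
    this.free = new Deque[classCount];
    for (int i = 0; i < classCount; i++) {
      free[i] = new ArrayDeque<>();
    }
  }

  /** Capacity in bytes of the given class: minSize, 2 * minSize, 4 * minSize, ... */
  int capacityOf(int sizeClass) {
    return minSize << sizeClass;
  }

  /** Smallest class whose capacity is at least {@code length} bytes. */
  int sizeClassFor(int length) {
    if (length <= 0) {
      throw new IllegalArgumentException("length must be positive");
    }
    int rounded = Math.max(minSize, length);
    // ceil(log2(rounded)) minus log2(minSize)
    int sizeClass = (32 - Integer.numberOfLeadingZeros(rounded - 1))
        - Integer.numberOfTrailingZeros(minSize);
    if (sizeClass >= free.length) {
      throw new IllegalArgumentException("length exceeds the largest size class");
    }
    return sizeClass;
  }

  /** Returns a buffer with position 0 and limit {@code length}, reusing a pooled one if possible. */
  ByteBuffer acquire(int length) {
    int sizeClass = sizeClassFor(length);
    ByteBuffer buffer = free[sizeClass].pollFirst();
    if (buffer == null) {
      buffer = ByteBuffer.allocateDirect(capacityOf(sizeClass));
    }
    buffer.clear();
    buffer.limit(length);
    return buffer;
  }

  /** Hands a buffer back to the free list of its size class. */
  void release(ByteBuffer buffer) {
    int sizeClass = sizeClassFor(buffer.capacity());
    if (capacityOf(sizeClass) != buffer.capacity()) {
      throw new IllegalArgumentException("buffer was not allocated by this pool");
    }
    free[sizeClass].addFirst(buffer);
  }
}
```

With `new SizeClassBufferPool(4096, 4)`, a 5000-byte page fragment would land in the 8192-byte class, so a fixed set of pre-allocated buffers could be reused for reads and writes while the fragments themselves remain variable-sized.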
Thanks for working on SirixDB, BTW :-) really looking forward to your PR (and maybe future contributions?) |
I have created a DRAFT pull request for this #611 |
Hi, |
@ighosh98 do you intend to work on this? |
Hi @JohannesLichtenberger, yes, I will be working on this issue. |
@ighosh98 ping |
Hi. I've been tied up with some work. It would take me some time to raise the PR. If someone else can develop it faster, they can take over. |
As an alternative backend, or as a combination of a local store and an S3 bucket store, we could add a new storage type using the JClouds blob store, for instance (as in Treetank, the former project from which this project was forked).