I had some thoughts about this #1

Open
crashfrog opened this issue Aug 30, 2017 · 1 comment

crashfrog commented Aug 30, 2017

Nothing fully formed yet, but I think a powerful tool for users of database-dependent tools, developers of those tools, and curators of those databases might have the following features:

  1. Management of downloaded/installed databases via a system daemon, perhaps akin to the Docker daemon, to provide a consistent interface for managing and retrieving databases;

  2. A daemon able to report useful information about the version and change history of a database, and to restore a database to an earlier version on demand so that earlier analyses can be fully repeated;

  3. APIs and language bindings in Java, Python, Perl, and C (tools like Protocol Buffers make this somewhat easier) so that developers can have their bioinformatics tools and pipelines interrogate the daemon, if necessary, to resolve references to the databases they need;

  4. Distributed storage of databases, perhaps via IPFS, to prevent traffic and bandwidth bottlenecks across cluster environments and elsewhere, with the daemon perhaps able to make intelligent decisions about which databases are hosted locally and which are pulled from the distributed network on demand;

  5. A secure, content-based addressing system so that the same system can distribute both open and closed databases and the integrity of the data can be assured.
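
To make points 2 and 5 a bit more concrete, here is a minimal sketch of how content-based addressing and version history could fit together. Everything in it is an assumption for illustration: the `DatabaseRegistry` class, its method names, the `example-db` name, and the choice of SHA-256 are hypothetical, and the in-memory dictionaries merely stand in for whatever the daemon would actually persist.

```python
import hashlib
from typing import Dict, List


def content_address(data: bytes) -> str:
    """Return the SHA-256 hex digest of the data, used here as its address."""
    return hashlib.sha256(data).hexdigest()


class DatabaseRegistry:
    """Hypothetical in-memory stand-in for the daemon's version store."""

    def __init__(self) -> None:
        self._blobs: Dict[str, bytes] = {}        # address -> content
        self._history: Dict[str, List[str]] = {}  # name -> addresses, oldest first

    def publish(self, name: str, data: bytes) -> str:
        """Store a new version of a database and return its content address."""
        address = content_address(data)
        self._blobs[address] = data
        self._history.setdefault(name, []).append(address)
        return address

    def history(self, name: str) -> List[str]:
        """Return the addresses of every published version, oldest first."""
        return list(self._history.get(name, []))

    def fetch(self, address: str) -> bytes:
        """Return the content at an address, re-checking its integrity."""
        data = self._blobs[address]
        if content_address(data) != address:
            raise ValueError("content does not match its address")
        return data


# Usage: publish two versions, then retrieve the earlier one by address.
registry = DatabaseRegistry()
v1 = registry.publish("example-db", b"release 1")
v2 = registry.publish("example-db", b"release 2")
earlier = registry.fetch(registry.history("example-db")[0])
```

Because the address is derived from the content, restoring an earlier version is just a lookup of an older address, and the integrity check is the same hash computation that produced the address in the first place.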

Right now I imagine a system that's a bit like a mash-up of Git and the user experience of Docker, but for big databases instead of containers, perhaps running on top of IPFS to handle distribution.
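
As a rough sketch of the "hosted locally versus pulled on demand" decision, a least-recently-used cache in front of the distributed network is one simple policy the daemon could start from. This is only an assumption about how it might work: `LocalCache`, `fetch_remote`, and the byte budget are hypothetical names, and the callable stands in for a real IPFS fetch rather than using any actual IPFS API.

```python
from collections import OrderedDict
from typing import Callable


class LocalCache:
    """Hypothetical LRU policy: keep recently used databases local, pull the rest."""

    def __init__(self, fetch_remote: Callable[[str], bytes], max_bytes: int) -> None:
        self._fetch_remote = fetch_remote  # stand-in for a fetch from the network
        self._max_bytes = max_bytes        # local storage budget
        self._local: "OrderedDict[str, bytes]" = OrderedDict()

    def is_local(self, address: str) -> bool:
        return address in self._local

    def get(self, address: str) -> bytes:
        """Return the database at an address, fetching it remotely on a miss."""
        if address in self._local:
            self._local.move_to_end(address)  # mark as most recently used
            return self._local[address]
        data = self._fetch_remote(address)
        self._local[address] = data
        # Evict least recently used entries while over budget,
        # always keeping the entry that was just fetched.
        while self._used() > self._max_bytes and len(self._local) > 1:
            self._local.popitem(last=False)
        return data

    def _used(self) -> int:
        return sum(len(blob) for blob in self._local.values())
```

A smarter daemon might weigh database size, download cost, or cluster-wide demand instead of recency alone; LRU is used here only because it is easy to reason about.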

crashfrog commented

I propose the working name 'beryl' for this project.
