Scaling-out comparison with Hail #39

Open
jeromekelleher opened this issue Sep 18, 2023 · 0 comments

We have a pretty good story on scaling up computation within a single server in our current "scaling properties". However, we don't have any concrete data on how we scale out. I think we need one example of a computation on real data, showing how it scales out.

We could do something like a PCA on 1000G data (in the cloud)? It doesn't have to be extensive; we just have to show that we can scale out and that we're competitive with Hail.

For a thorough comparison we can refer to Liangde's thesis, which is definitely citable and which, I think, basically says sgkit is comparable with Hail, with a few caveats.
