[Platform Initiative] Hub Scale Cost Monitoring for AWS #4384
Comments
I've considered OpenCost, and discarded it as not being able to satisfy our needs. |
Just reading the Cloudbank ACM paper and they mentioned a closed source solution called Nutanix BEAM. Are we specifically leaning into open-source solutions here? |
Has anyone checked out OpenCost? It's based on Prometheus. It looks like it might have potential for dedicated clusters at least. |
@Gman0909 see yuvi's comment above 😆 |
D'oh. |
@Gman0909 and @haroldcampbell will have an offline conversation about how we refine the tasks associated with this (particularly how we get unblocked). There are a lot of unknowns even after spikes (see #4453). |
Current update: we have a replicable solution for AWS; what's next? We need a definition of done and would like a showcase. James, Jenny & Jim would like to be able to use this, get feedback from Openscapes, and be able to show/tell others how to use it. Note from @yuvipanda: the Openscapes folks are giving feedback via an informal showcase every week, and the metrics we chose were used to determine what features to build. From @Gman0909: https://grafana.openscapes.2i2c.cloud/d/edw06h7udjwg0b/cloud-cost-attribution?orgId=1&from=now%2FfQ&to=now%2FfQ |
Action to take next:
|
I've now updated the top comment to contain all the remaining open issues related to this initiative and closed the parent epic issue. There is still #4872 (comment) that isn't captured yet. If this was a feature request that @Gman0909 or @jnywong has more context about, would you mind opening an issue about it, or sharing the context here so I can open the issue later? Thank you! |
Thanks @GeorgianaElena ! I don't have context for that comment – would be useful to record this as an insight on ProductBoard @consideRatio if you haven't already 👍 |
https://github.com/2i2c-org/meta/issues/1511#issuecomment-2393887518 best captures the context for what I meant in #4872 (comment): it was a feature that Tasha Snow observed to be relevant and missing. @Gman0909 summarized this:
Technically, I figure implementing this would mean adding a new panel to the dashboard that uses data filtered on "compute" costs for separate hubs, and then groups by a new AWS tag such as It isn't obvious it should be implemented just because it was found relevant, but I wanted to help ensure we don't forget this observation, so that a go/no-go decision can be made. |
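To make the panel idea above concrete, here is a minimal Python sketch of the aggregation it implies: keep only "compute" cost records, then sum them per hub and per tag value. The record fields (`hub`, `category`, `tags`, `cost`) and the tag key `component` are hypothetical placeholders chosen for illustration; they are not the real billing schema or the tag name discussed in the thread.

```python
from collections import defaultdict

# Hypothetical cost records. Field names and the "component" tag key are
# assumptions for illustration only, not the actual AWS billing schema.
records = [
    {"hub": "prod", "category": "compute", "tags": {"component": "user-node"}, "cost": 12.5},
    {"hub": "prod", "category": "compute", "tags": {"component": "dask-worker"}, "cost": 7.5},
    {"hub": "staging", "category": "compute", "tags": {"component": "user-node"}, "cost": 2.0},
    {"hub": "prod", "category": "home storage", "tags": {}, "cost": 4.0},
]


def compute_cost_by_tag(records, tag_key):
    """Sum compute costs per (hub, tag value); other categories are skipped."""
    totals = defaultdict(float)
    for record in records:
        if record["category"] != "compute":
            continue
        # Records without the tag are grouped under "untagged".
        tag_value = record["tags"].get(tag_key, "untagged")
        totals[(record["hub"], tag_value)] += record["cost"]
    return dict(totals)


panel_data = compute_cost_by_tag(records, "component")
```

In a real dashboard this grouping would more likely live in the panel's query than in Python; the sketch only shows the shape of the data such a panel would need.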
The task list has been converted to sub issues for better tracking. @GeorgianaElena and @sgibson91, the original intent was to try and get this wrapped up by the end of the sprint, are you confident we're on track to achieve that? |
This wasn't a planning error; @yuvipanda and I decided to hold off on the EFS split until we saw how the EBS migration was going. Given that it is going OK, I've closed #5077, as the EBS migrations that will happen as part of #5010 will supersede the EFS split work.
Yes, we're on track |
@GeorgianaElena At the minute, #5010 does not track rolling out EBS to all AWS hubs - only nmfs-openscapes, cryocloud and veda |
I guess I'm just raising that there is now a dependency between this initiative and #5010 and I don't know how that affects the Definition of Done here. From my perspective, the EBS work will not be done by the end of this sprint. I don't even intend to start work on the next community until next sprint. So that does mean that there won't be complete cost info in grafana for all AWS hubs by the end of this sprint. Maybe we're fine with that and we just adjust the DoD accordingly? |
This initiative was considered complete even without the nodegroup and homedirs split. So it is my understanding that we are fine with the current plan. However, because I wasn't involved throughout the entire timeline of the initiative I will leave @yuvipanda to chime in if additional historical context is needed. |
Since all the sub-issues of this initiative have been wrapped up, I will close this issue 🎉 Thank you all for all the work on this one! ❤ |
Thanks so much to everyone who contributed to this massive effort! |
Productboard link: https://2i2c.productboard.com/roadmap/7947557-2i2c-roadmap/features/26823459
Description
Institutional leads, as well as department leads in large organizations, need to be able to justify their budgets and ensure they are being spent with value in mind. Business intelligence depends on data, and we want to make sure we build towards a data reporting infrastructure that can make a hub or constellation of hubs' usage and cost more transparent, enabling better decision making come budget time, as well as offering a sense of security and transparency over a service that is often perceived as being a high risk for cost overruns.
To that end, we would like to give community and institutional leaders the ability to monitor the cost and usage of a hub, or groups of hubs, provided by 2i2c.
The solution should provide a dashboard that automatically updates to reflect up-to-date aggregated cost and usage reports for each hub in a constellation, or for the single hub an administrator has admin rights over. Data should be exportable in the form of reports.
Additionally, we should investigate adding an option to share the dashboard with individuals outside of those with administrative privileges.
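As a rough illustration of the "aggregate per hub, then export as a report" requirement above, here is a minimal Python sketch. The input rows, hub names, and column names are made up for the example, and CSV is only one assumed report format; nothing here reflects how the actual dashboard is implemented.

```python
import csv
import io
from collections import defaultdict

# Hypothetical daily cost rows; hub names and values are illustrative only.
daily_costs = [
    {"date": "2024-10-01", "hub": "prod", "cost": 10.0},
    {"date": "2024-10-01", "hub": "staging", "cost": 1.5},
    {"date": "2024-10-02", "hub": "prod", "cost": 12.0},
]


def total_cost_per_hub(rows):
    """Aggregate cost across all days for each hub."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["hub"]] += row["cost"]
    return dict(totals)


def export_report(totals):
    """Render per-hub totals as CSV text, sorted by hub name."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["hub", "total_cost"])
    for hub in sorted(totals):
        writer.writerow([hub, f"{totals[hub]:.2f}"])
    return buffer.getvalue()


report = export_report(total_cost_per_hub(daily_costs))
```

The same totals could back both a dashboard panel and a downloadable report, which keeps the two views consistent with each other.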
Typical use cases:
Scope
We already have a document listing the things cloud providers charge you for. The things we care about, in priority order, are:
Each of these costs should be attributable to either:
Attributing costs to individual users, or to specific subgroups inside a hub, is out of scope.
Definition of Done
Admins of any 2i2c hub can access dashboards and reports where they can monitor up-to-date cost information for their hubs, and export reports with that same information.