Merge pull request 2i2c-org#4675 from consideRatio/pr/aws-ce-grafana-backend

Add aws-ce-grafana-backend chart, terraform AWS IAM, and POC Python code: for cost monitoring in Grafana via AWS Cost Explorer API
consideRatio authored Aug 26, 2024
2 parents 9d4d2a0 + 82e3ef2 commit 5e10335
Showing 30 changed files with 1,103 additions and 5 deletions.
2 changes: 1 addition & 1 deletion .pre-commit-config.yaml
@@ -57,7 +57,7 @@ repos:
hooks:
- id: sops-encryption
# Add files here if they contain the word 'secret' but should not be encrypted
exclude: secrets\.md|helm-charts/support/templates/prometheus-ingres-auth/secret\.yaml|helm-charts/basehub/templates/dex/secret\.yaml|helm-charts/basehub/templates/static/secret\.yaml|config/clusters/templates/common/support\.secret\.values\.yaml|helm-charts/basehub/templates/ingress-auth/secret\.yaml
exclude: secrets\.md|helm-charts/support/templates/prometheus-ingres-auth/secret\.yaml|helm-charts/basehub/templates/dex/secret\.yaml|helm-charts/basehub/templates/static/secret\.yaml|config/clusters/templates/common/support\.secret\.values\.yaml|helm-charts/basehub/templates/ingress-auth/secret\.yaml|helm-charts/aws-ce-grafana-backend/templates/secret\.yaml

# Prevent known typos from being committed
- repo: https://github.com/codespell-project/codespell
4 changes: 4 additions & 0 deletions deployer/commands/validate/config.py
@@ -64,6 +64,10 @@ def _prepare_helm_charts_dependencies_and_schemas():
    _generate_values_schema_json(support_dir)
    subprocess.check_call(["helm", "dep", "up", support_dir])

    aws_ce_grafana_backend = HELM_CHARTS_DIR.joinpath("aws-ce-grafana-backend")
    _generate_values_schema_json(aws_ce_grafana_backend)
    subprocess.check_call(["helm", "dep", "up", aws_ce_grafana_backend])


def get_list_of_hubs_to_operate_on(cluster_name, hub_name):
    config_file_path = find_absolute_path_to_cluster_file(cluster_name)
33 changes: 33 additions & 0 deletions helm-charts/aws-ce-grafana-backend/.helmignore
@@ -0,0 +1,33 @@
# Anything within the root folder of the Helm chart, where Chart.yaml resides,
# will be embedded into the packaged Helm chart. This is reasonable because the
# template logic that determines whether other files are referenced, such as
# our files/hub/jupyterhub_config.py, is only evaluated when the templates
# render, after the chart has been packaged and distributed.
#
# Here are files that we intentionally ignore to avoid them being packaged,
# because we don't want to reference them from our templates anyhow.
values.schema.yaml

# Patterns to ignore when building packages.
# This supports shell glob matching, relative path matching, and
# negation (prefixed with !). Only one pattern per line.
.DS_Store
# Common VCS dirs
.git/
.gitignore
.bzr/
.bzrignore
.hg/
.hgignore
.svn/
# Common backup files
*.swp
*.bak
*.tmp
*.orig
*~
# Various IDEs
.project
.idea/
*.tmproj
.vscode/
17 changes: 17 additions & 0 deletions helm-charts/aws-ce-grafana-backend/Chart.yaml
@@ -0,0 +1,17 @@
# Chart.yaml v2 reference: https://helm.sh/docs/topics/charts/#the-chartyaml-file
apiVersion: v2
name: aws-ce-grafana-backend
version: "0.0.1-set.by.chartpress"
appVersion: "1.0.0"
description:
  An intermediate backend serving JSON from the AWS Cost Explorer API, for use
  by Grafana dashboard panels via the Infinity datasource plugin to present AWS
  cloud costs.
keywords: [aws, cost explorer, grafana, infinity]
home: https://github.com/2i2c-org/aws-ce-grafana-backend
sources: [https://github.com/2i2c-org/aws-ce-grafana-backend]
# icon:
kubeVersion: ">=1.28.0-0"
maintainers:
  - name: Erik Sundell
    email: [email protected]
4 changes: 4 additions & 0 deletions helm-charts/aws-ce-grafana-backend/ce-test-config.yaml
@@ -0,0 +1,4 @@
fullnameOverride: ce-test
serviceAccount:
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::783616723547:role/aws_ce_grafana_backend_iam_role
69 changes: 69 additions & 0 deletions helm-charts/aws-ce-grafana-backend/mounted-files/README.md
@@ -0,0 +1,69 @@
# About code files

The code is meant to serve Grafana with JSON containing cost-related data,
initially only from AWS.

## De-coupled from other k8s services

This software doesn't rely on other k8s services, so it can be deployed and
tested by itself.

## Bundling into Dockerfile vs. mounting in Helm chart

By mounting the code files instead of bundling them into the image,
development iterations that run the code in k8s become faster.

## Development

### Testing Python changes locally

First authenticate yourself against the AWS openscapes account.

```bash
cd helm-charts/aws-ce-grafana-backend/mounted-files
python -m flask --app=webserver run --port=8080

# visit http://localhost:8080/aws
```

### Testing Python changes in k8s

This is currently being developed in the openscapes cluster. It depends on a k8s
ServiceAccount coupled to an IAM Role there as well.

The image shouldn't need to be rebuilt unless additional dependencies need to
be installed, so if you've only made code changes, you can do the following
to re-deploy.

```bash
deployer use-cluster-credentials openscapes

cd helm-charts/aws-ce-grafana-backend
helm upgrade --install --create-namespace -n ce-test --values ce-test-config.yaml ce-test .

# note that port-forward to a service is just a way to port-forward to a pod
# behind the service, so you need to do the port-forwarding again if the pod
# restarts.
kubectl port-forward -n ce-test service/ce-test 8080:http

# visit http://localhost:8080/aws
```
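
Instead of visiting the URL in a browser, the check can be scripted. The
following is a minimal sketch using only the Python standard library, assuming
the backend is reachable on `http://localhost:8080` as in the commands above.
The helper names `is_ready` and `fetch_costs` are hypothetical and not part of
this chart; the `/health/ready` and `/aws` paths are the routes defined in
`webserver.py`.

```python
import json
import urllib.request

# Bypass any configured HTTP proxy, since this only talks to localhost.
_opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def is_ready(base_url, timeout=5):
    """Return True if GET /health/ready answers 204, as webserver.py does."""
    try:
        with _opener.open(f"{base_url}/health/ready", timeout=timeout) as response:
            return response.status == 204
    except OSError:
        # Covers connection errors, HTTP error statuses, and timeouts.
        return False


def fetch_costs(base_url, timeout=30):
    """GET /aws and decode the JSON body returned by the backend."""
    with _opener.open(f"{base_url}/aws", timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Example usage (hypothetical), after starting flask or the port-forward:
#
#   if is_ready("http://localhost:8080"):
#       print(json.dumps(fetch_costs("http://localhost:8080"), indent=2))
```

If the pod restarts, `is_ready` will return `False` until the port-forward is
re-established.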

### Testing image changes in k8s

```bash

cd helm-charts

# before doing this: commit the image change, and stash other changes
# git status should not report anything
chartpress --push

# commit the updated image tag
git add aws-ce-grafana-backend/values.yaml
git commit -m "aws-ce-grafana-backend chart: update image to deploy"

# WARNING: cleanup of uncommitted files, should be ok if your git status was
# clean before running chartpress --push
git reset --hard HEAD
```
Empty file.
164 changes: 164 additions & 0 deletions helm-charts/aws-ce-grafana-backend/mounted-files/aws.py
@@ -0,0 +1,164 @@
import boto3

# AWS client functions most likely:
#
# - get_cost_and_usage
# - get_cost_categories
# - get_tags
# - list_cost_allocation_tags
#


def query_aws_cost_explorer():
    aws_ce_client = boto3.client("ce")

    # ref: https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/ce/client/get_cost_and_usage.html#get-cost-and-usage
    response = aws_ce_client.get_cost_and_usage(
        Metrics=["UnblendedCost"],
        Granularity="DAILY",
        TimePeriod={
            "Start": "2024-07-01",
            "End": "2024-08-01",
        },
        Filter={
            "Dimensions": {
                # RECORD_TYPE is also called Charge type. By filtering on this
                # we avoid results related to credits, tax, etc.
                "Key": "RECORD_TYPE",
                "Values": ["Usage"],
            },
        },
        GroupBy=[
            {
                "Type": "DIMENSION",
                "Key": "SERVICE",
            },
        ],
    )
    return response["ResultsByTime"]


# Granularity:
#
# - HOURLY, DAILY, or MONTHLY
#
# - Hourly resolution is only available for the last two days, so we'll use a
# daily resolution which is available for the last 13 months.
#
#
#
# Metrics:
#
# - Valid choices:
#   - AmortizedCost
#   - BlendedCost
#   - NetAmortizedCost
#   - NetUnblendedCost
#   - NormalizedUsageAmount
#   - UnblendedCost
#   - UsageQuantity
#
# - UnblendedCost is the default metric presented in the web console. It
#   represents costs for an individual AWS account. When combining costs in an
#   organization, 1 + 1 <= 2, because the accounts' cumulative use can reduce
#   rates.
#
# - We'll focus on UnblendedCost though, because it keeps the service cost
#   decoupled from other cloud accounts' usage.
#
# Filter:
#
# - RECORD_TYPE is what's named Charge type in the web console. Looking at
#   "Usage" only helps us avoid responses related to credits, tax, etc.
#
# - Dimensions:
#   - AZ
#   - INSTANCE_TYPE
#   - LINKED_ACCOUNT
#   - LINKED_ACCOUNT_NAME
#   - OPERATION
#   - PURCHASE_TYPE
#   - REGION
#   - SERVICE
#   - SERVICE_CODE
#   - USAGE_TYPE
#   - USAGE_TYPE_GROUP
#   - RECORD_TYPE
#   - OPERATING_SYSTEM
#   - TENANCY
#   - SCOPE
#   - PLATFORM
#   - SUBSCRIPTION_ID
#   - LEGAL_ENTITY_NAME
#   - DEPLOYMENT_OPTION
#   - DATABASE_ENGINE
#   - CACHE_ENGINE
#   - INSTANCE_TYPE_FAMILY
#   - BILLING_ENTITY
#   - RESERVATION_ID
#   - RESOURCE_ID (available only for the last 14 days of usage)
#   - RIGHTSIZING_TYPE
#   - SAVINGS_PLANS_TYPE
#   - SAVINGS_PLAN_ARN
#   - PAYMENT_OPTION
#   - AGREEMENT_END_DATE_TIME_AFTER
#   - AGREEMENT_END_DATE_TIME_BEFORE
#   - INVOICING_ENTITY
#   - ANOMALY_TOTAL_IMPACT_ABSOLUTE
#   - ANOMALY_TOTAL_IMPACT_PERCENTAGE
# - Tags:
#   - Refers to Cost Allocation Tags.
# - CostCategories:
#   - Can include Cost Allocation Tags, but also references various services
#     etc.
#
# GroupBy:
#
# - Can be an array with up to two elements, each with a Type being either:
#   - DIMENSION
#   - TAG
#   - COST_CATEGORY
#
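
The notes above on Granularity, Metrics, Filter and GroupBy can be gathered
into a small helper that replaces the hard-coded dates in
`query_aws_cost_explorer`. This is a sketch only: `build_cost_query` and its
`end` and `days` parameters are hypothetical names, not part of this commit.
It relies on the Cost Explorer convention that the `End` date of `TimePeriod`
is exclusive.

```python
from datetime import date, timedelta


def build_cost_query(end=None, days=30):
    """Build keyword arguments for get_cost_and_usage covering `days` days.

    `end` is exclusive, matching the Cost Explorer TimePeriod semantics, and
    defaults to today.
    """
    end = end or date.today()
    start = end - timedelta(days=days)
    return {
        # Same choices as in query_aws_cost_explorer above.
        "Metrics": ["UnblendedCost"],
        "Granularity": "DAILY",
        "TimePeriod": {"Start": start.isoformat(), "End": end.isoformat()},
        # Only "Usage" charges, to avoid credits, tax, etc.
        "Filter": {"Dimensions": {"Key": "RECORD_TYPE", "Values": ["Usage"]}},
        # At most two GroupBy entries are allowed.
        "GroupBy": [{"Type": "DIMENSION", "Key": "SERVICE"}],
    }
```

The result could then be passed on as
`aws_ce_client.get_cost_and_usage(**build_cost_query())`.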

# Description of Grafana panels wanted by Yuvi:
# ref: https://github.com/2i2c-org/infrastructure/issues/4453#issuecomment-2298076415
#
# Currently our AWS tag 2i2c:hub-name is only capturing a fraction of the costs,
# so initially only the following panels are easy to work on.
#
# - total cost (4)
# - total cost per component (2)
#
# The following panels are dependent on the 2i2c:hub-name tag though.
#
# - total cost per hub (1)
# - total cost per component, repeated per hub (3)
#
# Summarized notes about user facing labels:
#
# - fixed:
#   - core nodepool
#   - any PV needed for support chart or hub databases
#   - Kubernetes master API
#   - load balancer services
# - compute:
#   - disks
#   - networking
#   - gpus
# - home storage:
#   - backups
# - object storage:
#   - tagged buckets
#   - not counting requester pays
# - total:
#   - all 2i2c managed infra
#
# Working against cost tags directly or cost categories
#
# Cost categories vs Cost allocation tags
#
# - It seems cost categories could be suitable for grouping miscellaneous data
#   under categories, and for splitting out things like the core node pool.
# - I think it's worth exploring whether we could offload all complexity about
#   user facing labels etc. by using cost categories to group and label costs.
#
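
For the "total cost" and "total cost per component" panels mentioned above, the
`ResultsByTime` list returned by `query_aws_cost_explorer` would likely need to
be flattened before Grafana can chart it. The following is a rough sketch under
that assumption: the function names and the row fields (`date`, `service`,
`cost`) are hypothetical, and grouping by AWS service is only a stand-in for
the user facing components, which this commit doesn't define yet.

```python
from collections import defaultdict


def flatten_results(results_by_time, metric="UnblendedCost"):
    """Turn ResultsByTime into one flat row per (day, service) pair."""
    rows = []
    for result in results_by_time:
        day = result["TimePeriod"]["Start"]
        for group in result.get("Groups", []):
            rows.append(
                {
                    "date": day,
                    # With a single GroupBy entry, Keys has one element.
                    "service": group["Keys"][0],
                    # Amounts are returned as strings by the API.
                    "cost": float(group["Metrics"][metric]["Amount"]),
                }
            )
    return rows


def total_per_service(rows):
    """Sum the cost of all rows per service over the whole period."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["service"]] += row["cost"]
    return dict(totals)
```

Summing the values of `total_per_service(rows)` gives the overall total for
the queried period.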
20 changes: 20 additions & 0 deletions helm-charts/aws-ce-grafana-backend/mounted-files/webserver.py
@@ -0,0 +1,20 @@
from flask import Flask

from .aws import query_aws_cost_explorer

app = Flask(__name__)


@app.route("/")
def hello_world():
    return "<p>Hello, World!</p>"


@app.route("/health/ready")
def ready():
    return ("", 204)


@app.route("/aws")
def aws():
    return query_aws_cost_explorer()