
EOSC Accounting #17

Merged
merged 32 commits into from
Aug 21, 2024
Changes from 27 commits

Commits (32 commits)
f3e25f3
Remove shebang
enolfc Jul 18, 2024
1b9cf89
Bring this to more modern packaging
enolfc Jul 18, 2024
20e52f8
Make version discoverable from git
enolfc Jul 18, 2024
4d328d8
Remove version
enolfc Jul 18, 2024
9b41733
Move deps outside
enolfc Jul 19, 2024
d97ee36
Better build
enolfc Jul 19, 2024
bb5eb63
Add flavor to the model
enolfc Jul 19, 2024
0092b11
Remove shebang
enolfc Jul 19, 2024
d4d935e
No executables needed here
enolfc Jul 19, 2024
fd98563
Minor linting fixes
enolfc Jul 22, 2024
ee537c2
Removed value (apparently not used?)
enolfc Jul 22, 2024
9e3141f
Fix the model
enolfc Jul 22, 2024
e0ef8fd
First attempt to get EOSC accounting
enolfc Jul 22, 2024
5cec580
Fix EOSC accounting config
enolfc Jul 22, 2024
21094ab
Introduce dry-run for testing
enolfc Jul 22, 2024
b4699f4
Remove shebang
enolfc Jul 22, 2024
184931a
Reorganise configuration
enolfc Jul 22, 2024
288bbe0
Adapt helm to new code
enolfc Jul 22, 2024
51dd95c
Allow for setting from/to dates
enolfc Jul 22, 2024
cba1332
Add shared volume to the eosc job
enolfc Jul 22, 2024
96c9c85
Self is not a thing here
enolfc Jul 22, 2024
c675c56
Add debug
enolfc Jul 22, 2024
1fae951
Linting fixes
enolfc Jul 22, 2024
295cec5
Linting
enolfc Jul 22, 2024
609809a
even more linting
enolfc Jul 22, 2024
5bdbe9b
Fix this again
enolfc Jul 22, 2024
e96cbc0
Add timestamp file for reporting
enolfc Jul 23, 2024
0efb1ff
Add timestamp to the helm chart
enolfc Jul 23, 2024
3f1e9af
add missing method parameters
andrea-manzi Jul 25, 2024
dcdc7a4
running black
andrea-manzi Jul 25, 2024
27c5614
fix linter error
andrea-manzi Jul 25, 2024
482039b
Report hours, not seconds
enolfc Jul 29, 2024
10 changes: 8 additions & 2 deletions Dockerfile
Original file line number Diff line number Diff line change
@@ -1,5 +1,11 @@
FROM python:3

COPY . /egi-notebooks-accounting
WORKDIR /egi-notebooks-accounting

RUN pip install --no-cache-dir -e /egi-notebooks-accounting/
COPY requirements.txt .

RUN pip install --no-cache-dir -r requirements.txt

COPY . .

RUN pip install --no-cache-dir .
47 changes: 29 additions & 18 deletions egi_notebooks_accounting/config.ini
@@ -1,34 +1,45 @@
[prometheus]
[default]
# be verbose
# verbose=0
# site=EGI-NOTEBOOKS
# cloud_type=EGI Notebooks
# cloud_compute_service=
# default_vo=vo.notebooks.egi.eu
# fqan_key=primary_group

# url=http://localhost:8080/
# user=
# password=
# verbose=0
# verify=0
# outputs
# apel_spool=
# notebooks_db=

# filter=pod=~'jupyter-.*'
# range=4h

[VO]
#
# VO section example:
#
# [VO]
# VO1=group1,group2,...
# VO2=group3,group4,...

[prometheus]
# prometheus server URL
# url=http://localhost:8080/
# Server authentication
# user=
# password=
# do verify server
# verify=0
# filters for querying
# filter=pod=~'jupyter-.*'
# range=4h


[eosc]
# token_url=https://aai-demo.eosc-portal.eu/auth/realms/core/protocol/openid-connect/token
# argo_url=https://accounting.devel.argo.grnet.gr
# refresh_token=
# client_secret=
# AAI credentials (client_grant expected)
# token_url=
# client_id=
# client_secret=
# scopes=
# URL of the accounting service
# accounting_url=
# Installation that metrics are reported for
# installation_id=
# users_metric=
# sessions_metric=

[eosc.flavors]
# add every flavor to be reported as follows
# flavor_name=metric_id
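The `[eosc.flavors]` section above maps flavor names to metric definition ids. A minimal sketch of reading that mapping with Python's `configparser`, as `eosc.py` does (the inline sample string is hypothetical, mirroring the example in the config comments):

```python
from configparser import ConfigParser

# Hypothetical inline config mirroring the [eosc.flavors] example above
SAMPLE = """
[eosc.flavors]
small-environment-2-vcpu-4-gb-ram = 668bdd5988e1d617b217ecb9
"""

parser = ConfigParser()
parser.read_string(SAMPLE)

# flavor name -> metric definition id, as the accounting code expects
flavor_config = dict(parser["eosc.flavors"]) if "eosc.flavors" in parser else {}
print(flavor_config)
```

Note that `configparser` lowercases keys by default, which is harmless here since flavor names are already lowercase.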
256 changes: 187 additions & 69 deletions egi_notebooks_accounting/eosc.py
100755 → 100644
@@ -1,127 +1,245 @@
#! /usr/bin/python3
"""EOSC EU Node Accounting implementation

EOSC EU Node expects aggregated accounting information for the number of hours
a given flavor of jupyter server has been running over the last day, following
this definition:

{
"metric_name": "small-environment-2-vcpu-4-gb-ram",
"metric_description": "total runtime per day (in hours)",
"metric_type": "aggregated",
"unit_type": "Hours/day"
}

The report is done by sending records with a POST API call to:
/accounting-system/installations/{installation_id}/metrics

with a JSON like:
{
"metric_definition_id": "<metric id (depends on the flavor)>",
"time_period_start": "2023-01-05T09:13:07Z",
"time_period_end": "2024-01-05T09:13:07Z",
"value": 10.2,
"group_id": "group id", # personal or group project
"user_id": "user id" # user aai
}

This code reads the accounting db, aggregates the information for the last 24 hours,
and pushes it to the EOSC Accounting service.

Configuration:
[default]
notebooks_db=<notebooks db file>

[eosc]
token_url=https://proxy.staging.eosc-federation.eu/OIDC/token
client_secret=<client secret>
client_id=<client_id>
accounting_url=https://api.acc.staging.eosc.grnet.gr
installation_id=<id of the installation to report accounting for>
timestamp_file=<file where the timestamp of the last run is kept>

[eosc.flavors]
# contains a list of flavors and metrics they are mapped to
<name of the flavor>=<metric id>
# example:
small-environment-2-vcpu-4-gb-ram=668bdd5988e1d617b217ecb9
"""

import argparse
import json
import logging
import os
import time
from configparser import ConfigParser
from datetime import datetime
from datetime import date, datetime, timedelta

import dateutil.parser
import pytz
import requests
from requests.auth import HTTPBasicAuth

from .prometheus import Prometheus
from .model import VM, db_init

PROM_CONFIG = "prometheus"
CONFIG = "default"
EOSC_CONFIG = "eosc"
FLAVOR_CONFIG = "eosc.flavors"
DEFAULT_CONFIG_FILE = "config.ini"
DEFAULT_FILTER = ""
DEFAULT_RANGE = "4h"
DEFAULT_TOKEN_URL = (
"https://aai-demo.eosc-portal.eu/auth/realms/core/protocol/openid-connect/token"
)
DEFAULT_ARGO_URL = "https://accounting.devel.argo.grnet.gr"
DEFAULT_TOKEN_URL = "https://proxy.staging.eosc-federation.eu/OIDC/token"
DEFAULT_ACCOUNTING_URL = "https://api.acc.staging.eosc.grnet.gr"
DEFAULT_TIMESTAMP_FILE = "eosc-accounting.timestamp"


def get_access_token(refresh_url, client_id, client_secret, refresh_token):
def get_access_token(token_url, client_id, client_secret):
response = requests.post(
refresh_url,
token_url,
auth=HTTPBasicAuth(client_id, client_secret),
data={
"grant_type": "refresh_token",
"refresh_token": refresh_token,
"scope": "openid email profile voperson_id eduperson_entitlement",
"grant_type": "client_credentials",
"scope": "openid email profile voperson_id entitlements",
"client_id": client_id,
"client_secret": client_secret,
},
)
return response.json()["access_token"]


def push_metric(argo_url, token, installation, metric, date_from, date_to, value):
data = {
"metric_definition_id": metric,
"time_period_start": date_from.strftime("%Y-%m-%dT%H:%M:%SZ"),
"time_period_end": date_to.strftime("%Y-%m-%dT%H:%M:%SZ"),
"value": value,
}
def push_metric(accounting_url, token, installation, metric_data):
logging.debug(f"Pushing to accounting - {installation}")
response = requests.post(
f"{argo_url}/accounting-system/installations/{installation}/metrics",
f"{accounting_url}/accounting-system/installations/{installation}/metrics",
headers={"Authorization": f"Bearer {token}"},
data=json.dumps(data),
data=json.dumps(metric_data),
)
response.raise_for_status()


def get_max_value(prom_response):
v = 0
for item in prom_response["data"]["result"]:
# just take max
v += max(int(r[1]) for r in item["values"])
return v
def update_pod_metric(pod, metrics, flavor_config):
if not pod.flavor or pod.flavor not in flavor_config:
# cannot report
logging.debug(f"Flavor {pod.flavor} does not have a configured metric")
return
user, group = (pod.global_user_name, pod.fqan)
user_metrics = metrics.get((user, group), {})
flavor_metric = flavor_config[pod.flavor]
flavor_metric_value = user_metrics.get(flavor_metric, 0)
user_metrics[flavor_metric] = flavor_metric_value + pod.wall
metrics[(user, group)] = user_metrics


def get_from_to_dates(args, timestamp_file):
from_date = None
if args.from_date:
from_date = dateutil.parser.parse(args.from_date)
else:
try:
with open(timestamp_file, "r") as tsf:
try:
from_date = dateutil.parser.parse(tsf.read())
from_date += timedelta(minutes=1)
except dateutil.parser.ParserError as e:
logging.debug(
f"Invalid timestamp content in '{timestamp_file}': {e}"
)
except OSError as e:
logging.debug(f"Not able to open timestamp file '{timestamp_file}': {e}")
# no date specified report from yesterday
if not from_date:
report_day = date.today() - timedelta(days=1)
from_date = datetime(
report_day.year, report_day.month, report_day.day, 0, 0
)
if args.to_date:
to_date = dateutil.parser.parse(args.to_date)
else:
# go until last minute of yesterday
report_day = date.today() - timedelta(days=1)
to_date = datetime(report_day.year, report_day.month, report_day.day, 23, 59)
utc = pytz.UTC
from_date = from_date.replace(tzinfo=utc)
to_date = to_date.replace(tzinfo=utc)
return from_date, to_date


def generate_day_metrics(
period_start, period_end, accounting_url, token, flavor_config, dry_run
):
logging.info(f"Generate metrics from {period_start} to {period_end}")
metrics = {}
# pods ending in between the reporting times
for pod in VM.select().where(
(VM.end_time >= period_start) & (VM.end_time <= period_end)
):
update_pod_metric(pod, metrics, flavor_config)
# pods starting but not finished between the reporting times
for pod in VM.select().where(
(VM.start_time >= period_start) & (VM.end_time.is_null())
):
update_pod_metric(pod, metrics, flavor_config)
period_start_str = period_start.strftime("%Y-%m-%dT%H:%M:%SZ")
period_end_str = period_end.strftime("%Y-%m-%dT%H:%M:%SZ")
for (user, group), flavors in metrics.items():
for metric_key, value in flavors.items():
metric_data = {
"metric_definition_id": metric_key,
"time_period_start": period_start_str,
"time_period_end": period_end_str,
"user": user,
"group": group,
"value": value,
}
logging.debug(f"Sending metric {metric_data} to accounting")
if dry_run:
logging.debug("Dry run, not sending")
else:
push_metric(accounting_url, token, installation, metric_data)
if not dry_run:
try:
with open(timestamp_file, "w+") as tsf:
tsf.write(period_end.strftime("%Y-%m-%dT%H:%M:%SZ"))
except OSError as e:
logging.debug(f"Failed to write timestamp file '{timestamp_file}': {e}")


def main():
parser = argparse.ArgumentParser(
description="Kubernetes Prometheus metrics harvester"
)
parser = argparse.ArgumentParser(description="EOSC Accounting metric pusher")
parser.add_argument(
"-c", "--config", help="config file", default=DEFAULT_CONFIG_FILE
)
parser.add_argument(
"--dry-run", help="Do not actually send data, just report", action="store_true"
)
parser.add_argument("--from-date", help="Start date to report from")
parser.add_argument("--to-date", help="End date to report to")
args = parser.parse_args()

parser = ConfigParser()
parser.read(args.config)
prom_config = parser[PROM_CONFIG] if PROM_CONFIG in parser else {}
config = parser[CONFIG] if CONFIG in parser else {}
eosc_config = parser[EOSC_CONFIG] if EOSC_CONFIG in parser else {}
flavor_config = parser[FLAVOR_CONFIG] if FLAVOR_CONFIG in parser else {}
db_file = os.environ.get("NOTEBOOKS_DB", config.get("notebooks_db", None))
db_init(db_file)

verbose = os.environ.get("VERBOSE", prom_config.get("verbose", 0))
verbose = os.environ.get("VERBOSE", config.get("verbose", 0))
verbose = logging.DEBUG if verbose == "1" else logging.INFO
logging.basicConfig(level=verbose)
flt = os.environ.get("FILTER", prom_config.get("filter", DEFAULT_FILTER))
rng = os.environ.get("RANGE", prom_config.get("range", DEFAULT_RANGE))

# EOSC accounting config in a separate section
# EOSC accounting config
# AAI
token_url = os.environ.get(
"TOKEN_URL", eosc_config.get("token_url", DEFAULT_TOKEN_URL)
)
refresh_token = os.environ.get(
"REFRESH_TOKEN", eosc_config.get("refresh_token", "")
)
client_id = os.environ.get("CLIENT_ID", eosc_config.get("client_id", ""))
client_secret = os.environ.get(
"CLIENT_SECRET", eosc_config.get("client_secret", "")
)

# ARGO
argo_url = os.environ.get("ARGO_URL", eosc_config.get("argo_url", DEFAULT_ARGO_URL))
if args.dry_run:
logging.debug("Not getting credentials, dry-run")
token = None
else:
token = get_access_token(token_url, client_id, client_secret)

accounting_url = os.environ.get(
"ACCOUNTING_URL", eosc_config.get("accounting_url", DEFAULT_ACCOUNTING_URL)
)
installation = eosc_config.get("installation_id", "")
users_metric = eosc_config.get("users_metric", "")
sessions_metric = eosc_config.get("sessions_metric", "")

prom = Prometheus(parser)
tnow = time.time()
data = {
"time": tnow,
}

# ==== number of users ====
data["query"] = "jupyterhub_total_users{" + flt + "}[" + rng + "]"
users = get_max_value(prom.query(data))

# ==== number of sessions ====
data["query"] = "jupyterhub_running_servers{" + flt + "}[" + rng + "]"
sessions = get_max_value(prom.query(data))

# now push values to EOSC accounting
to_date = datetime.utcfromtimestamp(tnow)
from_date = to_date - prom.parse_range(rng)
token = get_access_token(token_url, client_id, client_secret, refresh_token)
push_metric(argo_url, token, installation, users_metric, from_date, to_date, users)
push_metric(
argo_url, token, installation, sessions_metric, from_date, to_date, sessions

timestamp_file = os.environ.get(
"TIMESTAMP_FILE", eosc_config.get("timestamp_file", DEFAULT_TIMESTAMP_FILE)
)

# ==== queries ====
from_date, to_date = get_from_to_dates(args, timestamp_file)
logging.debug(f"Reporting from {from_date} to {to_date}")
# repeat in 24 hour intervals
period_start = from_date
while period_start <= to_date:
period_end = period_start + timedelta(hours=23, minutes=59)
generate_day_metrics(
period_start, period_end, accounting_url, token, flavor_config, args.dry_run
)
period_start = period_end + timedelta(minutes=1)


if __name__ == "__main__":
main()
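The loop in `main()` above walks the reporting window in day-sized chunks: each period spans 23 hours 59 minutes, and consecutive periods are one minute apart. A standalone sketch of that chunking logic (`daily_periods` is a hypothetical helper, not part of the PR):

```python
from datetime import datetime, timedelta


def daily_periods(from_date, to_date):
    """Yield (start, end) pairs covering [from_date, to_date] in day-sized steps.

    Mirrors the loop in main(): each period is 23h59m long, and the next
    period starts one minute after the previous one ends.
    """
    period_start = from_date
    while period_start <= to_date:
        period_end = period_start + timedelta(hours=23, minutes=59)
        yield period_start, period_end
        period_start = period_end + timedelta(minutes=1)


# Two full days of reporting yield two periods
periods = list(daily_periods(datetime(2024, 7, 1, 0, 0), datetime(2024, 7, 2, 23, 59)))
```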