From 9a92345bb8e0052aca4f4f64d2ee1ad7879afd4e Mon Sep 17 00:00:00 2001 From: Manfred Cheung Date: Thu, 1 Aug 2024 19:34:48 -0400 Subject: [PATCH 1/5] add documentation for python endpoint env var --- docs/configure.md | 9 +++++++++ 1 file changed, 9 insertions(+) diff --git a/docs/configure.md b/docs/configure.md index e966857..5c97ddf 100644 --- a/docs/configure.md +++ b/docs/configure.md @@ -38,6 +38,15 @@ Recommendations for SSO when self-hosting: * Disallow non-SSO account creation * Decide whether SSO users can automatically join organizations without an invitation +## Features + +Certain Graphistry features can be enabled or disabled through environment variables to ensure that these features cannot be accessed without explicit permission from an administrator regardless of in-app permissions configuration. In all cases, access can also be more finely controlled using waffle flags set within Graphistry. All feature configuration is done through the file `data/config/custom.env`. Each feature listed below is assigned a corresponding environment variable name. Environment variables can be set to either `true` or `false` to enable or disable the associated feature. An unset environment variable associated with a given feature is treated as `false`. + +### Python Endpoint + +The Python endpoint allows any user granted access a way to retrieve datasets stored within Graphistry and process it using unrestricted arbitrary user provided Python code. This Python code can include a limited set of external libraries such as `numpy` and `cudf`, in addition to `graphistry`, and can access all computational resources available to the forge-etl-python server including GPU compute. The result is returned to the user as a string or as JSON. + +It is recommended that an administrator checks to ensure that only trusted users are provided access via the `flag_python_endpoint` waffle flag before the `ENABLE_PYTHON_ENDPOINT` environment variable is set to `true`. 
## TLS From 5a981508d0ba7c16b64a02eb4a09bcb541f3d50a Mon Sep 17 00:00:00 2001 From: lmeyerov Date: Sun, 27 Oct 2024 17:10:08 -0700 Subject: [PATCH 2/5] docs(python): update --- docs/configure-python.md | 28 ++++++++++++++++++++++++++++ docs/configure.md | 16 ++++------------ 2 files changed, 32 insertions(+), 12 deletions(-) create mode 100644 docs/configure-python.md diff --git a/docs/configure-python.md b/docs/configure-python.md new file mode 100644 index 0000000..13b3273 --- /dev/null +++ b/docs/configure-python.md @@ -0,0 +1,28 @@ +# Configure the Python Endpoint + +The Python endpoint gives any user granted access a way to retrieve datasets stored within Graphistry and process them using unrestricted, arbitrary user-provided Python code. This Python code can include a limited set of external libraries such as `numpy` and `cudf`, in addition to `graphistry`, and can access all computational resources available to the forge-etl-python server, including GPU compute. The result is returned to the user as a string or as JSON. + +## Safe defaults + +* Graphistry Hub: The Python endpoint is not available to Graphistry Hub users at this time + +* Graphistry Enterprise: The Python endpoint must be explicitly turned on for regular Graphistry Enterprise users + +The more restricted GFQL endpoint is default-on for both Graphistry Hub and Graphistry Enterprise + +## Toggling + +The endpoint must be enabled system-wide, and individual users must also be explicitly allowed: + +1. Enable access for individual users via the Graphistry admin panel's feature flag area + +1. The flag must also be enabled at the system level via the `ENABLE_PYTHON_ENDPOINT` environment variable in `data/config/custom.env` + +We recommend checking individual user access before enabling the endpoint. 
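As a minimal sketch of the system-level step in the Toggling list, the entry in `data/config/custom.env` might look like the following. The variable name and the `true`/`false` values come from this patch series; the `KEY=value` form is an assumption based on the usual Docker env-file convention.

```shell
# data/config/custom.env
# System-level switch for the Python endpoint.
# Assumption: standard KEY=value env-file syntax; set to false (or leave unset) to keep it off.
ENABLE_PYTHON_ENDPOINT=true
```

After editing the file, restart the affected services, for example with `docker compose stop` followed by `docker compose up -d`. Per-user access is still governed separately by the feature flag in the admin panel.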
+ +## Further reading + +See also: + +* The [Graphistry REST API for the Python endpoint](https://hub.graphistry.com/docs/Python/python-api/) +* The [Graphistry REST API for the GFQL endpoint](https://hub.graphistry.com/docs/GFQL/gfql-api/) diff --git a/docs/configure.md b/docs/configure.md index 5c97ddf..eccfc57 100644 --- a/docs/configure.md +++ b/docs/configure.md @@ -15,10 +15,12 @@ See [user creation docs](user-creation.md) ## Top configuration places: data/config/custom.env, data/pivot-db/config/config.json -* Graphistry is primarily configured through file `data/config/custom.env` +* Graphistry is primarily configured by setting values in `data/config/custom.env` * Connector, ontology, and pivot configuration is optionally via `data/pivot-db/config/config.json`. Many relevant options are [detailed in a reference page](configure-investigation.md). -Between edits, restart one or all Graphistry services: `docker-compose stop` and `docker-compose up -d` +Between edits, restart one or all Graphistry services: `docker compose stop` and `docker compose up -d`. + +We typically recommend doing targeted and localized restarts via `docker compose stop service1 service2 ...` and `docker compose up -d --force-recreate --no-deps service1 service2 ...`. Contact staff for guidance. ## Further configuration: docker-compose.yml and Caddyfile @@ -38,16 +40,6 @@ Recommendations for SSO when self-hosting: * Disallow non-SSO account creation * Decide whether SSO users can automatically join organizations without an invitation -## Features - -Certain Graphistry features can be enabled or disabled through environment variables to ensure that these features cannot be accessed without explicit permission from an administrator regardless of in-app permissions configuration. In all cases, access can also be more finely controlled using waffle flags set within Graphistry. All feature configuration is done through the file `data/config/custom.env`. 
Each feature listed below is assigned a corresponding environment variable name. Environment variables can be set to either `true` or `false` to enable or disable the associated feature. An unset environment variable associated with a given feature is treated as `false`. - -### Python Endpoint - -The Python endpoint allows any user granted access a way to retrieve datasets stored within Graphistry and process it using unrestricted arbitrary user provided Python code. This Python code can include a limited set of external libraries such as `numpy` and `cudf`, in addition to `graphistry`, and can access all computational resources available to the forge-etl-python server including GPU compute. The result is returned to the user as a string or as JSON. - -It is recommended that an administrator checks to ensure that only trusted users are provided access via the `flag_python_endpoint` waffle flag before the `ENABLE_PYTHON_ENDPOINT` environment variable is set to `true`. - ## TLS We encourage everyone to use HTTPS over HTTP, especially through the automatic TLS option, for [securing authentication](authentication.md) From 751e9ae944e7da7adda1e2d27c8cb6c4f54f1c6f Mon Sep 17 00:00:00 2001 From: Leo Meyerovich Date: Sun, 27 Oct 2024 17:48:40 -0700 Subject: [PATCH 3/5] docs(config): update --- docs/{ => app-config}/configure-python.md | 0 docs/app-config/configure.md | 45 ++++++++++++++++------- docs/app-config/index.rst | 3 +- 3 files changed, 33 insertions(+), 15 deletions(-) rename docs/{ => app-config}/configure-python.md (100%) diff --git a/docs/configure-python.md b/docs/app-config/configure-python.md similarity index 100% rename from docs/configure-python.md rename to docs/app-config/configure-python.md diff --git a/docs/app-config/configure.md b/docs/app-config/configure.md index 89b4a52..0ae0b6c 100644 --- a/docs/app-config/configure.md +++ b/docs/app-config/configure.md @@ -13,7 +13,9 @@ Administrators can add users, specify passwords, TLS/SSL, persist data 
across se See [user creation docs](../tools/user-creation.md) -## Top configuration places: data/config/custom.env, data/pivot-db/config/config.json +## Configuration places + +### Primary: data/config/custom.env * Graphistry is primarily configured by setting values in `data/config/custom.env` * Connector, ontology, and pivot configuration is optionally via `data/pivot-db/config/config.json`. Many relevant options are [detailed in a reference page](configure-investigation.md). @@ -23,10 +25,11 @@ Between edits, restart one or all Graphistry services: `docker compose stop` an We typically recommend doing targeted and localized restarts via `docker compose stop service1 service2 ...` and `docker compose up -d --force-recreate --no-deps service1 service2 ...`. Contact staff for guidance. -## Further configuration: docker-compose.yml and Caddyfile +### Secondary: docker-compose.yml, Caddyfile, `pivot-db/` * More advanced administrators may edit `docker-compose.yml` . Maintenance is easier if you never edit it. * Custom TLS is via editing `Caddyfile`([Caddy docs](https://caddyserver.com/docs/automatic-https)), see below +* Visual playbooks may be configured via `data/pivot-db/config/config.json` ## SSO @@ -191,6 +194,14 @@ If problems persist, please reach out to your Graphistry counterparts. 
Additiona See the [email](email.md) section +## Python, PyGraphistry, & GFQL + +You may find it useful to customize specific endpoints: + +* [PyGraphistry](configure-pygraphistry.md) for how notebooks talk to your Graphistry instance +* [Python endpoint](configure-python.md) for how users can run arbitrary Python code against Graphistry datasets and leverage the server GPU +* GFQL Endpoint for how users can run queries against Graphistry datasets using GFQL: No configuration at this time + ## Site domain *Optional* @@ -199,6 +210,12 @@ In the Admin portal, go to Sites and change the `Domain name` to your domain, su This aids scenarios such as when using an outside proxy and ensuring that web users see the intended external domain instead of the internal one leaking through + +## Performance + +See [performance tuning](../debugging/performance-tuning.md) + + ## Reverse proxy ### Built-in proxying @@ -233,15 +250,21 @@ You can configure the Caddy service to also reverse proxy additional services, i For an example of both public and log-required proxies, see the [graph-app-kit sample](https://github.com/graphistry/graph-app-kit/blob/master/src/caddy/full.Caddyfile). -## Dashboards +## Streamlit Dashboards -Separately [configure the public and private dashboards](configure-dashboards.md) +Separately [configure the public and private Streamlit dashboards](configure-dashboards.md) -## Connectors + +## Visual Playbooks + +**Note:** We strongly recommend new users contact the Graphistry team about early access to Louie before starting new usage of the Visual Playbook environment. + + +### Connectors Optionally, you can configure Graphistry to use database connectors. Graphistry will orchestrate cross-database query generation, pushing them down through the database API, and returning the combined results to the user. 
This means Graphistry can reuse your existing scaleout infrastructure and make its data more accessible to your users without requiring a second copy to be maintained. Some connectors further support use of the [Graphistry data bridge](../tools/bridge.md) for proxying requests between a Graphistry cloud server and an intermediate on-prem data bridge instead of directly connecting to on-prem API servers. -### Security Notes +#### Security Notes * Graphistry only needs `read only` access to the database * Only one system-wide connector can be used per database per Graphistry virtual server at this time. Ex: You can have Splunk user 1 + Neo4j user 2 on one running Graphistry container, and Splunk user 3 + Neo4j user 2 on another running Graphistry container. @@ -360,8 +383,7 @@ In scenarios such as a Graphistry cloud server accessing on-prem API servers, an * Run a bastion server between Graphistry and your database, such as a new Splunk search head * Create fine-grained permissions by running multiple Graphistry virtual servers, with a new Splunk role per instance - -## Ontology +### Ontology See [custom ontology extensions](configure-ontology.md) and [settings reference page](configure-investigation.md) for full options. Topics include controlling: @@ -369,11 +391,6 @@ See [custom ontology extensions](configure-ontology.md) and [settings reference * Map Type -> color, icon, size * Map node/edge titles -## Pivots +### Pivots Every connector comes with a base set of pivots. See [custom pivots](configure-custom-pivots.md) for teaching Graphistry new pivots based on existing connectors and pivots. 
- -## Performance - -See [performance tuning](../debugging/performance-tuning.md) - diff --git a/docs/app-config/index.rst b/docs/app-config/index.rst index 4a5a612..5e4dfc7 100644 --- a/docs/app-config/index.rst +++ b/docs/app-config/index.rst @@ -33,5 +33,6 @@ System :maxdepth: 1 :titlesonly: - email + configure-python configure-pygraphistry + email From ff306d49001ce34df3f27c4b9db16e0b88e673cc Mon Sep 17 00:00:00 2001 From: Leo Meyerovich Date: Sun, 27 Oct 2024 17:53:22 -0700 Subject: [PATCH 4/5] fix(telemetry): images --- docs/tools/telemetry.md | 10 ++++++---- 1 file changed, 6 insertions(+), 4 deletions(-) diff --git a/docs/tools/telemetry.md b/docs/tools/telemetry.md index b8709e4..c455963 100644 --- a/docs/tools/telemetry.md +++ b/docs/tools/telemetry.md @@ -187,19 +187,21 @@ To use telemetry with Graphistry, you need to: 1. **Jaeger Dashboard:** Access the [Jaeger Dashboard URL](#jaeger-dashboard). 2. **Key Tracing Information:** - List of traces generated by the system for the graph rendering flow (for instance: show the trace list including a trace with errors). -![List of traces generated by the system for the graph rendering flow](../_static/img/jaeger-trace-list-including-trace-with-error.png) + +List of traces generated by the system for the graph rendering flow + - The root span for the graph rendering flow is `streamgl-viz: handleFalcorSocketConnections`. - The service that generates the root span for the graph rendering flow is `streamgl-viz`. - ETL dataset fetch spans from the Python ETL Service service. - Detailed spans for actions by the visualization service and GPU workers (for instance: inspecting trace with error). -![Detailed spans for actions by the visualization service and GPU workers](../_static/img/jaeger-inspecting-trace-with-error.png) +Detailed spans for actions by the visualization service and GPU workers ### Accessing Metrics 1. **Prometheus Dashboard:** Access the [Prometheus Dashboard URL](#prometheus-dashboard). 2. 
**Critical Metrics to Monitor:** - `worker_read_crashes_total`: Monitor GPU worker crashes. - File upload and dataset creation metrics in the Python ETL service (all metrics with the name `forge_etl_python_upload_*`, for instance: `forge_etl_python_upload_datasets_request_total`). -![File upload and dataset creation metrics in the Python ETL service](../_static/img/prometheus-forge-etl-python-metric-example.png) +File upload and dataset creation metrics in the Python ETL service ### GPU Monitoring with Grafana and NVIDIA Data Center GPU Manager @@ -207,7 +209,7 @@ To provide comprehensive monitoring of GPU performance, we utilize Grafana in co - **NVIDIA Data Center GPU Manager (DCGM):** [DCGM](https://developer.nvidia.com/dcgm) is a suite of tools for managing and monitoring NVIDIA GPUs in data centers. It provides detailed metrics on GPU performance, health, and utilization. - **Grafana:** Grafana is an open-source platform for monitoring and observability. It allows users to query, visualize, alert on, and explore metrics from a variety of data sources, including Prometheus. By default the Grafana instance has the metrics and GPU dashboard from the `DCGM exporter` service (see `DCGM Exporter Dashboards` in the Grafana main page). 
-![grafana-import-dcgm-dashboard-6](../_static/img/grafana-import-dcgm-dashboard-6.png) +grafana-import-dcgm-dashboard-6 ## Advanced configuration From 60c85438ca4a01063ab0f880da60caeb7ee62b21 Mon Sep 17 00:00:00 2001 From: Leo Meyerovich Date: Sun, 27 Oct 2024 17:54:03 -0700 Subject: [PATCH 5/5] docs(changelog) --- CHANGELOG.md | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/CHANGELOG.md b/CHANGELOG.md index 5184fac..27ca0ab 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -10,6 +10,7 @@ This project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.htm ### Added * ReadTheDocs site: [https://graphistry-admin-docs.readthedocs.io/](https://graphistry-admin-docs.readthedocs.io/) +* Python endpoint ### Changed @@ -19,3 +20,7 @@ This project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.htm * Sphinx port * CI + +### Fixed + +* Telemetry images