add documentation for python endpoint env var #46

Merged: 6 commits, Oct 28, 2024
5 changes: 5 additions & 0 deletions CHANGELOG.md
@@ -10,6 +10,7 @@ This project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html)
### Added

* ReadTheDocs site: [https://graphistry-admin-docs.readthedocs.io/](https://graphistry-admin-docs.readthedocs.io/)
* Python endpoint

### Changed

@@ -19,3 +20,7 @@ This project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html)

* Sphinx port
* CI

### Fixed

* Telemetry images
28 changes: 28 additions & 0 deletions docs/app-config/configure-python.md
@@ -0,0 +1,28 @@
# Configure the Python Endpoint

The Python endpoint gives any user who has been granted access a way to retrieve datasets stored within Graphistry and process them using arbitrary, user-provided Python code. This code can use a limited set of external libraries such as `numpy` and `cudf`, in addition to `graphistry`, and can access all computational resources available to the forge-etl-python server, including GPU compute. The result is returned to the user as a string or as JSON.
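
For orientation only, a request might look roughly like the sketch below. The route, authentication header, and payload fields are illustrative placeholders rather than the documented contract; the REST API reference linked under Further reading is authoritative.

```bash
# Hypothetical sketch only: the route and field names below are placeholders,
# not the documented API; see the Python endpoint REST API reference for the
# real request shape. The "code" value is the user-supplied Python to execute.
curl -X POST "https://your-graphistry-host/api/v2/python/execute" \
  -H "Authorization: Bearer $GRAPHISTRY_JWT" \
  -H "Content-Type: application/json" \
  -d '{"dataset_id": "abc123", "code": "<your Python code here>"}'
```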

## Safe defaults

* Graphistry Hub: The Python endpoint is not available to Graphistry Hub users at this time

* Graphistry Enterprise: The Python endpoint must be explicitly turned on for regular Graphistry Enterprise users

The more restricted GFQL endpoint is enabled by default on both Graphistry Hub and Graphistry Enterprise.

## Toggling

The endpoint must be enabled both system-wide and for individual users:

1. Enable access for individual users via the feature flag area of the Graphistry admin panel

1. Enable the flag at the system level via the `ENABLE_PYTHON_ENDPOINT` environment variable in `data/config/custom.env`

We recommend reviewing which users have been granted access before enabling the endpoint system-wide.
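
A minimal sketch of the system-level change, assuming the flag takes a boolean-style value (confirm the exact accepted values for your release with Graphistry staff):

```bash
# data/config/custom.env
# Assumption: a truthy value such as "true" enables the endpoint; confirm the
# accepted values for your Graphistry release before relying on this.
ENABLE_PYTHON_ENDPOINT=true
```

After editing, restart services as described in the [general configuration docs](configure.md), for example `docker compose stop` and `docker compose up -d`, and grant the feature flag to the intended users in the admin panel.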

## Further reading

See also:

* The [Graphistry REST API for the Python endpoint](https://hub.graphistry.com/docs/Python/python-api/)
* The [Graphistry REST API for the GFQL endpoint](https://hub.graphistry.com/docs/GFQL/gfql-api/)
52 changes: 35 additions & 17 deletions docs/app-config/configure.md
@@ -13,18 +13,23 @@ Administrators can add users, specify passwords, TLS/SSL, persist data across se
See [user creation docs](../tools/user-creation.md)


## Top configuration places: data/config/custom.env, data/pivot-db/config/config.json
## Configuration places

* Graphistry is primarily configured through file `data/config/custom.env`
### Primary: data/config/custom.env

* Graphistry is primarily configured by setting values in `data/config/custom.env`
* Connector, ontology, and pivot configuration is optionally set via `data/pivot-db/config/config.json`. Many relevant options are [detailed in a reference page](configure-investigation.md).

Between edits, restart one or all Graphistry services: `docker-compose stop` and `docker-compose up -d`
Between edits, restart one or all Graphistry services: `docker compose stop` and `docker compose up -d`.

We typically recommend doing targeted and localized restarts via `docker compose stop service1 service2 ...` and `docker compose up -d --force-recreate --no-deps service1 service2 ...`. Contact staff for guidance.
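
For example, a targeted restart after editing `data/config/custom.env` might look like the following sketch; the service name here is an assumption for illustration, and `docker compose ps` lists the actual services in your deployment.

```bash
# Sketch: targeted restart after a config change.
# "forge-etl-python" is an example service name; substitute the service(s)
# affected by your edit (see `docker compose ps` for the full list).
cd /path/to/graphistry            # the directory containing docker-compose.yml
docker compose stop forge-etl-python
docker compose up -d --force-recreate --no-deps forge-etl-python
```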


## Further configuration: docker-compose.yml and Caddyfile
### Secondary: docker-compose.yml, Caddyfile, `pivot-db/`

* More advanced administrators may edit `docker-compose.yml`. Maintenance is easier if you never edit it.
* Custom TLS is configured by editing the `Caddyfile` ([Caddy docs](https://caddyserver.com/docs/automatic-https)); see below
* Visual playbooks may be configured via `data/pivot-db/config/config.json`

## SSO

Expand All @@ -38,7 +43,6 @@ Recommendations for SSO when self-hosting:
* Disallow non-SSO account creation
* Decide whether SSO users can automatically join organizations without an invitation


## TLS

We encourage everyone to use HTTPS over HTTP, especially through the automatic TLS option, for [securing authentication](../security/authentication.md)
@@ -190,6 +194,14 @@ If problems persist, please reach out to your Graphistry counterparts. Additiona

See the [email](email.md) section

## Python, PyGraphistry, & GFQL

You may find it useful to customize specific endpoints:

* [PyGraphistry](configure-pygraphistry.md) for how notebooks talk to your Graphistry instance
* [Python endpoint](configure-python.md) for how users can run arbitrary Python code against Graphistry datasets and leverage the server GPU
* GFQL endpoint for how users can run GFQL queries against Graphistry datasets: no configuration is needed at this time

## Site domain

*Optional*
@@ -198,6 +210,12 @@ In the Admin portal, go to Sites and change the `Domain name` to your domain, su

This helps in scenarios such as running behind an outside proxy, ensuring that web users see the intended external domain rather than the internal one leaking through.


## Performance

See [performance tuning](../debugging/performance-tuning.md)


## Reverse proxy

### Built-in proxying
@@ -232,15 +250,21 @@ You can configure the Caddy service to also reverse proxy additional services, i
For an example of both public and log-required proxies, see the [graph-app-kit sample](https://github.com/graphistry/graph-app-kit/blob/master/src/caddy/full.Caddyfile).


## Dashboards
## Streamlit Dashboards

Separately [configure the public and private Streamlit dashboards](configure-dashboards.md)


Separately [configure the public and private dashboards](configure-dashboards.md)
## Visual Playbooks

## Connectors
**Note:** We strongly recommend that new users contact the Graphistry team about early access to Louie before beginning new use of the Visual Playbook environment.


### Connectors

Optionally, you can configure Graphistry to use database connectors. Graphistry orchestrates cross-database query generation, pushes the queries down through each database's API, and returns the combined results to the user. This means Graphistry can reuse your existing scale-out infrastructure and make its data more accessible to your users without requiring a second copy to be maintained. Some connectors further support use of the [Graphistry data bridge](../tools/bridge.md) for proxying requests between a Graphistry cloud server and an intermediate on-prem data bridge instead of directly connecting to on-prem API servers.

### Security Notes
#### Security Notes

* Graphistry only needs `read only` access to the database
* Only one system-wide connector can be used per database per Graphistry virtual server at this time. For example, you can have Splunk user 1 + Neo4j user 2 on one running Graphistry container, and Splunk user 3 + Neo4j user 2 on another running Graphistry container.
@@ -359,20 +383,14 @@ In scenarios such as a Graphistry cloud server accessing on-prem API servers, an
* Run a bastion server between Graphistry and your database, such as a new Splunk search head
* Create fine-grained permissions by running multiple Graphistry virtual servers, with a new Splunk role per instance


## Ontology
### Ontology

See [custom ontology extensions](configure-ontology.md) and [settings reference page](configure-investigation.md) for full options. Topics include controlling:

* Map Column -> Type
* Map Type -> color, icon, size
* Map node/edge titles

## Pivots
### Pivots

Every connector comes with a base set of pivots. See [custom pivots](configure-custom-pivots.md) for teaching Graphistry new pivots based on existing connectors and pivots.

## Performance

See [performance tuning](../debugging/performance-tuning.md)

3 changes: 2 additions & 1 deletion docs/app-config/index.rst
@@ -33,5 +33,6 @@ System
:maxdepth: 1
:titlesonly:

email
configure-python
configure-pygraphistry
email
10 changes: 6 additions & 4 deletions docs/tools/telemetry.md
@@ -187,27 +187,29 @@ To use telemetry with Graphistry, you need to:
1. **Jaeger Dashboard:** Access the [Jaeger Dashboard URL](#jaeger-dashboard).
2. **Key Tracing Information:**
- List of traces generated by the system for the graph rendering flow (for instance, the trace list below includes a trace with errors).
![List of traces generated by the system for the graph rendering flow](../_static/img/jaeger-trace-list-including-trace-with-error.png)

<img alt="List of traces generated by the system for the graph rendering flow" src="../_static/img/jaeger-trace-list-including-trace-with-error.png"/>

- The root span for the graph rendering flow is `streamgl-viz: handleFalcorSocketConnections`.
- The service that generates the root span for the graph rendering flow is `streamgl-viz`.
- ETL dataset fetch spans from the Python ETL service.
- Detailed spans for actions by the visualization service and GPU workers (for instance, inspecting a trace with an error, as shown below).
![Detailed spans for actions by the visualization service and GPU workers](../_static/img/jaeger-inspecting-trace-with-error.png)
<img alt="Detailed spans for actions by the visualization service and GPU workers" src="../_static/img/jaeger-inspecting-trace-with-error.png"/>

### Accessing Metrics
1. **Prometheus Dashboard:** Access the [Prometheus Dashboard URL](#prometheus-dashboard).
2. **Critical Metrics to Monitor:**
- `worker_read_crashes_total`: Monitor GPU worker crashes.
- File upload and dataset creation metrics in the Python ETL service (all metrics with the name `forge_etl_python_upload_*`, for instance: `forge_etl_python_upload_datasets_request_total`).
![File upload and dataset creation metrics in the Python ETL service](../_static/img/prometheus-forge-etl-python-metric-example.png)
<img alt="File upload and dataset creation metrics in the Python ETL service" src="../_static/img/prometheus-forge-etl-python-metric-example.png"/>

### GPU Monitoring with Grafana and NVIDIA Data Center GPU Manager

To provide comprehensive monitoring of GPU performance, we utilize Grafana in conjunction with NVIDIA Data Center GPU Manager (DCGM). These tools enable real-time visualization and analysis of GPU metrics, ensuring optimal performance and facilitating troubleshooting.
- **NVIDIA Data Center GPU Manager (DCGM):** [DCGM](https://developer.nvidia.com/dcgm) is a suite of tools for managing and monitoring NVIDIA GPUs in data centers. It provides detailed metrics on GPU performance, health, and utilization.
- **Grafana:** Grafana is an open-source platform for monitoring and observability. It allows users to query, visualize, alert on, and explore metrics from a variety of data sources, including Prometheus. By default, the Grafana instance includes the metrics and GPU dashboard from the `DCGM exporter` service (see `DCGM Exporter Dashboards` on the Grafana main page).

![grafana-import-dcgm-dashboard-6](../_static/img/grafana-import-dcgm-dashboard-6.png)
<img alt="grafana-import-dcgm-dashboard-6" src="../_static/img/grafana-import-dcgm-dashboard-6.png"/>

## Advanced configuration
