Update docs #2213

Merged
merged 2 commits into from
Mar 10, 2024
2 changes: 1 addition & 1 deletion docs/source/install/min-prod-hw.rst
@@ -1,7 +1,7 @@
Minimal Production System Recommendations
-----------------------------------------

-* **CPU** - at least 2 physical cores/ 4vCPUs
+* **CPU** - For clusters with up to 100 cores use 2 vCPUs; for larger clusters use 4 vCPUs
* **Memory** - 15GB+ DRAM and proportional to the number of cores.
* **Disk** - persistent disk storage is proportional to the number of cores and Prometheus retention period (see the following section)
* **Network** - 1GbE/10GbE preferred
60 changes: 60 additions & 0 deletions docs/source/procedures/datadog/cloud-integration.rst
@@ -0,0 +1,60 @@
=============================================
ScyllaDB Cloud Monitoring Datadog Integration
=============================================

For security reasons, the ScyllaDB cloud does not have direct access to the Prometheus server.
Contributor comment:

"For security reasons, the ScyllaDB cloud does not have direct access to the Prometheus server." This can be removed; it does not add useful information. We might move away from Prometheus in the future.

To allow scraping by an external server, you need to enable the Prometheus proxy.
The Datadog agent reads from the proxy, which reads from the Prometheus server.

The integration consists of:

1. Installing and configuring the Datadog Agent.
2. Adding Datadog recording rules.
3. Loading the Scylla dashboard to Datadog.
4. Optionally, loading Monitors (alerts).

Scylla Monitoring Datadog Integration Overview
==============================================
A typical ScyllaDB cluster generates thousands of metrics, sometimes even tens of thousands.
The sheer number of metrics is too much for Datadog.

Instead of letting the Datadog agent scrape all metrics, the monitoring stack marks a small subset of metrics with a label and lets the Datadog agent scrape only those.
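
As a rough illustration of the label-based selection described above, the following Python sketch filters lines in the Prometheus text exposition format and keeps only the series that carry a marker label. The label name ``dd`` and the value ``1`` are taken from the note on per-shard metrics later in this file. The metric names and the helper ``select_marked`` are hypothetical, and the real selection is done by the monitoring stack and the Datadog agent, not by code like this.

```python
import re

# Matches label pairs such as dd="1" inside the {...} part of a sample line.
LABEL_RE = re.compile(r'(\w+)="([^"]*)"')


def select_marked(lines, label="dd", value="1"):
    """Return the sample lines whose label set contains label=value."""
    selected = []
    for line in lines:
        if line.startswith("#") or "{" not in line or "}" not in line:
            continue  # skip comments and samples without a label set
        label_part = line[line.index("{") + 1:line.rindex("}")]
        labels = dict(LABEL_RE.findall(label_part))
        if labels.get(label) == value:
            selected.append(line)
    return selected


# Hypothetical sample data, for illustration only.
SAMPLE = [
    '# TYPE example_reads counter',
    'example_reads{instance="10.0.0.1",dd="1"} 42',
    'example_reads{instance="10.0.0.1",shard="0"} 7',
    'example_latency{instance="10.0.0.2",dd="1"} 0.5',
]

marked = select_marked(SAMPLE)
for line in marked:
    print(line)
```

Only the two series that carry the marker label are kept; the unmarked per-shard series is dropped, which is the effect the integration relies on to keep the metric volume manageable.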

Install And configure the Datadog Agent
=======================================

Start by following the Datadog `Installation <https://docs.datadoghq.com/agent/>`_ guide. The Datadog agent should run on a machine that can reach the Prometheus proxy server.
Contributor comment:

It's not clear where to install the agent.
Maybe something like:
Start by installing the Datadog Agent on a server with access to the ScyllaDB Cloud Prometheus Proxy server...

A small diagram would help.


Once the Datadog agent is working, download the configuration file :download:`conf.yaml <cloud-conf.yaml>` and place it at /etc/datadog-agent/conf.d/prometheus.d/conf.yaml


Edit the file and replace the cluster ID (CLUSTER_ID) and the token (TOKEN) placeholders with your own values.
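
If you prefer to script this edit, a minimal Python sketch could look like the following. It assumes the placeholders appear literally as ``CLUSTER_ID`` and ``TOKEN`` in the file, as stated above. The template fragment, its key names, and the example values are hypothetical; the real cloud-conf.yaml layout may differ.

```python
def fill_placeholders(template: str, cluster_id: str, token: str) -> str:
    """Replace the CLUSTER_ID and TOKEN placeholders in the config text."""
    if "CLUSTER_ID" not in template or "TOKEN" not in template:
        raise ValueError("expected CLUSTER_ID and TOKEN placeholders")
    return template.replace("CLUSTER_ID", cluster_id).replace("TOKEN", token)


# Hypothetical template fragment, for illustration only.
TEMPLATE = 'endpoint: "https://example.invalid/CLUSTER_ID/metrics"\nauth: "TOKEN"\n'

filled = fill_placeholders(TEMPLATE, cluster_id="1234", token="s3cret")
print(filled)
```

Failing early when a placeholder is missing helps catch the case where the file was already edited or a different file was downloaded.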

Post configuration
^^^^^^^^^^^^^^^^^^
Restart the agent using the method appropriate for your installation. Scylla metrics should then be visible in Datadog.


.. note:: By default, Datadog will not scrape per-shard metrics. To enable per-shard metrics, edit the conf.yaml file and replace dd=~"1" with dd=~"1|2".
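
The ``=~`` operator in the note is a Prometheus regular-expression label matcher, which is fully anchored: the pattern must match the whole label value. The sketch below emulates that behavior with Python's ``re.fullmatch`` to show which ``dd`` values each pattern selects, and applies the textual edit from the note. Python's regex engine is not the one Prometheus uses, so treat this as an approximation that holds for simple patterns like these; the helper names are hypothetical.

```python
import re


def matches(pattern: str, value: str) -> bool:
    """Emulate a fully anchored =~ matcher on a single label value."""
    return re.fullmatch(pattern, value) is not None


def enable_per_shard(conf_text: str) -> str:
    """Apply the edit from the note: dd=~"1" becomes dd=~"1|2"."""
    return conf_text.replace('dd=~"1"', 'dd=~"1|2"')


candidates = ("1", "2", "12")
default_hits = [v for v in candidates if matches("1", v)]
per_shard_hits = [v for v in candidates if matches("1|2", v)]

print(default_hits)    # values selected by dd=~"1"
print(per_shard_hits)  # values selected by dd=~"1|2"
```

Because the match is anchored, a value such as ``12`` is selected by neither pattern, so widening the matcher to ``1|2`` adds only the series labeled ``2``.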

Upload the Dashboard
====================
Download the dashboard file :download:`dashboard.json <dashboard.json>`.
Create a new dashboard in Datadog and import the JSON file you downloaded.

Using the Dashboard
===================
We created a Datadog dashboard that resembles the Grafana dashboards.

.. image:: datadog.png

The dashboard contains some specific filtering and perspectives:
First, you can choose between shard, instance, dc, or cluster view.
This will aggregate the metrics in the graphs accordingly.
Second, you can filter to see specific shards, nodes, or DCs.

.. note:: Some combinations conflict. For example, you cannot filter by DC when looking at a cluster view. If no data is displayed, remove the filters first.

Adding Monitors
===============
Alerts in Datadog are called Monitors. Download the monitor file :download:`monitor.json <monitor.json>`, then go to the Monitors section in Datadog and import the JSON file.

18 changes: 3 additions & 15 deletions docs/source/procedures/datadog/index.rst
@@ -10,7 +10,7 @@ The integration consists of:
3. Loading Scylla dashboard to Datadog.
4. Optionally load Monitor (Alerts).

-.. note:: Scylla Cloud users, use and update the proper configuration file.
+.. note:: Scylla Cloud users, check the cloud users `specific guide <cloud-integration>`_.

Scylla Monitoring Datadog Integration Overview
==============================================
@@ -31,17 +31,7 @@ Install And configure the Datadog Agent
Start by following the Datadog `Installation <https://docs.datadoghq.com/agent/>`_ guide. The Datadog agent should run on a machine that can reach the Prometheus server.

Once the Datadog agent is working, download the configuration file and place it under /etc/datadog-agent/conf.d/prometheus.d/conf.yaml
-
-Scylla Cloud Users
-^^^^^^^^^^^^^^^^^^
-Scylla Cloud users, download the configuration file :download:`conf.yaml <cloud-conf.yaml>` move it to: /etc/datadog-agent/conf.d/prometheus.d/conf.yaml
-
-
-Edit the file. You must replace the cluster id (CLUSTER_ID) and the token (TOKEN).
-
-Other Scylla Users
-^^^^^^^^^^^^^^^^^^
-Other Scylla users, download the configuration file :download:`conf.yaml <conf.yaml>` and replace the ip address of the Prometheus server.
+Download the configuration file :download:`conf.yaml <conf.yaml>` and replace the IP address of the Prometheus server.


Post configuration
@@ -53,11 +43,9 @@ Restart the agent based on your installation. Scylla metrics should be visible i

Add datadog recording rules
===========================
-Non Scylla Cloud users, download the rules configuration file :download:`datadog.rules.yml <datadog.rules.yml>` if you need per-shard metrics, download :download:`datadog.rules-with-shards.yml <datadog.rules-with-shards.yml>` and place it under prometheus/prom_rules/.
+Download the rules configuration file :download:`datadog.rules.yml <datadog.rules.yml>` or, if you need per-shard metrics, download :download:`datadog.rules-with-shards.yml <datadog.rules-with-shards.yml>` instead, and place it under prometheus/prom_rules/.
Per-shard metrics add load and cost to both the Prometheus server and the Datadog agent and server, so only use them if needed.

-Cloud users, skip this step, it's been take care for by the cloud.
-
Upload the Dashboard
====================
Download the dashboard file :download:`dashboard.json <dashboard.json>`.
3 changes: 3 additions & 0 deletions docs/source/procedures/index.rst
@@ -7,13 +7,16 @@ ScyllaDB Monitoring Stack Procedures
:hidden:


Cloud Users Datadog integration <datadog/cloud-integration>
Datadog Integration <datadog/index>
Alert Manager <alerts/index>
Adding and Modifying Dashboards <updating-dashboard>
Upgrade Guides </upgrade/index>

There are several reference guides available which give additional information. Choose a topic to begin:

* :doc:`Cloud Users Datadog integration <datadog/cloud-integration>`
* :doc:`Datadog Integration <datadog/index>`
* :doc:`Alert Manager <alerts/index>`
* :doc:`Adding and Modifying Dashboards <updating-dashboard>`
* :doc:`Upgrade Guides </upgrade/index>`