Merge branch 'origin/master' into stable-1.5
elfiesmelfie committed Mar 5, 2024
2 parents a33d4ac + fe90630 commit 9f9f050
Showing 20 changed files with 248 additions and 215 deletions.
2 changes: 0 additions & 2 deletions common/global/stf-attributes.adoc
Original file line number Diff line number Diff line change
@@ -39,7 +39,6 @@ endif::[]
ifeval::["{build}" == "upstream"]
:ObservabilityOperator: Observability{nbsp}Operator
:OpenShift: OpenShift
:OpenShiftShort: OKD
:OpenStack: OpenStack
:OpenStackShort: OSP
:OpenStackVersion: Wallaby
@@ -58,7 +57,6 @@ endif::[]
ifeval::["{build}" == "downstream"]
:ObservabilityOperator: Cluster{nbsp}Observability{nbsp}Operator
:OpenShift: Red{nbsp}Hat{nbsp}OpenShift{nbsp}Container{nbsp}Platform
:OpenShiftShort: OCP
:OpenStack: Red{nbsp}Hat{nbsp}OpenStack{nbsp}Platform
:OpenStackShort: RHOSP
:OpenStackVersion: 17.1
Original file line number Diff line number Diff line change
@@ -18,17 +18,11 @@ ifdef::include_when_16[]
* xref:container-health-and-api-status_assembly-advanced-features[Monitoring container health and API status]
endif::include_when_16[]


//Dashboards
include::../modules/con_dashboards.adoc[leveloffset=+1]
include::../modules/proc_setting-up-grafana-to-host-the-dashboard.adoc[leveloffset=+2]
ifdef::include_when_16[]
// TODO: either rewrite or drop this procedure. We now provide the preferred downstream RHEL Grafana workload image in the deployment procedure.
//include::../modules/proc_overriding-the-default-grafana-container-image.adoc[leveloffset=+2]
include::../modules/proc_importing-dashboards.adoc[leveloffset=+2]
endif::include_when_16[]
include::../modules/proc_retrieving-and-setting-grafana-login-credentials.adoc[leveloffset=+2]

include::../modules/proc_connecting-an-external-dashboard-system.adoc[leveloffset=+2]

//Editing the metrics retention time period
include::../modules/con_metrics-retention-time-period.adoc[leveloffset=+1]
@@ -69,13 +63,10 @@ include::../modules/con_resource-usage-of-openstack.adoc[leveloffset=+1]
include::../modules/proc_disabling-resource-usage-monitoring-of-openstack-services.adoc[leveloffset=+2]

//Monitoring container health

include::../modules/con_container-health-and-api-status.adoc[leveloffset=+1]
include::../modules/proc_disabling-container-health-and-api-status-monitoring.adoc[leveloffset=+2]
endif::include_when_16[]



//reset the context
ifdef::parent-context[:context: {parent-context}]
ifndef::parent-context[:!context:]
Original file line number Diff line number Diff line change
@@ -21,18 +21,18 @@ ifeval::["{SupportedOpenShiftVersion}" == "{NextSupportedOpenShiftVersion}"]
* {OpenShift} version {SupportedOpenShiftVersion} is running.
endif::[]
ifeval::["{SupportedOpenShiftVersion}" != "{NextSupportedOpenShiftVersion}"]
* An {OpenShift} version inclusive of {SupportedOpenShiftVersion} through {NextSupportedOpenShiftVersion} is running.
* An {OpenShift} Extended Update Support (EUS) release version {SupportedOpenShiftVersion} or {NextSupportedOpenShiftVersion} is running.
endif::[]
* You have prepared your {OpenShift} environment and ensured that there is persistent storage and enough resources to run the {ProjectShort} components on top of the {OpenShift} environment. For more information about {ProjectShort} performance, see the Red Hat Knowledge Base article https://access.redhat.com/articles/4907241[Service Telemetry Framework Performance and Scaling].
* Your environment is fully connected. {ProjectShort} does not work in {OpenShift}-disconnected environments or network proxy environments.
* You have deployed {ProjectShort} in a fully connected or {OpenShift}-disconnected environment. {ProjectShort} is unavailable in network proxy environments.

ifeval::["{build}" == "downstream"]
[IMPORTANT]
ifeval::["{SupportedOpenShiftVersion}" == "{NextSupportedOpenShiftVersion}"]
{ProjectShort} is compatible with {OpenShift} version {SupportedOpenShiftVersion}.
endif::[]
ifeval::["{SupportedOpenShiftVersion}" != "{NextSupportedOpenShiftVersion}"]
{ProjectShort} is compatible with {OpenShift} version {SupportedOpenShiftVersion} through {NextSupportedOpenShiftVersion}.
{ProjectShort} is compatible with {OpenShift} versions {SupportedOpenShiftVersion} and {NextSupportedOpenShiftVersion}.
endif::[]
endif::[]

@@ -42,6 +42,7 @@ endif::[]
* For more information about Operator catalogs, see https://docs.openshift.com/container-platform/{NextSupportedOpenShiftVersion}/operators/understanding/olm-rh-catalogs.html[_Red Hat-provided Operator catalogs_].
* For more information about the cert-manager Operator for Red Hat, see https://docs.openshift.com/container-platform/{NextSupportedOpenShiftVersion}/security/cert_manager_operator/index.html[_cert-manager Operator for Red Hat OpenShift overview_].
* For more information about {ObservabilityOperator}, see https://docs.openshift.com/container-platform/{NextSupportedOpenShiftVersion}/monitoring/cluster_observability_operator/cluster-observability-operator-overview.html[_Cluster Observability Operator Overview_].
* For more information about OpenShift life cycle policy and Extended Update Support (EUS), see https://access.redhat.com/support/policy/updates/openshift[_Red Hat OpenShift Container Platform Life Cycle Policy_].

include::../modules/con_deploying-stf-to-the-openshift-environment.adoc[leveloffset=+1]

Original file line number Diff line number Diff line change
@@ -31,7 +31,7 @@ ifeval::["{SupportedOpenShiftVersion}" == "{NextSupportedOpenShiftVersion}"]
{ProjectShort} is compatible with {OpenShift} version {SupportedOpenShiftVersion}.
endif::[]
ifeval::["{SupportedOpenShiftVersion}" != "{NextSupportedOpenShiftVersion}"]
{ProjectShort} is compatible with {OpenShift} version {SupportedOpenShiftVersion} through {NextSupportedOpenShiftVersion}.
{ProjectShort} is compatible with {OpenShift} Extended Update Support (EUS) release versions {SupportedOpenShiftVersion} and {NextSupportedOpenShiftVersion}.
endif::[]
endif::[]

@@ -40,6 +40,7 @@ endif::[]
* https://access.redhat.com/documentation/en-us/openshift_container_platform/{NextSupportedOpenShiftVersion}/[{OpenShift} product documentation]
* https://access.redhat.com/articles/4907241[Service Telemetry Framework Performance and Scaling]
* https://docs.openshift.com/container-platform/{NextSupportedOpenShiftVersion}/welcome/index.html#cluster-installer-activities[OpenShift Container Platform {NextSupportedOpenShiftVersion} Documentation]
* https://access.redhat.com/support/policy/updates/openshift[Red Hat OpenShift Container Platform Life Cycle Policy]



Original file line number Diff line number Diff line change
@@ -10,7 +10,6 @@ To prepare your {OpenShift} environment for {Project} ({ProjectShort}), you must

* Ensure that you have persistent storage available in your {OpenShift} cluster for a production-grade deployment. For more information, see <<persistent-volumes_assembly-preparing-your-ocp-environment-for-stf>>.
* Ensure that enough resources are available to run the Operators and the application containers. For more information, see <<resource-allocation_assembly-preparing-your-ocp-environment-for-stf>>.
* Ensure that you have a fully connected network environment. For more information, see xref:con-network-considerations-for-service-telemetry-framework_assembly-preparing-your-ocp-environment-for-stf[].

include::../modules/con_observability-strategy.adoc[leveloffset=+1]
include::../modules/con_persistent-volumes.adoc[leveloffset=+1]
Original file line number Diff line number Diff line change
@@ -7,7 +7,6 @@ Use the third-party application, Grafana, to visualize system-level metrics that
For more information about configuring data collectors, see xref:configuring-red-hat-openstack-platform-overcloud-for-stf_assembly-completing-the-stf-configuration[].

ifdef::include_when_16[]
//TODO: can re-work this once we have OSP13 dashboard(s) to show. Can't use container health checks or monitoring in OSP13.
You can use dashboards to monitor a cloud:

Infrastructure dashboard::
Original file line number Diff line number Diff line change
@@ -26,7 +26,7 @@
= Customizing the deployment

[role="_abstract"]
The Service Telemetry Operator watches for a `ServiceTelemetry` manifest to load into {OpenShift} ({OpenShiftShort}). The Operator then creates other objects in memory, which results in the dependent Operators creating the workloads they are responsible for managing.
The Service Telemetry Operator watches for a `ServiceTelemetry` manifest to load into {OpenShift}. The Operator then creates other objects in memory, which results in the dependent Operators creating the workloads they are responsible for managing.

[WARNING]
====
Original file line number Diff line number Diff line change
@@ -3,4 +3,4 @@
[id="con-network-considerations-for-service-telemetry-framework_{context}"]
= Network considerations for Service Telemetry Framework

You can only deploy {Project} ({ProjectShort}) in a fully connected network environment. You cannot deploy {ProjectShort} in {OpenShift}-disconnected environments or network proxy environments.
You can deploy {Project} ({ProjectShort}) in fully connected network environments or in {OpenShift}-disconnected environments. You cannot deploy {ProjectShort} in network proxy environments.
Original file line number Diff line number Diff line change
@@ -87,10 +87,12 @@ ifeval::["{SupportedOpenShiftVersion}" == "{NextSupportedOpenShiftVersion}"]
* {OpenShift} {SupportedOpenShiftVersion}
endif::[]
ifeval::["{SupportedOpenShiftVersion}" != "{NextSupportedOpenShiftVersion}"]
* {OpenShift} {SupportedOpenShiftVersion} through {NextSupportedOpenShiftVersion}
* {OpenShift} Extended Update Support (EUS) releases {SupportedOpenShiftVersion} and {NextSupportedOpenShiftVersion}
endif::[]
* Infrastructure platform

For more information about the {OpenShift} EUS releases, see link:https://access.redhat.com/support/policy/updates/openshift[Red Hat OpenShift Container Platform Life Cycle Policy].

[[osp-stf-server-side-monitoring]]
.Server-side STF monitoring infrastructure
image::363_OpenStack_STF_updates_0923_deployment_prereq.png[Server-side STF monitoring infrastructure]
Original file line number Diff line number Diff line change
@@ -4,6 +4,6 @@
[role="_abstract"]
Red Hat supports the core Operators and workloads, including {MessageBus}, {ObservabilityOperator} (Prometheus, Alertmanager), Service Telemetry Operator, and Smart Gateway Operator. Red Hat does not support the community Operators or workload components, inclusive of Elasticsearch, Grafana, and their Operators.

You can only deploy {ProjectShort} in a fully connected network environment. You cannot deploy {ProjectShort} in {OpenShift}-disconnected environments or network proxy environments.
You can deploy {Project} ({ProjectShort}) in fully connected network environments or in {OpenShift}-disconnected environments. You cannot deploy {ProjectShort} in network proxy environments.

For more information about {ProjectShort} life cycle and support status, see the https://access.redhat.com/node/6225361[{Project} Supported Version Matrix].
Original file line number Diff line number Diff line change
@@ -4,7 +4,18 @@
[role="_abstract"]
In {OpenShift}, applications are exposed to the external network through a route. For more information about routes, see https://docs.openshift.com/container-platform/{NextSupportedOpenShiftVersion}/networking/configuring_ingress_cluster_traffic/overview-traffic.html[Configuring ingress cluster traffic].

In {Project} ({ProjectShort}), HTTPS routes are exposed for each service that has a web-based interface. These routes are protected by {OpenShift} RBAC and any user that has a `ClusterRoleBinding` that enables them to view {OpenShift} Namespaces can log in. For more information about RBAC, see https://docs.openshift.com/container-platform/{NextSupportedOpenShiftVersion}/authentication/using-rbac.html[Using RBAC to define and apply permissions].
In {Project} ({ProjectShort}), HTTPS routes are exposed for each service that has a web-based interface, and these routes are protected by {OpenShift} role-based access control (RBAC).

You need the following permissions to access the corresponding component UIs:

[source,json,options="nowrap"]
----
{"namespace":"service-telemetry", "resource":"grafana", "group":"grafana.integreatly.org", "verb":"get"}
{"namespace":"service-telemetry", "resource":"prometheus", "group":"monitoring.rhobs", "verb":"get"}
{"namespace":"service-telemetry", "resource":"alertmanager", "group":"monitoring.rhobs", "verb":"get"}
----

For more information about RBAC, see https://docs.openshift.com/container-platform/{NextSupportedOpenShiftVersion}/authentication/using-rbac.html[Using RBAC to define and apply permissions].

.Procedure

Original file line number Diff line number Diff line change
@@ -24,7 +24,7 @@ EOF
+
[source,bash]
----
$ for o in alertmanager/default prometheus/default elasticsearch/elasticsearch grafana/default; do oc delete $o; done
$ for o in alertmanagers.monitoring.rhobs/default prometheuses.monitoring.rhobs/default elasticsearch/elasticsearch grafana/default-grafana; do oc delete $o; done
----
+
. To verify that all workloads are operating correctly, view the pods and the status of each pod:
Original file line number Diff line number Diff line change
@@ -0,0 +1,77 @@

[id="connecting-an-external-dashboard-system_{context}"]
= Connecting an external dashboard system

You can configure third-party visualization tools to connect to the {ProjectShort} Prometheus for metrics retrieval. Access is controlled by an OAuth token. A ServiceAccount that has only the required permissions already exists, and you can generate a new OAuth token against this account for the external system to use.

To use the authentication token, you must configure the third-party tool to supply an HTTP Bearer Token Authorization header as described in RFC 6750. Consult the documentation of the third-party tool for how to configure this header. For example, see link:https://grafana.com/docs/grafana/latest/datasources/prometheus/configure-prometheus-data-source/#custom-http-headers[Configure Prometheus - Custom HTTP Headers] in the _Grafana Documentation_.
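
As a minimal sketch of the header format, the following shell snippet decodes a base64-encoded token and builds the `Authorization` header value that RFC 6750 describes. The `bearer_header` function and the `example-token` value are hypothetical and serve only as an illustration. A real token comes from the service account token secret.

```shell
# Hypothetical helper: build the RFC 6750 Authorization header from a token.
bearer_header() {
  printf 'Authorization: Bearer %s' "$1"
}

# Stand-in for the base64-encoded value stored in a secret's .data.token field.
# 'example-token' is a placeholder, not a real credential.
encoded=$(printf '%s' 'example-token' | base64)

# Decode the token the same way the procedure does (base64 -d).
token=$(printf '%s' "$encoded" | base64 -d)

# Build and print the header that a client sends with each request.
header=$(bearer_header "$token")
echo "$header"
```

A third-party tool that supports custom HTTP headers would take the header name `Authorization` and the value `Bearer <token>` as separate fields.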

.Procedure

. Log in to {OpenShift}.

. Change to the `service-telemetry` namespace:
+
[source,bash]
----
$ oc project service-telemetry
----

. Create a new token secret for the `stf-prometheus-reader` service account:
+
[source,bash]
----
$ oc create -f - <<EOF
apiVersion: v1
kind: Secret
metadata:
  name: my-prometheus-reader-token
  namespace: service-telemetry
  annotations:
    kubernetes.io/service-account.name: stf-prometheus-reader
type: kubernetes.io/service-account-token
EOF
----

. Retrieve the token from the secret:
+
[source,bash]
----
$ TOKEN=$(oc get secret my-prometheus-reader-token -o template='{{.data.token}}' | base64 -d)
----

. Retrieve the Prometheus host name:
+
[source,bash]
----
$ PROM_HOST=$(oc get route default-prometheus-proxy -ogo-template='{{ .spec.host }}')
----

. Test the access token:
+
[source,bash]
----
$ curl -k -H "Authorization: Bearer ${TOKEN}" "https://${PROM_HOST}/api/v1/query?query=up"
{"status":"success",[...]
----

. Configure your third-party tool with the `PROM_HOST` and `TOKEN` values from the previous steps:
+
[source,bash]
----
$ echo $PROM_HOST
$ echo $TOKEN
----

. The token remains valid as long as the secret exists. To revoke the token, delete the secret:
+
[source,bash]
----
$ oc delete secret my-prometheus-reader-token
secret "my-prometheus-reader-token" deleted
----
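
The access test in this procedure can also be scripted. The following is a minimal sketch, assuming that a successful Prometheus API response contains the field `"status":"success"` as in the example output of the test step. The `is_success` function and both sample bodies are hypothetical stand-ins, not captured output.

```shell
# Hypothetical helper: succeed only when a Prometheus API response body
# reports success. Any body that lacks the field is treated as a failure.
is_success() {
  case "$1" in
    *'"status":"success"'*) return 0 ;;
    *) return 1 ;;
  esac
}

# Sample bodies standing in for real curl output (assumed shapes).
ok_body='{"status":"success","data":{"resultType":"vector","result":[]}}'
err_body='{"status":"error","errorType":"bad_data","error":"invalid query"}'

if is_success "$ok_body"; then
  echo "token accepted"
fi

if ! is_success "$err_body"; then
  echo "query failed"
fi
```

To apply the check to a live system, pass the output of the `curl` command from the test step as the argument, for example `is_success "$(curl -k -s -H "Authorization: Bearer ${TOKEN}" "https://${PROM_HOST}/api/v1/query?query=up")"`.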

.Additional information

For more information about service account token secrets, see link:https://docs.openshift.com/container-platform/{NextSupportedOpenShiftVersion}/nodes/pods/nodes-pods-secrets.html#nodes-pods-secrets-creating-sa_nodes-pods-secrets[Creating a service account token secret] in the _OpenShift Container Platform Documentation_.
Original file line number Diff line number Diff line change
@@ -43,7 +43,7 @@ To change the rule, edit the value of the `expr` parameter.
+
[source,bash,options="nowrap"]
----
$ curl -k --user "internal:$(oc get secret default-prometheus-htpasswd -ogo-template='{{ .data.password | base64decode }}')" https://$(oc get route default-prometheus-proxy -ogo-template='{{ .spec.host }}')/api/v1/rules
$ curl -k -H "Authorization: Bearer $(oc create token stf-prometheus-reader)" https://$(oc get route default-prometheus-proxy -ogo-template='{{ .spec.host }}')/api/v1/rules
{"status":"success","data":{"groups":[{"name":"./openstack.rules","file":"/etc/prometheus/rules/prometheus-default-rulefiles-0/service-telemetry-prometheus-alarm-rules.yaml","rules":[{"state":"inactive","name":"Collectd metrics receive count is zero","query":"rate(sg_total_collectd_msg_received_count[1m]) == 0","duration":0,"labels":{},"annotations":{},"alerts":[],"health":"ok","evaluationTime":0.00034627,"lastEvaluation":"2021-12-07T17:23:22.160448028Z","type":"alerting"}],"interval":30,"evaluationTime":0.000353787,"lastEvaluation":"2021-12-07T17:23:22.160444017Z"}]}}
----