Recover CouchDB monitoring post Alma9 migration #12230

Open
amaltaro opened this issue Jan 16, 2025 · 0 comments

Impact of the new feature
Central CouchDB instances

Is your feature request related to a problem? Please describe.
This issue is a result of the migration of CouchDB services to Alma9, where we started adopting a CouchDB image (provided with CMSKubernetes/pull/1502) instead of the old RPM-based deployment in the OpenStack VMs.

With that migration, it looks like the vm.args file wasn't configured correctly, so the CouchDB instances no longer have a specific node name; they now show up in the logs as nonode@nohost. There is more to this vm.args file, since database content is actually associated with the node name, so extra care is needed once we get to this.
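
As a quick way to check what a given vm.args would produce, here is a minimal sketch that looks for a `-name` or `-sname` flag and falls back to `nonode@nohost`, which is what Erlang reports when neither is set. The helper names and the sample hostname are hypothetical, and the sample content is not taken from our actual deployment.

```python
import re

# Name Erlang reports when the VM is started without -name / -sname.
DEFAULT_ERLANG_NODE = "nonode@nohost"


def node_name_from_vm_args(text):
    """Return the node name set via -name or -sname, or None if neither is present."""
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines and '#' comments.
        if not line or line.startswith("#"):
            continue
        match = re.match(r"-(?:name|sname)\s+(\S+)", line)
        if match:
            return match.group(1)
    return None


def effective_node_name(text):
    """Node name the instance would report for the given vm.args content."""
    return node_name_from_vm_args(text) or DEFAULT_ERLANG_NODE


# Hypothetical vm.args content, for illustration only.
sample = """
# Each node in the system must have a unique name.
-name couchdb@node1.example.org
-setcookie monster
"""

print(effective_node_name(sample))
print(effective_node_name("-setcookie monster\n"))
```

Running this against the vm.args shipped in the image would tell us whether the flag is missing or just not being picked up.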

In addition, there is no longer any process scraping CouchDB metrics and pushing them to monitoring. IIRC, the last discussions with Aroosha suggested that we would run a second container in those VMs solely for scraping and pushing metrics upstream. This needs to be confirmed, though.

Describe the solution you'd like
We need to recover the CouchDB monitoring dashboard for central CouchDBs.

In addition to having data available again in MonIT, properly separated per CouchDB instance, we need the proper exporters (the CouchDB exporter for Prometheus?) running alongside each CouchDB instance.
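
To illustrate what "separated by each CouchDB instance" could look like on the metrics side, here is a rough sketch that flattens a stats document into Prometheus-style lines carrying an `instance` label. It is not the actual exporter code. The nested `{"value": ...}` layout is an assumption modeled on the output of CouchDB's `/_node/_local/_stats` endpoint, and the function names, sample numbers and the `central-a` label are made up for illustration.

```python
def flatten_stats(stats, prefix=""):
    """Yield (metric_name, value) for every entry holding a numeric 'value'."""
    for key, node in sorted(stats.items()):
        if not isinstance(node, dict):
            continue
        name = f"{prefix}_{key}" if prefix else key
        value = node.get("value")
        if isinstance(value, (int, float)) and not isinstance(value, bool):
            yield name, value
        elif "value" not in node:
            # Intermediate level: keep descending.
            yield from flatten_stats(node, name)
        # Entries whose 'value' is a dict (histograms) are skipped in this sketch.


def to_exposition_lines(stats, instance):
    """Render the flattened stats as text lines labelled with the instance name."""
    return [
        f'{name}{{instance="{instance}"}} {value}'
        for name, value in flatten_stats(stats)
    ]


# Made-up stats document following the assumed layout.
sample_stats = {
    "couchdb": {
        "httpd": {"requests": {"value": 12, "type": "counter", "desc": "requests"}},
        "open_databases": {"value": 3, "type": "counter", "desc": "open dbs"},
        "request_time": {"value": {"min": 0.1, "max": 2.0}, "type": "histogram"},
    }
}

for line in to_exposition_lines(sample_stats, instance="central-a"):
    print(line)
```

Whatever exporter we end up with, the key point is that every series carries a label identifying the CouchDB instance it came from.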

Describe alternatives you've considered
This issue needs to be addressed with the HTTP team (Aroosha).

For extra context, a fairly recent configuration change that we made was provided with dmwm/deployment#1345.

Additional context
None
