Import master to stable-1.5 #509

Merged
merged 96 commits into stable-1.5 from import/stf153
Oct 27, 2023

Conversation

leifmadsen
Member

Import changes for STF 1.5.3

csibbitt and others added 30 commits October 14, 2022 11:54
* Move the SNMP trap delivery checks

Move the SNMP trap delivery checks, as their current position seems
to cause false positives. Moving the checks closer to the end of the
smoketest run seems to give a better chance that the logs the check
is looking for have been provided.
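
A check like this is less timing-sensitive when it polls with an upper bound on the wait, which is the idea behind the "loop ... with break and max time" change listed next. The following is a minimal sketch of that pattern only, not the smoketest's actual implementation; `wait_for` and its parameters are hypothetical names.

```python
import time
from typing import Callable


def wait_for(check: Callable[[], bool], max_seconds: float, interval: float = 1.0,
             clock: Callable[[], float] = time.monotonic,
             sleep: Callable[[float], None] = time.sleep) -> bool:
    """Poll check() until it returns True or max_seconds have elapsed."""
    deadline = clock() + max_seconds
    while True:
        if check():
            return True   # break out as soon as the condition holds
        if clock() >= deadline:
            return False  # give up once the maximum time is reached
        sleep(interval)
```

The check always runs at least once, so a `max_seconds` of zero degrades to a single attempt rather than skipping the check.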

* Use a loop to check for SNMP status with break and max time
* Make all certs 8yr expiry
* Use certificate_duration and test against generated cert
* Better messages during CI cloning
* Expand support for OCP 4.11

Allow installation to be done on OCP 4.11 while updating the smoketest
jobs to support later versions of the client. Also migrate to using
community-operators CatalogSource instead of OperatorHub.io. Only enable
community-operators when the use_community strategy is enabled.

Update the token request syntax when requesting a service account token.
Add checks to look for oc client version and fail if we're using a
version that's too old.
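
A client version check of this kind can be sketched as below. This is an illustration under assumptions, not the code from this change: the minimum of 4.11 is assumed from the oc 4.11 update mentioned elsewhere in this pull request, the version string is passed in rather than read from the `oc` client, and `parse_version` and `client_is_new_enough` are hypothetical names.

```python
import re
from typing import Tuple

MIN_OC_VERSION = (4, 11)  # assumed minimum, not taken from the actual check


def parse_version(text: str) -> Tuple[int, int]:
    """Extract (major, minor) from a version string such as '4.11.0'."""
    match = re.search(r"(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"no version found in {text!r}")
    return int(match.group(1)), int(match.group(2))


def client_is_new_enough(version_text: str,
                         minimum: Tuple[int, int] = MIN_OC_VERSION) -> bool:
    """Compare numerically so that 4.9 sorts below 4.11."""
    return parse_version(version_text) >= minimum
```

Comparing integer tuples rather than strings matters here, because a plain string comparison would rank "4.9" above "4.11".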

* Make passwords safer in smoketest job template

Encapsulate the password values with double quotes to help make them
safer for consumption in the template. I had an odd situation where the
password contained a bunch of extended characters and caused the
smoketest to report an error on the template having an issue with yaml
to json.

The password contained several characters such as . and : which confused
the template. Wrapping the contents in the double quotes allowed the
smoketest to apply the job.batch template and result in a working
smoketest run.
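
For illustration, one way to produce a safely double-quoted value before substituting it into a YAML template is to serialize it as a JSON string, which is also a valid YAML double-quoted scalar. This goes a step further than the change described above (it also escapes embedded quotes and backslashes) and is a sketch only; the helper name and the sample password are hypothetical.

```python
import json


def yaml_double_quoted(value: str) -> str:
    """Return value as a double-quoted scalar with quotes and backslashes escaped."""
    return json.dumps(value)


password = 'p@ss: w.rd "x"'  # hypothetical password with troublesome characters
line = f"password: {yaml_double_quoted(password)}"
print(line)
```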
* Replacing the placeholder namespace during the build results in a "there are local changes" error on next build
* This forces the checkout to discard that (and other!?) local changes
* Quicker dev/test loop
* Update oc to 4.11 in jenkins agent

Need 4.11 for new token handling changes
Remove the OperatorHub.io CatalogSource and instead use the
community-operators CatalogSource which is available with an OCP
installation. Ideally this will avoid some of the conflicts we've been
seeing in our CI environment. This is a short term fix as future
development will likely make use of Observability Operator to provide
the metrics data store and alert delivery mechanism.
* Catalog changes
* CI change to pre-clean cert-manager-operator
  * not 100% sure this is 4.12 related, but it's new and first seen during testing 4.12
* Remove Loki from stf-run-ci

* Return "Get new operator sdk" to stf-run-ci
The GitHub Actions checkout action v2 is deprecated and needs to move to
version 3.
* Implement SNMPtrap delivery controls

Implement ability to override the default values for the SNMPtrap
alertmanager receiver via prometheus-webhook-snmp component.

Closes: STF-559

* Run operator-sdk generate bundle

Run the following command to update the bundle artifacts:

operator-sdk-0.19.4 generate bundle   --metadata   --manifests   --channels unstable   --default-channel unstable

* Build out the remaining SNMP options

Build out the remaining options for prometheus-webhook-snmp to allow for
finer grained controls and delivery of SNMP traps via alertmanager
alerts.

* Generate bundle contents with operator-sdk
* Implement changes for operator-sdk-1.26.0 testing

Implement changes that allow testing validation via operator-sdk-1.26.0
without bumping the entire bundle generation process from
operator-sdk-0.19.4 to post-operator-sdk-1.x.

These are the same tests run for validation during product pipeline
verification.

* Adds test to verify building of the bundle image works.
* Adds KinD deployment to allow executing scorecard checks.

Related: STF-1252

* Fix properties.yaml

* Simplify use of RELEASE_VERSION variable (#412)

* Add note about why we're copying files in
* Adds duration param for CA and endpoint certs

Replaces certificate_duration with ca_certificate_duration
and endpoint_certificate_duration. Sets the default value for both
to 70080h (the previous value).
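
As a quick check on that default, 70080h corresponds to the 8-year expiry mentioned earlier in this pull request. The sketch below assumes 365-day years and only handles whole-hour values; `hours_to_years` is a hypothetical helper, not part of the operator.

```python
HOURS_PER_DAY = 24
DAYS_PER_YEAR = 365  # assumption: leap days are ignored


def hours_to_years(duration: str) -> float:
    """Convert a whole-hour duration string such as '70080h' to years."""
    if not duration.endswith("h"):
        raise ValueError(f"expected an hour-based duration, got {duration!r}")
    hours = int(duration[:-1])
    return hours / (HOURS_PER_DAY * DAYS_PER_YEAR)


print(hours_to_years("70080h"))  # 8.0
```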

Removes the certificate_duration param from the Issuer
resource since it's not actually needed (see [0])

[0] https://cert-manager.io/docs/reference/api-docs/#cert-manager.io/v1.IssuerConfig

* Exposes CA and endpoint certificate duration config

Exposes certificate duration config for both ElasticSearch
and QDR

Keeps the default value in use for now. Better default values
should be discussed to be included in a follow up change.

* Fix indentation for certs duration param in servicetelemetry crd

* Adds cert duration to the OLM catalog

Includes cert duration params in the OLM catalog
for both ElasticSearch and QDR

* Changes snake_case to camelCase to yaml case

Fix to match style convention

* Adds pattern expression for certs duration

* Add certificates param to events and transport

* Exposes duration parameter in the CI script

Adds the duration parameter for both ElasticSearch and QDR
in the CI script

Also updates the OLM Catalog with the latest changes (certificates object)

* Corrects naming to certificates params in CI script

* Fix snake case in the CI script params for cert duration

* Fix indentation for transports in the deploy_stf CI script

---------

Co-authored-by: Chris Sibbitt <[email protected]>
* Fix PROMETHEUS_K8S_TOKEN to account for oc version mismatch

* Fix to use correct variable name in test
* Move to newest oauth-proxy container

* Move to bcrypt for htpasswd

* Up/Downstream image handling for new oauth-proxy container

* Skip broken ansible lint for htpasswd

* Hack to add EPEL in upstream builds
…ion (#402)

* [jenkins] Add custom context labels for github build status notification

Multiple Jenkins deployments can now run and report their build
status separately instead of all reporting to the same
``continuous-integration/jenkins/pr-merge`` and overriding each other.

There is now ``continuous-integration/jenkins/ocp-<OCP_VERSION>/pr-merge``
NOTE: the OCP_VERSION is hardcoded at the moment

https://github.com/jenkinsci/github-scm-trait-notification-context-plugin

* [jenkins] Make the build status label configurable

Added the OCP_VERSION var to the casc-configmap.yaml, so that the
correct label can be set for the jobs

---------

Co-authored-by: Chris Sibbitt <[email protected]>
Default to Observability Operator for Prometheus & Alertmanager

- Adds two options to observability_strategy:
  * use_redhat (OBO Only)
  * use_hybrid (OBO + friends)
- Default to redhat supported components in deployment
- Use upstream source for ObO in CI
- Added sensubility to deployment validation
- Added required RBAC for OBO usage
- When no explicit observability_strategy is set:
  * Existing STF objects get an explicit "use_community" added
  * New STF objects get the default ("use_redhat") explicitly added
- Narrow the scope of the smoke test (#422)

Co-authored-by: Leif Madsen <[email protected]>
* Use stable-v1 cert-manager in CI for OCP >= 4.12
* Currently we are installing :latest
* We and OBO currently install v2.43.0
* This will mitigate any potential for a prometheus roll-back
* We will remove the pin in STF 1.5.5 after migration to OBO is complete
* The check for a missing observabilityStrategy was faulty
* STF object would be updated to `observabilityStrategy: use_community`
  any time there was a community Prometheus deployed
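
The intended defaulting behaviour can be sketched roughly as follows. This is an illustration only: it assumes the presence of a community Prometheus deployment is what distinguishes an existing STF object from a new one, and `default_strategy` and its arguments are hypothetical names rather than the operator's real logic.

```python
def default_strategy(spec: dict, community_prometheus_deployed: bool) -> dict:
    """Return a copy of spec with observabilityStrategy filled in if missing."""
    if "observabilityStrategy" in spec:
        # An explicit choice is never overwritten, even when a community
        # Prometheus happens to be deployed.
        return dict(spec)
    strategy = "use_community" if community_prometheus_deployed else "use_redhat"
    return {**spec, "observabilityStrategy": strategy}
```

The key point is the first branch: the default applies only when the field is genuinely absent.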
Allow bundle deployments to happen from unauthorized container
repositories.
elfiesmelfie and others added 16 commits October 11, 2023 16:50
* [stf-run-ci] Generate extra logs if preflight checks fail

* Update preflight checks
Bumps [oauthlib](https://github.com/oauthlib/oauthlib) from 3.2.0 to 3.2.2.
- [Release notes](https://github.com/oauthlib/oauthlib/releases)
- [Changelog](https://github.com/oauthlib/oauthlib/blob/master/CHANGELOG.rst)
- [Commits](oauthlib/oauthlib@v3.2.0...v3.2.2)

---
updated-dependencies:
- dependency-name: oauthlib
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Emma Foley <[email protected]>
* [stf-run-ci][create_catalog] Swap query for a command task

Using query looks up the kubeconfig on localhost, rather than the host
that ansible is executing against. This behaviour is different from
either using the shell/command modules or using k8s modules.
For consistent behaviour, the queries are replaced with an alternative
way to get the same information that will have consistent behaviour
whether executing against localhost or a remote host.
* Add requires infrastructure annotations

Add required infrastructure annotations for the bundle. Implementation
is done in generate_bundle.sh because the annotations.yaml file in the
deploy/olm-catalog/ directory is not read by operator-sdk-0.19.4. Append
required additional feature annotations to the generated
annotations.yaml by operator-sdk generate bundle.

Related STF-1530

* Include annotations in the CSV directly

* Revert "Add requires infrastructure annotations"

This reverts commit c9e9b2a.

* Generate CSV contents with operator-sdk
* [stf-collect-logs] Add a role for log collection

* Update build/run-ci.yaml

* [stf-collect-logs] Update the resource name in

* [stf-collect-logs] Update README

* [stf-collect-logs] Remove unnecessary lines

* ci/post-collect_logs: Use stf-collect-logs role

* [stf-collect-logs]: Use namespace in oc commands

---------

Co-authored-by: Chris Sibbitt <[email protected]>
* [zuul] Add job to deploy from nightly bundles

This job doesn't build STF, but deploys from the pre-built and published
bundles.
This is useful to be able to do periodically to make sure our latest
bundles are deploying, and no dependencies are out-of-date, for example
Support STF 1.5 from OCP 4.11 through 4.14 for the next release as OCP 4.10 is now EOL.
* Update base image to pass security scans

Update the base image with a dnf update (need to exclude ansible because
ansible updates aren't compatible with the current build). This keeps
packages up to date to allow the resulting image to pass registry
security scans at the expense of image size.

* Clean up intermediate layer

Co-authored-by: Chris Sibbitt <[email protected]>

* Add comments to help understand Dockerfile readout

* Spellcheck fix

---------

Co-authored-by: Chris Sibbitt <[email protected]>
Add cluster observability operator as the preferred dependency (bottom
of list is highest priority) when installing Service Telemetry Operator.
The cluster-observability-operator is the name of the downstream
(product) bundle in the Red Hat Operators CatalogSource.

If installing for upstream, the preferred operator will be
observability-operator (when the Red Hat Operators CatalogSource is not
available or enabled). As a fallback when neither Observability
Operator nor Cluster Observability Operator is available, allow
Prometheus Operator from the Community Operators to satisfy the
Prometheus storage backend dependency.
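
The preference order described above can be pictured with the sketch below. It is illustrative only, since the real selection happens during operator installation rather than in code like this; `pick_dependency` is a hypothetical helper and `prometheus-operator` is a placeholder label for the community Prometheus Operator, not a confirmed package name.

```python
from typing import Optional, Set

# Listed lowest priority first, mirroring "bottom of list is highest priority".
DEPENDENCY_OPTIONS = [
    "prometheus-operator",             # placeholder label, community fallback
    "observability-operator",          # upstream
    "cluster-observability-operator",  # downstream (product) bundle
]


def pick_dependency(available: Set[str]) -> Optional[str]:
    """Return the highest-priority option that is available, if any."""
    for name in reversed(DEPENDENCY_OPTIONS):
        if name in available:
            return name
    return None
```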
When deploying from the UI, don't populate the events SGs by default as
they are no longer used in a default configuration in RHOSP.
I forgot to run a generate bundle yesterday in some furious patch work.
@leifmadsen
Member Author

@elfiesmelfie not sure if we'll need changes here to get this to pass/land?

@softwarefactory-project-zuul

Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://review.rdoproject.org/zuul/buildset/812315fa1a4d46128eaaaba40660e885

stf-crc-latest-nightly_bundles MERGE_CONFLICT in 2s
stf-crc-latest-local_build MERGE_CONFLICT in 2s

@leifmadsen leifmadsen added the help wanted Extra attention is needed label Oct 26, 2023
@leifmadsen leifmadsen self-assigned this Oct 26, 2023
@leifmadsen leifmadsen added the 1.5 label Oct 26, 2023
Drop .zuul.yaml for stable-1.5 since it's not set up for non-main testing
at this point. In the future we may develop a separate set of tests for
the stable-1.5 branch during merge, but not for this initial import.
We'll rely on Jenkins testing for our functional validations.
@leifmadsen
Member Author

The old "Linting" test was updated and renamed, so I'm skipping it (since it can't be run).

@leifmadsen leifmadsen merged commit 03a5873 into stable-1.5 Oct 27, 2023
6 checks passed
@leifmadsen leifmadsen deleted the import/stf153 branch October 27, 2023 15:03
Labels
1.5 help wanted Extra attention is needed

7 participants