Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update e2e with managed #1451

Open
wants to merge 35 commits into
base: main
Choose a base branch
from

Conversation

coleenquadros
Copy link
Contributor

No description provided.

Signed-off-by: Coleen Iona Quadros <[email protected]>
Signed-off-by: Coleen Iona Quadros <[email protected]>
Signed-off-by: Coleen Iona Quadros <[email protected]>
Signed-off-by: Coleen Iona Quadros <[email protected]>
Signed-off-by: Coleen Iona Quadros <[email protected]>
Signed-off-by: Coleen Iona Quadros <[email protected]>
Signed-off-by: Coleen Iona Quadros <[email protected]>
Copy link

openshift-ci bot commented May 23, 2024

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: coleenquadros

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

thibaultmg and others added 15 commits May 23, 2024 11:52
* init version

Signed-off-by: Thibault Mange <[email protected]>

* fix

Signed-off-by: Thibault Mange <[email protected]>

* env test

Signed-off-by: Thibault Mange <[email protected]>

* change withReload naming

Signed-off-by: Thibault Mange <[email protected]>

---------

Signed-off-by: Thibault Mange <[email protected]>
* Add support for custom alertmanager hub URL

Signed-off-by: Douglas Camata <[email protected]>

* Add some missing contexts args

Signed-off-by: Douglas Camata <[email protected]>

* Add tests for `GetAlertmanagerEndpoint`

Signed-off-by: Douglas Camata <[email protected]>

---------

Signed-off-by: Douglas Camata <[email protected]>
…trics generation (stolostron#1425)

* metrics-collector simulator: add missing perms

A couple of additional permissions seems to be needed for the metrics
collector simulator to work correctly. This commit adds the needed
clusterrole and rolebinding.

Signed-off-by: Jacob Baungard Hansen <[email protected]>

* metrics-collector simulator: metrics generator fix

This commit contains various changes to make the generation of metrics
for the metrics-collector simulator work.

- For getting recording rules, use jq to urlencode data when querying
  in-cluster prometheus. Before we removed spaces from the rules,
  but that caused problems with rules that might contain two keywords
  after each other such as `.. or  sum(..`
- Don't use `ROOTDIR` as it could be undefined, and just use `WORKDIR`
- Fix installation of various tools that used the misspelled `WORK_DIR`
  variable and therefore didn't work
- Install gojsontoyaml from release tarballs instead of via Go (which
  had some problems on my system)
- Update jq to 1.7.1
- Potentially make tools downloads work for arm macs (untested)

Signed-off-by: Jacob Baungard Hansen <[email protected]>

* metrics-collector simulator: use install-binaries

Use the common install binaries script to install binaries instead of
the custom solution in these files.

Signed-off-by: Jacob Baungard Hansen <[email protected]>

* install-binaries: bump jq & fix jq on arm macs

Bump jq to latest release 1.7.1. This should fix jq install on arm64
macs, as the previous 1.6 release was not compiled for this target and
the link would result in a 404.

Signed-off-by: Jacob Baungard Hansen <[email protected]>

* install-binaries: don't call directly if sourced

If install-binaries was sourced from another script, and that script was
called with cmd line args, an error would be returned such as the below:

```
install-binaries.sh: line 123: -n: command not found
```

This due to the `$*` at the end of the install-binaries script.

With this commit we check if the file has been sourced or not, and only
makes the call to `$*` if the install-binaries script is called
directly.

Signed-off-by: Jacob Baungard Hansen <[email protected]>

---------

Signed-off-by: Jacob Baungard Hansen <[email protected]>
* retry status update on conflict

Signed-off-by: Thibault Mange <[email protected]>

* add maxConditions handling

Signed-off-by: Thibault Mange <[email protected]>

* return err

Signed-off-by: Thibault Mange <[email protected]>

* sort status condition

Signed-off-by: Thibault Mange <[email protected]>

---------

Signed-off-by: Thibault Mange <[email protected]>
* Tests: Don't check remote_write_requests on spokes

Previously the test "Should have acm_remote_write_requests_total
metrics with correct labels/value" looked for the metric for both the
hub cluster and any spokes. However, the spokes isn't supposed to expose
this metric, hence the test would always fail if any managed clusters
were added to the test setup.

Signed-off-by: Jacob Baungard Hansen <[email protected]>

* Examples: remove `cleanupInterval`

This configuration option no longer exists, so removing from example
files used in tests. Avoids the following warnings:

```
W0513 07:54:14.364463   11051 warnings.go:70] unknown field "spec.advanced.retentionConfig.cleanupInterval"
```

Signed-off-by: Jacob Baungard Hansen <[email protected]>

---------

Signed-off-by: Jacob Baungard Hansen <[email protected]>
* apply security restrictions

Signed-off-by: Thibault Mange <[email protected]>

* add privileged

Signed-off-by: Thibault Mange <[email protected]>

---------

Signed-off-by: Thibault Mange <[email protected]>
We add this metric to the allowlist as it will be used to optimize
dashboard performance for the fleet wide CPU widgets.

Signed-off-by: Jacob Baungard Hansen <[email protected]>
…stolostron#1442)

* Add support for alertmanager path prefix and validate URL

Signed-off-by: Douglas Camata <[email protected]>

* Force a protocol on the AM url

Signed-off-by: Douglas Camata <[email protected]>

* Reconcile placement controller when custom AM url changes in the mco

Signed-off-by: Douglas Camata <[email protected]>

* Fix mco predicate issue with nil advanced config

Signed-off-by: Douglas Camata <[email protected]>

* Only accept https for custom alertmanager url

Signed-off-by: Douglas Camata <[email protected]>

---------

Signed-off-by: Douglas Camata <[email protected]>
* change test ContainManagedClusterMetric

Signed-off-by: Thibault Mange <[email protected]>

* improve e2e test logging

Signed-off-by: Thibault Mange <[email protected]>

* add kube_debug file

Signed-off-by: Thibault Mange <[email protected]>

* remove unused functions

Signed-off-by: Thibault Mange <[email protected]>

* reduce log lines

Signed-off-by: Thibault Mange <[email protected]>

* log statefulsets and daemonsets

Signed-off-by: Thibault Mange <[email protected]>

* add copyright

Signed-off-by: Thibault Mange <[email protected]>

* fix pods list, add cm and secrets list

Signed-off-by: Thibault Mange <[email protected]>

---------

Signed-off-by: Thibault Mange <[email protected]>
jacobbaungard and others added 6 commits May 23, 2024 11:53
When the memory limit of MCO was bumped from 1Gi to 2Gi in
c3e1e94 it was updated only in the
generated manifests, and not source used to generate said manifests. As
a result, when we ran `make bundle` the limit was reverted. This was
done in e97c825 (note this was later
partially reverted in d742778 but the
memory limit was not). The end result was the limit being lowered from
2Gi to 1Gi again causing problems on systems with a large number of
managed clusters.

In this commit we raise the limit to 3Gi. 3Gi might seem like a high
limit, however we occasionally saw OOMs with a 2Gi limit on large
perfscale tests. Further, since 2.9 we no longer have any sensible ways
to increase these limits on systems that need it, so it's better to have
a large margin on the limit for now.

Signed-off-by: Jacob Baungard Hansen <[email protected]>
* Expose new promql-engine opt-out option in CR

Signed-off-by: Saswata Mukherjee <[email protected]>

* make bundle

Signed-off-by: Saswata Mukherjee <[email protected]>

---------

Signed-off-by: Saswata Mukherjee <[email protected]>
* Do not add custom obs api url to certs SAN

Signed-off-by: Douglas Camata <[email protected]>

* Fix unit tests

Signed-off-by: Douglas Camata <[email protected]>

* Avoid collision with package name

Signed-off-by: Douglas Camata <[email protected]>

* Rename URL.Host to HostPath

Signed-off-by: Douglas Camata <[email protected]>

---------

Signed-off-by: Douglas Camata <[email protected]>
* add github workflow

Signed-off-by: Thibault Mange <[email protected]>

* add build step

Signed-off-by: Thibault Mange <[email protected]>

---------

Signed-off-by: Thibault Mange <[email protected]>
Signed-off-by: Coleen Iona Quadros <[email protected]>
Signed-off-by: Coleen Iona Quadros <[email protected]>
Signed-off-by: Coleen Iona Quadros <[email protected]>
@coleenquadros coleenquadros marked this pull request as ready for review May 23, 2024 11:31
@coleenquadros coleenquadros marked this pull request as draft May 23, 2024 11:36
Signed-off-by: Coleen Iona Quadros <[email protected]>
@coleenquadros coleenquadros force-pushed the update_e2e_with_managed branch from aedd94e to b445cd2 Compare May 23, 2024 11:38
@coleenquadros coleenquadros marked this pull request as ready for review May 23, 2024 12:31
Signed-off-by: Coleen Iona Quadros <[email protected]>
Copy link

openshift-ci bot commented May 23, 2024

Thanks for your pull request. Before we can look at it, you'll need to add a 'DCO signoff' to your commits.

📝 Please follow instructions in the contributing guide to update your commits with the DCO

Full details of the Developer Certificate of Origin can be found at developercertificate.org.

The list of commits missing DCO signoff:

  • 0fdcef9 Merge branch 'main' into update_e2e_with_managed

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

Copy link

Quality Gate Passed Quality Gate passed

Issues
0 New issues
0 Accepted issues

Measures
0 Security Hotspots
No data about Coverage
0.0% Duplication on New Code

See analysis details on SonarCloud

Copy link

openshift-ci bot commented May 23, 2024

@coleenquadros: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
ci/prow/e2e-kind bf28bc3 link true /test e2e-kind

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@openshift-merge-robot
Copy link
Collaborator

PR needs rebase.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@philipgough
Copy link
Contributor

@coleenquadros is this still needed?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants