Fix index pattern when querying ES and condition when searching logs #3765

Merged (12 commits) on Nov 28, 2023

@belimawr belimawr commented Nov 14, 2023

What does this PR do?

Two issues were causing flakiness in the integration tests, and this PR fixes both:

  1. The pattern used to query ES did not work on serverless. This commit replaces it with a pattern that works on both stateful and serverless and that is more specific to the indexes/data streams we want to query.
  2. findESDocs did not wait for the data to be indexed, only for a successful query on ES. In some cases the documents the test wanted were not indexed yet, so the query returned 0 documents and the test failed with no error in the logs/diagnostics. This is fixed by waiting until the query returns no error and a document count > 0.

Why is it important?

It fixes flaky integration tests.

Checklist

  • My code follows the style guidelines of this project
  • I have commented my code, particularly in hard-to-understand areas
    - [ ] I have made corresponding changes to the documentation
    - [ ] I have made corresponding change to the default configuration files
    - [ ] I have added tests that prove my fix is effective or that my feature works
    - [ ] I have added an entry in ./changelog/fragments using the changelog tool
  • I have added an integration test or an E2E test

## Author's Checklist

How to test this PR locally

Run the integration test that was failing, against both serverless and stateful stacks:

STACK_PROVISIONER=serverless TEST_PLATFORMS="linux/amd64" SNAPSHOT=true mage -v integration:single TestLogIngestionFleetManaged
mage integration:clean
TEST_PLATFORMS="linux/amd64" SNAPSHOT=true mage -v integration:single TestLogIngestionFleetManaged

The deployments must be cleaned between the two runs because, at the moment, the integration test framework does not keep track of the type of stack it deployed. More information: #3756

Related issues

## Use cases
## Screenshots
## Logs

Questions to ask yourself

  • How are we going to support this in production?
  • How are we going to measure its adoption?
  • How are we going to debug this?
  • What are the metrics I should take care of?
  • ...

@belimawr belimawr added Team:Elastic-Agent Label for the Agent team skip-changelog labels Nov 14, 2023
@belimawr belimawr requested a review from a team as a code owner November 14, 2023 12:26
@elasticmachine

Pinging @elastic/elastic-agent (Team:Elastic-Agent)

@belimawr belimawr added backport-v8.11.0 Automated backport with mergify and removed backport-v8.11.0 Automated backport with mergify labels Nov 14, 2023

mergify bot commented Nov 14, 2023

This pull request does not have a backport label. Could you fix it @belimawr? 🙏
To fixup this pull request, you need to add the backport labels for the needed
branches, such as:

  • backport-v./d./d./d is the label to automatically backport to the 8./d branch. /d is the digit

NOTE: backport-skip has been added to this pull request.


elasticmachine commented Nov 14, 2023

💚 Build Succeeded


Build stats

  • Start Time: 2023-11-20T09:04:05.333+0000

  • Duration: 14 min 44 sec

❕ Flaky test report

No test was executed to be analysed.

🤖 GitHub comments


To re-run your PR in the CI, just comment with:

  • /test : Re-trigger the build.

  • /package : Generate the packages.

  • run integration tests : Run the Elastic Agent Integration tests.

  • run end-to-end tests : Generate the packages and run the E2E Tests.

  • run elasticsearch-ci/docs : Re-trigger the docs validation. (use unformatted text in the comment!)

@belimawr belimawr force-pushed the fix-integration-tests branch 2 times, most recently from bb34c4d to 94b2ffe Compare November 14, 2023 16:57
@belimawr belimawr force-pushed the fix-integration-tests branch from 34790b7 to 621b857 Compare November 15, 2023 09:24
@belimawr

The failure seems to be caused by: #3664

@belimawr

buildkite test this

@belimawr belimawr force-pushed the fix-integration-tests branch from eb9d8f3 to dbb399d Compare November 20, 2023 09:03
@belimawr belimawr force-pushed the fix-integration-tests branch 2 times, most recently from 4603baf to 31b8184 Compare November 23, 2023 14:03
The pattern used to query ES was not working on serverless. This
commit updates it to a pattern that works on both stateful and
serverless.
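
The exact pattern this commit switched to is not shown in this conversation. As a hedged illustration only, the sketch below builds a pattern from the Elastic data stream naming scheme `<type>-<dataset>-<namespace>`, which targets data streams by name instead of their backing indices. The helper `dataStreamPattern` is hypothetical and not part of this PR.

```go
package main

import (
	"fmt"
	"strings"
)

// dataStreamPattern builds an index pattern that follows the data stream
// naming scheme <type>-<dataset>-<namespace>. Any empty part is replaced
// by a wildcard. Hypothetical helper, for illustration only.
func dataStreamPattern(dsType, dataset, namespace string) string {
	parts := []string{dsType, dataset, namespace}
	for i, p := range parts {
		if p == "" {
			parts[i] = "*"
		}
	}
	return strings.Join(parts, "-")
}

func main() {
	// All logs data streams in the "default" namespace.
	fmt.Println(dataStreamPattern("logs", "", "default"))
	// One dataset across every namespace.
	fmt.Println(dataStreamPattern("logs", "elastic_agent", ""))
}
```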
@belimawr belimawr force-pushed the fix-integration-tests branch 2 times, most recently from 787f635 to c1e4f96 Compare November 27, 2023 14:20
findESDocs now ensures the function returns at least one document.

Debug logs are also removed.
@belimawr belimawr force-pushed the fix-integration-tests branch from c1e4f96 to ff4edf2 Compare November 27, 2023 14:37

SonarQube Quality Gate

Quality Gate passed

Bug A 0 Bugs
Vulnerability A 0 Vulnerabilities
Security Hotspot A 0 Security Hotspots
Code Smell A 0 Code Smells

No Coverage information
0.0% Duplication

@belimawr belimawr changed the title Fix index pattern when querying ES Fix index pattern when querying ES and condition when searching logs Nov 27, 2023
@belimawr

Folks, this PR is finally (well, hopefully) ready for a final review. I found another source of flakiness and fixed it in my last commit ff4edf2.


@blakerouse blakerouse left a comment


Looks good!

@belimawr belimawr merged commit a03aa9c into elastic:main Nov 28, 2023
9 checks passed
@belimawr belimawr deleted the fix-integration-tests branch November 28, 2023 09:03
@belimawr belimawr added the backport-v8.11.0 Automated backport with mergify label Nov 28, 2023
mergify bot pushed a commit that referenced this pull request Nov 28, 2023
…3765)

Two issues were causing flakiness in the integration tests, and this PR fixes both:
1. The pattern used to query ES did not work on serverless. This commit replaces it with a pattern that works on both stateful and serverless and that is more specific to the indexes/data streams we want to query.
2. `findESDocs` did not wait for the data to be indexed, only for a successful query on ES. In some cases the documents the test wanted were not indexed yet, so the query returned 0 documents and the test failed with no error in the logs/diagnostics. This is fixed by waiting until the query returns no error and a document count > 0.

(cherry picked from commit a03aa9c)

# Conflicts:
#	pkg/testing/tools/estools/elasticsearch.go
#	testing/integration/logs_ingestion_test.go
belimawr added a commit that referenced this pull request Nov 28, 2023
belimawr added a commit that referenced this pull request Nov 30, 2023
…on when searching logs (#3829)

Two issues were causing flakiness in the integration tests, and this PR fixes both:
1. The pattern used to query ES did not work on serverless. This commit replaces it with a pattern that works on both stateful and serverless and that is more specific to the indexes/data streams we want to query.
2. `findESDocs` did not wait for the data to be indexed, only for a successful query on ES. In some cases the documents the test wanted were not indexed yet, so the query returned 0 documents and the test failed with no error in the logs/diagnostics. This is fixed by waiting until the query returns no error and a document count > 0.

(cherry picked from commit a03aa9c)

# Conflicts:
#	pkg/testing/tools/estools/elasticsearch.go
#	testing/integration/logs_ingestion_test.go

The `add_cloud_metadata` processor and some other `add_*_metadata`
processors are expected to log errors when they cannot fetch the
information they need. Their error logs show up in pretty much any
deployment. Some examples:
 - When Docker is not installed/running, `add_docker_metadata` logs
 some errors.
 - When the Elastic-Agent is deployed in a non-cloud VM,
 `add_cloud_metadata` logs some errors.

This commit removes all those processors from the queries for log
errors, as their errors are expected.

Add a 2h timeout for `go test` when run on remote hosts.

---------

Co-authored-by: Tiago Queiroz <[email protected]>
Labels
backport-skip backport-v8.11.0 Automated backport with mergify skip-changelog Team:Elastic-Agent Label for the Agent team
Projects
None yet
Development

Successfully merging this pull request may close these issues.

TestLogIngestionFleetManaged/Monitoring_logs_are_shipped is flaky
4 participants