Unify the Elastic Agent and the Horde agent emulator used in scale testing #2169

Open
cmacknz opened this issue Jan 24, 2023 · 6 comments
Labels
Team:Elastic-Agent (label for the Agent team), Team:Elastic-Agent-Control-Plane (label for the Agent Control Plane team)

Comments

@cmacknz
Member

cmacknz commented Jan 24, 2023

Horde is our internal framework used for scale testing the agent and Fleet. It currently implements a much more lightweight emulated agent. This allows us to spin up many thousands of agents on a single host, but has the major drawback of not actually testing the agent itself.

We should work towards making it possible to share code between the agent itself and Horde. This could be done by extracting internal packages of the agent and reusing them in Horde, but it is more appealing to produce a distribution of the agent specifically meant for scale testing, to ensure the code is always in sync and used in the same way. We are most interested in directly testing the Fleet gateway code, action handling, and upgrades.

The initial idea for this would be to build a lightweight variant of the Elastic Agent that does not need to be installed to run (does not depend on any internal directory structure), can be started multiple times on the same host, and is not actually capable of starting and managing subprocesses. Possibly we produce a single binary that can automatically launch many instances of this simple agent as goroutines.

@joshdover
Contributor

elastic/fleet-server#2519 (comment) is another case where this is biting us. We cannot reasonably rely on Horde to emulate the Agent successfully while we maintain two implementations of the Fleet Server client code. Any small difference in behavior has quality and scalability implications, as we have now seen several times.

We are going to be relying on the scaling suite to verify agent scale as part of every release to Serverless, making this even more critical.

@jlind23 @amitkanfer I'd like to consider putting this in sprint 13.

@amitkanfer
Contributor

Fine by me.

@cmacknz
Member Author

cmacknz commented May 8, 2023

I'm setting @pchila as the preliminary assignee here; we just spoke about this one. Paolo has recently been working with the code that needs to be shared, and he has some ideas about how the changes needed here could improve the testability of the agent in general.

@juliaElastic
Contributor

@pierrehilbert @cmacknz Is this going to be a priority anytime soon?

@pierrehilbert
Contributor

Sorry, I missed this ping.
This is still among our technical priorities, but it won't be addressed soon.

@pierrehilbert added the Team:Elastic-Agent-Control-Plane label on Jun 3, 2024
@elasticmachine
Contributor

Pinging @elastic/elastic-agent-control-plane (Team:Elastic-Agent-Control-Plane)


7 participants