APM Server-Testing

Automated Testing

The tests are built on top of the Beats Test Framework, which provides a detailed description of how to run the test suite.

Quick Overview

To run the unit tests, you can use make test or simply go test ./.... The unit tests do not require any external services.

The APM Server "system tests" run the APM Server in various scenarios, with the Elastic Stack running inside Docker containers. To run the system tests locally, you can run go test inside the systemtest directory.

Snapshot-Testing

Some tests make use of the concept of snapshot or approvals testing. If running tests leads to changed snapshots, you can use the approvals tool to update the snapshots. The intended workflow is:

  • Run make update to create the approvals binary that supports reviewing changes.
  • Run make test, which will create a *.received.json file for every newly created or changed snapshot.
  • Run make check-approvals to review and interactively accept the changes.

Micro Benchmarking

To run simple benchmark tests, run:

make bench

A good way to present your results is by using benchcmp. With your changes in the current working tree, do:

$ go get -u golang.org/x/tools/cmd/benchcmp
$ make bench > new.txt
$ git checkout main
$ make bench > old.txt
$ benchcmp old.txt new.txt
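For readers less familiar with Go benchmarks, the following is a minimal sketch of the kind of function whose timings end up in old.txt and new.txt. The `joinFields` function and `benchJoinFields` benchmark are hypothetical and are not part of the APM Server codebase. In the repository, benchmarks live in `_test.go` files and are run through make bench; here `testing.Benchmark` is used only so the example is a self-contained program.

```go
package main

import (
	"fmt"
	"strings"
	"testing"
)

// joinFields is a hypothetical function under test.
func joinFields(fields []string) string {
	return strings.Join(fields, ",")
}

// benchJoinFields has the same shape as a BenchmarkXxx function in a
// _test.go file: it runs the code under test b.N times.
func benchJoinFields(b *testing.B) {
	fields := []string{"service.name", "transaction.id", "trace.id"}
	b.ReportAllocs()
	for i := 0; i < b.N; i++ {
		_ = joinFields(fields)
	}
}

func main() {
	// testing.Benchmark chooses b.N automatically and returns the timings.
	res := testing.Benchmark(benchJoinFields)
	fmt.Println("BenchmarkJoinFields", res.String(), res.MemString())
}
```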

Macro Benchmarking

The macro benchmarking focuses on measuring the APM Server's performance (throughput) and how changes in the codebase impact that performance.

Our legacy benchmarking leverages Hey APM to run daily benchmarks that aim to measure the APM Server's overall throughput, covering a variety of cases. All of the data is generated with the APM Go agent at the same time the benchmark is executed, which limits the complexity and variety of data that is generated for the benchmark scenarios. The results of these benchmarks are then indexed into Elasticsearch, and weekly reports that compare the current results against the last week, month and 3 months are reported in Slack.

The new benchmarking framework, based on apmbench, uses pre-recorded APM Agent events for the benchmarks. This allows us to generate richer events which can be used to assess the Server's throughput. APM Integration Testing is used to generate the events, and intake-receiver captures the events that are sent to the intake API. apmbench also generates additional metrics compared to Hey APM, allowing for a better understanding of where any potential bottlenecks may be, and how APM Server consumes the available resources.

TODO(marclop): convert the dot diagrams from dot to mermaid so they can be read in Markdown documents

The applications that are used to generate the stored traces may not always come from apm-integration-testing. Instead, we may want to write specific applications that generate a specific type of event, rather than re-use the existing opbeans applications.

Re-generate captured events

The events are currently committed in the apm-server repository (apm-server/systemtest/benchtest/events). This may change in the near future, and instead, we'll download the stored traces on-demand and upload/update them periodically.

# Navigate to your local copy of 'elastic/apm-integration-testing'.
$ SLEEP=180 STACK_VERSION=8.1.2 RPM=5000; ./scripts/compose.py start $STACK_VERSION --opbeans-go-loadgen-rpm ${RPM} --opbeans-python-loadgen-rpm ${RPM} --opbeans-node-loadgen-rpm $((${RPM} * 2)) --opbeans-ruby-loadgen-rpm ${RPM} --with-opbeans-go --no-apm-server-self-instrument --with-opbeans-python --with-opbeans-ruby --with-opbeans-node --apm-server-record --loadgen-no-ws && sleep $SLEEP && make copy-events; docker-compose down
...
# Copy the generated traces to the location where `apmbench` expects them to be (`apm-server/systemtest/benchtest/events`).
# Assuming that the `apm-server` repository has been checked out at the same level as `apm-integration-testing`.
$ cp -r events ../apm-server/systemtest/benchtest/events

Running apmbench

apmbench is located in systemtest/cmd/apmbench and can target any APM Server that has apm-server.expvar.enabled set to true, which it needs in order to calculate basic throughput measurements. If any of the -blockprofile, -cpuprofile, -memprofile or -mutexprofile flags are set, apm-server.pprof.enabled should also be set to true.

The default behavior of apmbench is to send the captured events to the target APM Server as fast as possible with the configured number of -agents. The -agents flag determines how many concurrent goroutines will be used to send the events to the APM Server in parallel. The -max-rate flag can be used to specify a rate of events to send to the APM Server, as eps or epm, instead of the default behavior. To benchmark the APM Server in a setup similar to what we'd see in production, the number of agents should be high (>500).
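To make the role of -agents concrete, here is a minimal sketch of fanning events out over a number of concurrent goroutines. It is not the apmbench implementation: the `replay` function, the `send` callback and the choice to have every agent replay the full event list are assumptions made for the illustration.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// replay starts one goroutine per agent. In this sketch each agent replays
// every event through send, and the total number of events sent is returned.
func replay(agents int, events []string, send func(agent int, event string)) int64 {
	var sent int64
	var wg sync.WaitGroup
	for a := 0; a < agents; a++ {
		wg.Add(1)
		go func(id int) {
			defer wg.Done()
			for _, ev := range events {
				send(id, ev)
				atomic.AddInt64(&sent, 1)
			}
		}(a)
	}
	wg.Wait()
	return sent
}

func main() {
	events := []string{"transaction", "span", "error"}
	// A real sender would POST to the APM Server intake API; this one is a no-op.
	total := replay(500, events, func(agent int, event string) {})
	fmt.Println("events sent:", total)
}
```

With no rate limit, throughput in a setup like this is bounded only by how fast the goroutines and the server can go, which is why a high agent count is closer to production-like load.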

By default, apmbench will warm up the APM Server by sending N events to the APM Server before any of the benchmark scenarios are run. That N can be configured via -warmup-events and defaults to a conservative number.

The default -benchtime is 1s which, for our purposes, isn't a great default. If you're benchmarking changes to the APM Server, you'll want to set the duration to at least 30s to get some quick feedback. Our periodic benchmarks should aim to run for longer to allow any long-queue effects to be detected.

The rest of the flags configure apmbench so it can target an APM Server. These can be set via the flags themselves, or via their ELASTIC_APM_<UPPERCASE FLAG NAME> alternative. For example, to configure the server URL, set ELASTIC_APM_SERVER_URL to the full URL of the APM Server you'd like to benchmark.

Soak testing

Soak testing involves testing apm-server against a continuous, sustained workload to identify performance and stability issues that occur over an extended period. The apmsoak command can be used to generate a sustained and continuous load for the purpose of soak testing:

$ cd systemtest/cmd/apmsoak
$ go run main.go -h
Usage of /var/folders/k9/z1yw8fsn0sjbl5yy7z2rsdpr0000gn/T/go-build4164012609/b001/exe/main:
  -agents-replicas int
    	Number of agents replicas to use, each replica launches 4 agents, one for each type (default 1)
  -max-rate value
    	Max event rate as epm or eps with burst size=max(1000, 2*eps), <= 0 values evaluate to Inf (default 0epm)
  -secret-token string
    	secret token for APM Server
  -secure
    	validate the remote server TLS certificates
  -server value
    	apm-server URL (default http://localhost:8200)

Manual testing

We often need to manually test the integration between different features, whether for PR testing or pre-release testing. Our docker-compose.yml contains the basic components that make up the Elastic Stack for the APM Server.

Testing Stack monitoring

APM Server publishes a set of metrics that are either consumed by Metricbeat or sent by the APM Server to an Elasticsearch cluster. Some of these metrics are used to power the Stack Monitoring UI. The stack monitoring setup is non-trivial and has been automated in testing/stack-monitoring.sh. The script launches the necessary stack components and modifies the necessary files. Once it has finished, you'll be able to verify that Stack Monitoring is working as expected.

Note that the testing/stack-monitoring.sh script relies on systemtest/cmd/runapm, and will use a locally built version of APM Server (see more information below).

Running an Elastic Agent container with a locally built APM Server

APM Server can be run either standalone or in managed mode by the Elastic Agent. To facilitate manual testing of APM Server in managed mode, it is possible to inject a locally built apm-server binary via systemtest/cmd/runapm. runapm requires the apm-server docker-compose project to be running. It creates the required Fleet policies and exposes the APM Server port using a random binding that is printed to the standard output after the container has started.

$ cd systemtest/cmd/runapm
$ go run main.go -h
Usage of /var/folders/35/r4w8sbqj2md1sg944kpnzyth0000gn/T/go-build3644709196/b001/exe/main:
  -arch string
    	The architecture to use for the APM Server and Docker Image (default runtime.GOARCH)
  -d    If true, runapm will exit after the agent container has been started
  -f    Force agent policy creation, deleting existing policy if found
  -keep
        If true, agent policy and agent will not be destroyed on exit
  -name string
        Docker container name to use, defaults to random
  -namespace string
        Agent policy namespace (default "default")
  -policy string
        Agent policy name (default "runapm")
  -reinstall
        Reinstall APM integration package (default true)
  -var value
        Define a package var (k=v), with values being YAML-encoded; can be specified more than once

Building an Elastic Agent container image with a locally built APM Server

It's possible to run runapm (as described above) and re-use the image that it builds in docker-compose files or in ECE / ESS. However, systemtest/cmd/buildapm makes it possible to build only the Docker image, without requiring the docker-compose project containers to be up and running.

buildapm reads the docker-compose.yml at the root of the repository and uses that information to build an Elastic Agent docker image with an APM Server bundled that contains any local changes you might have made.

By default, the amd64 architecture (or platform in Docker lingo) will be used. This may not be ideal if your machine has a different architecture, in which case you can specify it with the -arch flag.

Additionally, if -cloud is set, the Elastic Agent cloud image will be used as the base image, so changes can be packaged and tested in ESS / ECE (see our internal documentation for how to use them).

$ cd systemtest/cmd/buildapm
$ go run main.go -arch arm64
2022/05/05 17:50:18 Building elastic-agent-systemtest:8.3.0-e4aa1f83-SNAPSHOT (arm64) from docker.elastic.co/beats/elastic-agent:8.3.0-e4aa1f83-SNAPSHOT...
2022/05/05 17:50:18 Building apm-server...
2022/05/05 17:50:18 Built /Users/marclop/repos/elastic/apm-server/build/apm-server-linux
2022/05/05 17:50:25 Built elastic-agent-systemtest:8.3.0-e4aa1f83-SNAPSHOT (arm64)
$ go run main.go -arch amd64
2022/05/05 17:50:35 Building elastic-agent-systemtest:8.3.0-e4aa1f83-SNAPSHOT (amd64) from docker.elastic.co/beats/elastic-agent:8.3.0-e4aa1f83-SNAPSHOT...
2022/05/05 17:50:35 Building apm-server...
2022/05/05 17:50:43 Built /Users/marclop/repos/elastic/apm-server/build/apm-server-linux
2022/05/05 17:50:49 Built elastic-agent-systemtest:8.3.0-e4aa1f83-SNAPSHOT (amd64)
$ go run main.go -cloud
2022/05/19 11:08:04 Building image elastic-agent-systemtest:8.3.0-e4aa1f83-SNAPSHOT (amd64) from docker.elastic.co/cloud-release/elastic-agent-cloud:8.3.0-e4aa1f83-SNAPSHOT...
2022/05/19 11:08:04 Building apm-server...
2022/05/19 11:08:04 Built /Users/marclop/repos/elastic/apm-server/build/apm-server-linux
2022/05/19 11:09:07 Built image elastic-agent-systemtest:8.3.0-e4aa1f83-SNAPSHOT (amd64)