Add virt capacity benchmark test #180

Open
wants to merge 1 commit into
base: main
73 changes: 70 additions & 3 deletions README.md
@@ -5,7 +5,7 @@ This plugin is a very opinionated OpenShift wrapper designed to simplify the exe
Executed with `kube-burner-ocp`, it looks like:

```console
$ kube-burner-ocp help
$ kube-burner-ocp --help
kube-burner plugin designed to be used with OpenShift clusters as a quick way to run well-known workloads

Usage:
@@ -29,6 +29,8 @@ Available Commands:
pvc-density Runs pvc-density workload
udn-density-l3-pods Runs udn-density-l3-pods workload
version Print the version number of kube-burner
virt-capacity-benchmark Runs capacity-benchmark workload
virt-density Runs virt-density workload
web-burner-cluster-density Runs web-burner-cluster-density workload
web-burner-init Runs web-burner-init workload
web-burner-node-density Runs web-burner-node-density workload
@@ -86,7 +88,7 @@ kube-burner-ocp cluster-density-v2 --iterations=1 --churn-duration=2m0s --churn-
### metrics-endpoints.yaml

```yaml
- endpoint: prometheus-k8s-openshift-monitoring.apps.rook.devshift.org
- endpoint: prometheus-k8s-openshift-monitoring.apps.rook.devshift.org
metrics:
- metrics.yml
alerts:
@@ -97,7 +99,7 @@ kube-burner-ocp cluster-density-v2 --iterations=1 --churn-duration=2m0s --churn-
defaultIndex: {{.ES_INDEX}}
type: opensearch
- endpoint: https://prometheus-k8s-openshift-monitoring.apps.rook.devshift.org
token: {{ .TOKEN }}
token: {{ .TOKEN }}
metrics:
- metrics.yml
indexer:
@@ -387,6 +389,71 @@ Input parameters specific to the workload:
| dpdk-cores | Number of cores assigned for each DPDK pod (should fill all the isolated cores of one NUMA node) | 2 |
| performance-profile | Name of the performance profile implemented on the cluster | default |


## Virt Workloads

This workload family focuses on virtualization, creating different objects across the cluster.

The different variants are:
- [virt-density](#virt-density)
- [virt-capacity-benchmark](#virt-capacity-benchmark)

### Virt Density

### Virt Capacity Benchmark

Test the capacity of Virtual Machines and Volumes supported by the cluster and a specific storage class.

#### Environment Requirements

To verify that each `VirtualMachine` has finished booting and that the volume resize has propagated successfully, the test uses `virtctl ssh`.
Therefore, `virtctl` must be installed and available in the `PATH`.
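
A wrapper script could check this prerequisite before starting the test. The following is a minimal sketch; `require_tool` is a hypothetical helper name used for illustration and is not part of kube-burner-ocp.

```shell
# Hypothetical helper (not part of kube-burner-ocp):
# succeed only if the given executable can be found in PATH.
require_tool() {
    local tool=$1
    if ! command -v "${tool}" >/dev/null 2>&1; then
        echo "${tool} not found in PATH" >&2
        return 1
    fi
}
```

For example, calling `require_tool virtctl || exit 1` at the top of a script would abort early when `virtctl` is missing.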

See [Temporary SSH Keys](#temporary-ssh-keys) for details on the SSH keys used by the test.

#### Test Sequence

The test runs a workload in a loop without deleting previously created resources. By default, it continues until a failure occurs.
Each iteration consists of the following steps:
- Create VMs
- Resize the root and data volumes
- Restart the VMs
- Snapshot the VMs
- Migrate the VMs
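
The control flow of this loop can be sketched as follows. This is only an illustration of the stop-on-failure behavior, not the actual implementation: `run_iterations`, the step names, and the callback argument are all hypothetical, and the limit of 0 meaning "no limit" is taken from the `--max-iterations` description.

```shell
# Hypothetical sketch of the iteration loop.
#   $1: maximum number of iterations, 0 for no limit
#   $2: name of a command that runs one step, called as: <cmd> <step> <iteration>
run_iterations() {
    local max_iterations=$1 step_cmd=$2 iteration=0 step
    while [ "${max_iterations}" -eq 0 ] || [ "${iteration}" -lt "${max_iterations}" ]; do
        iteration=$((iteration + 1))
        # Resources from earlier iterations are never deleted, so load accumulates.
        for step in create_vms resize_volumes restart_vms snapshot_vms migrate_vms; do
            if ! "${step_cmd}" "${step}" "${iteration}"; then
                echo "iteration ${iteration}: ${step} failed" >&2
                return 1
            fi
        done
    done
    echo "completed ${iteration} iterations"
}
```

With no limit set, the loop only ends when a step fails, which is the point at which the cluster's capacity has been reached.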

#### Tested StorageClass

By default, the test selects the `StorageClass` as follows:

1. Use the default `StorageClass` for virtualization, annotated with `storageclass.kubevirt.io/is-default-virt-class`
2. If none exists, use the general default `StorageClass`, annotated with `storageclass.kubernetes.io/is-default-class`
3. If neither exists, fail the test before starting

To use a different `StorageClass`, pass its name with `--storage-class`.
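
The default selection order can be sketched in shell. This is an illustration under assumptions, not the tool's actual code: `pick_storage_class` is a hypothetical function that reads one line per `StorageClass` from standard input, in the form `NAME VIRT_DEFAULT K8S_DEFAULT`, where the last two columns hold the values of the two annotations.

```shell
# Hypothetical sketch: choose a StorageClass from lines of
#   <name> <is-default-virt-class value> <is-default-class value>
# read from stdin. Prints the chosen name, or fails if there is no default.
pick_storage_class() {
    local name virt k8s fallback=""
    while read -r name virt k8s; do
        # A virtualization default always wins.
        if [ "${virt}" = "true" ]; then
            echo "${name}"
            return 0
        fi
        # Remember the first general default in case no virt default exists.
        if [ "${k8s}" = "true" ] && [ -z "${fallback}" ]; then
            fallback="${name}"
        fi
    done
    if [ -n "${fallback}" ]; then
        echo "${fallback}"
        return 0
    fi
    return 1
}
```

The input could be produced with a command along the lines of `kubectl get storageclass --no-headers -o custom-columns='NAME:.metadata.name,VIRT:.metadata.annotations.storageclass\.kubevirt\.io/is-default-virt-class,K8S:.metadata.annotations.storageclass\.kubernetes\.io/is-default-class'`, where a missing annotation is printed as `<none>`.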

Regardless of which `StorageClass` is used, it must:
- Support volume expansion: `allowVolumeExpansion: true`
- Have a corresponding `VolumeSnapshotClass` that uses the same provisioner
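
Both requirements can be checked up front. The sketch below assumes the relevant values have already been read from the cluster; `storage_class_is_usable` is a hypothetical helper that takes the `allowVolumeExpansion` value and the provisioner as arguments, and reads the `driver` of each `VolumeSnapshotClass` from standard input, one per line.

```shell
# Hypothetical sketch: validate a StorageClass against the two requirements.
#   $1: value of allowVolumeExpansion ("true" or anything else)
#   $2: provisioner of the StorageClass
#   stdin: one VolumeSnapshotClass driver per line
storage_class_is_usable() {
    local allow_expansion=$1 provisioner=$2 driver
    # Requirement 1: volume expansion must be enabled.
    [ "${allow_expansion}" = "true" ] || return 1
    # Requirement 2: some VolumeSnapshotClass must use the same provisioner.
    while read -r driver; do
        [ "${driver}" = "${provisioner}" ] && return 0
    done
    return 1
}
```

The driver list could come from `kubectl get volumesnapshotclass -o jsonpath='{range .items[*]}{.driver}{"\n"}{end}'`, and the two arguments from the `allowVolumeExpansion` and `provisioner` fields of the `StorageClass`.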

#### Test Namespace

All `VirtualMachines` are created in the same namespace.

By default, the namespace is `virt-capacity-benchmark`. To use a different one, pass `--namespace` (or `-n`).

#### Test Size Parameters

Users may control the workload sizes by passing the following arguments:
- `--max-iterations` - Maximum number of iterations, or 0 (default) for no limit. In either case, the test stops on the first failure
- `--vms` - Number of VMs for each iteration (default 5)
- `--data-volume-count` - Number of data volumes for each VM (default 9)
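
Because resources are never deleted between iterations, these parameters determine how quickly the cluster fills up. The following back-of-the-envelope sketch assumes each VM gets one root volume plus `--data-volume-count` data volumes, and does not count snapshots; `resources_after` is a hypothetical helper, with the documented defaults of 5 VMs and 9 data volumes.

```shell
# Hypothetical helper: estimate accumulated resources after N iterations.
#   $1: number of completed iterations
#   $2: VMs per iteration (default 5)
#   $3: data volumes per VM (default 9)
resources_after() {
    local iterations=$1 vms=${2:-5} data_volumes=${3:-9}
    local total_vms=$(( iterations * vms ))
    # Each VM has one root volume in addition to its data volumes.
    local total_volumes=$(( total_vms * (1 + data_volumes) ))
    echo "${total_vms} VMs, ${total_volumes} volumes"
}

resources_after 3   # prints "15 VMs, 150 volumes"
```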

#### Temporary SSH Keys

The test generates the SSH keys automatically.
By default, it stores the key pair in a temporary directory.
Users may choose to store the keys in a specific directory by setting `--ssh-key-path`.

## Custom Workload: Bring your own workload

To kickstart kube-burner-ocp with a custom workload, `init` becomes your go-to command. This command is equipped with flags that enable you to seamlessly integrate and run your personalized workloads. Here's a breakdown of the flags accepted by the init command:
108 changes: 108 additions & 0 deletions cmd/config/virt-capacity-benchmark/check.sh
@@ -0,0 +1,108 @@
#!/usr/bin/env bash
COMMAND=$1
LABEL_KEY=$2
LABEL_VALUE=$3
NAMESPACE=$4
IDENTITY_FILE=$5
REMOTE_USER=$6
EXPECTED_ROOT_SIZE=$7
EXPECTED_DATA_SIZE=$8

# Wait up to ~60 minutes
MAX_RETRIES=126
# In the first retries use a shorter sleep
WAIT_TIMES=(1 2 3 5 10 20 30)

# Use -e so that the pattern is not parsed as a grep option
if virtctl ssh --help | grep -q -e "--local-ssh " ; then
    LOCAL_SSH="--local-ssh"
else
    LOCAL_SSH=""
fi

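function get_vms() {
    local namespace=$1
    local label_key=$2
    local label_value=$3

    local vms
    # Enable pipefail inside the subshell so that a kubectl failure is not masked by jq
    vms=$(set -o pipefail; kubectl get vm -n "${namespace}" -l "${label_key}=${label_value}" -o json | jq -r '.items[].metadata.name')
    local ret=$?
    if [ $ret -ne 0 ]; then
        # Report on stderr: stdout is captured by the caller as the VM list
        echo "Failed to get VM list" >&2
        exit 1
    fi
    echo "${vms}"
}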

function remote_command() {
    local namespace=$1
    local identity_file=$2
    local remote_user=$3
    local vm_name=$4
    local command=$5

    local output
    output=$(virtctl ssh ${LOCAL_SSH} --local-ssh-opts="-o StrictHostKeyChecking=no" --local-ssh-opts="-o UserKnownHostsFile=/dev/null" -n "${namespace}" -i "${identity_file}" -c "${command}" --username "${remote_user}" "${vm_name}" 2>/dev/null)
    local ret=$?
    if [ $ret -ne 0 ]; then
        return 1
    fi
    echo "${output}"
}

function check_vm_running() {
    local vm=$1
    remote_command "${NAMESPACE}" "${IDENTITY_FILE}" "${REMOTE_USER}" "${vm}" "ls"
    return $?
}

function check_resize() {
    local vm=$1

    local blk_devices
    blk_devices=$(remote_command "${NAMESPACE}" "${IDENTITY_FILE}" "${REMOTE_USER}" "${vm}" "lsblk --json -v --output=NAME,SIZE")
    local ret=$?
    if [ $ret -ne 0 ]; then
        return $ret
    fi

    local size
    size=$(echo "${blk_devices}" | jq .blockdevices | jq -r --arg name "vda" '.[] | select(.name == $name) | .size')
    if [[ $size != "${EXPECTED_ROOT_SIZE}" ]]; then
        return 1
    fi

    local datavolume_sizes
    datavolume_sizes=$(echo "${blk_devices}" | jq .blockdevices | jq -r --arg name "vda" '.[] | select(.name != $name) | .size')
    for datavolume_size in ${datavolume_sizes}; do
        if [[ $datavolume_size != "${EXPECTED_DATA_SIZE}" ]]; then
            return 1
        fi
    done

    return 0
}

# Stop here if the VM list could not be retrieved
VMS=$(get_vms "${NAMESPACE}" "${LABEL_KEY}" "${LABEL_VALUE}") || exit 1

for vm in ${VMS}; do
    counter=0
    for attempt in $(seq 1 $MAX_RETRIES); do
        if ${COMMAND} "${vm}"; then
            break
        fi
        if [ "${attempt}" -lt $MAX_RETRIES ]; then
            if [ $counter -lt ${#WAIT_TIMES[@]} ]; then
                wait_time=${WAIT_TIMES[$counter]}
            else
                wait_time=${WAIT_TIMES[-1]} # Use the last value in the array
            fi
            sleep "${wait_time}"
            ((counter++))
        else
            echo "Failed waiting on ${COMMAND} for ${vm}" >&2
            exit 1
        fi
    done
    echo "${COMMAND} finished successfully for ${vm}"
done
6 changes: 6 additions & 0 deletions cmd/config/virt-capacity-benchmark/templates/resize_pvc.yml
@@ -0,0 +1,6 @@
apiVersion: v1
kind: PersistentVolumeClaim
spec:
  resources:
    requests:
      storage: {{ .storageSize }}
@@ -0,0 +1,7 @@
apiVersion: v1
kind: Secret
metadata:
  name: "{{ .name }}-{{ .counter }}"
type: Opaque
data:
  key: {{ .publicKeyPath | ReadFile | b64enc }}
14 changes: 14 additions & 0 deletions cmd/config/virt-capacity-benchmark/templates/vm-snapshot.yml
@@ -0,0 +1,14 @@
apiVersion: snapshot.kubevirt.io/v1beta1
kind: VirtualMachineSnapshot
metadata:
  name: "{{ .name }}-{{ .counter }}-{{ .Replica }}"
  labels:
    {{range $key, $value := .snapshotLabels }}
    {{ $key }}: {{ $value }}
    {{end}}
spec:
  deletionPolicy: delete
  source:
    apiGroup: kubevirt.io
    kind: VirtualMachine
    name: "{{ .name }}-{{ .counter }}-{{ .Replica }}"
106 changes: 106 additions & 0 deletions cmd/config/virt-capacity-benchmark/templates/vm.yml
@@ -0,0 +1,106 @@
{{- $storageClassName := .storageClassName -}}
{{- $dataVolumeLabels := .dataVolumeLabels -}}
{{- $dataVolumeSize := (default "1Gi" .dataVolumeSize) -}}
{{- $name := .name -}}
{{- $counter := .counter -}}
{{- $replica := .Replica }}
{{- $accessMode := .accessMode -}}

apiVersion: kubevirt.io/v1
kind: VirtualMachine
metadata:
  name: "{{ $name }}-{{ $counter }}-{{ $replica }}"
  labels:
    {{range $key, $value := .vmLabels }}
    {{ $key }}: {{ $value }}
    {{end}}
spec:
  dataVolumeTemplates:
  - metadata:
      name: "{{ $name }}-{{ $counter }}-{{ $replica }}-root"
      labels:
        {{range $key, $value := .rootVolumeLabels }}
        {{ $key }}: {{ $value }}
        {{end}}
    spec:
      source:
        registry:
          url: "docker://{{ .rootDiskImage }}"
      storage:
        accessModes:
        - {{ $accessMode }}
        storageClassName: {{ .storageClassName }}
        resources:
          requests:
            storage: {{ default "10Gi" .rootVolumeSize }}
  {{ range $dataVolumeIndex := .dataVolumeCounters }}
  - metadata:
      name: "{{ $name }}-{{ $counter }}-{{ $replica }}-data-{{ $dataVolumeIndex }}"
      labels:
        {{range $key, $value := $dataVolumeLabels }}
        {{ $key }}: {{ $value }}
        {{end}}
    spec:
      source:
        blank: {}
      storage:
        accessModes:
        - {{ $accessMode }}
        storageClassName: {{ $storageClassName }}
        resources:
          requests:
            storage: {{ $dataVolumeSize }}
  {{ end }}
  running: true
  template:
    spec:
      accessCredentials:
      - sshPublicKey:
          propagationMethod:
            noCloud: {}
          source:
            secret:
              secretName: "{{ .sshPublicKeySecret }}-{{ .counter }}"
      architecture: amd64
      domain:
        resources:
          requests:
            memory: {{ default "512Mi" .vmMemory }}
        devices:
          disks:
          - disk:
              bus: virtio
            name: rootdisk
            bootOrder: 1
          {{ range $dataVolumeIndex := .dataVolumeCounters }}
          - disk:
              bus: virtio
            name: "data-{{ $dataVolumeIndex }}"
          {{ end }}
          interfaces:
          - name: default
            masquerade: {}
            bootOrder: 2
        machine:
          type: pc-q35-rhel9.4.0
      networks:
      - name: default
        pod: {}
      volumes:
      - dataVolume:
          name: "{{ .name }}-{{ .counter }}-{{ .Replica }}-root"
        name: rootdisk
      {{ range $dataVolumeIndex := .dataVolumeCounters }}
      - dataVolume:
          name: "{{ $name }}-{{ $counter }}-{{ $replica }}-data-{{ $dataVolumeIndex }}"
        name: "data-{{ . }}"
      {{ end }}
      - cloudInitNoCloud:
          userData: |
            #cloud-config
            chpasswd:
              expire: false
            password: {{ uuidv4 }}
            user: fedora
            runcmd: []
        name: cloudinitdisk