KVMAutotest WritingNewTests
As mentioned in previous articles, KVM autotest is a fairly large piece of infrastructure, and it's not obvious at first how things fit together. So let's give a bit of context on how KVM autotest fits into the bigger picture.

So, what is this big blob called 'autotest'? In short, it is a collection of user space programs and libraries, written mainly in python, that assist in implementing fully automated testing on the linux platform. It tries as much as possible to be distro agnostic, assuming only that we are running on a linux kernel. In fact, autotest was designed to test the linux kernel in the first place. Of course, things are generic enough to automate and execute several types of userspace testing as well.
Autotest is composed of several modules. The intention here is just to give a cursory look at what this collection of programs offers. If you have an autotest checkout, you'll be able to identify the relevant directories. If you don't, here is a summary:
- autotest client (subdir client): This is the code that executes tests on machines (test harness). It has code to prepare the machine, collect relevant system stats, compile tests, execute them and record the test results. The subdir client/tests contains other sub directories, the test modules, which are python classes with the intelligence to compile, execute and store test results. KVM autotest is, in fact, the kvm test module of the autotest client. We'll dive deeper into this subject later. For all practical purposes, the client is the only part of autotest that most of us will have to care about.
- autotest server (subdir server): This code contains utilities to push the autotest client to test machines; it is only needed if you are using the larger autotest infrastructure to run tests on a test grid. People only have to care about it if they are involved in maintaining a test grid.
- autotest scheduler (subdir scheduler): Code that instantiates autotest server instances at appropriate times, handling the machine grid and dispatching the right test jobs to the right test machines.
- autotest web interface (subdir frontend): Web interface that allows users to trigger test jobs.
- autotest command line interface (subdir cli): Command line interface that also allows users to trigger test jobs.
So, a test module is, most of the time, simple wrapper code that guides autotest on how to deal with a given test suite. Test modules are implemented as python classes. For example, the simplest client side test is called `sleeptest`, and anyone can inspect its contents:
```
[lmr@freedom sleeptest]$ ls
control  sleeptest.py
```
You can see 2 files, `sleeptest.py` and `control`. Let's explain what each one of them means.

`sleeptest.py` is the test module. It explains to autotest how to run the test. Sleeptest, as you might have guessed, is a dummy sample test that only instructs the machine under test to sleep for 1 second:
```python
import time
from autotest_lib.client.bin import test


class sleeptest(test.test):
    version = 1

    def run_once(self, seconds=1):
        time.sleep(seconds)
```
That is all. `run_once` is the method that describes the actual test step to autotest.

Every autotest test also contains a sample control file. What is a control file? It's another python program that describes all the operations the autotest client should perform on a machine as part of a test job. So we can effectively say it is the transcription of what the user wants to do with a machine during a test job.

What do we want to do during a test job? Well, most of the time we want to run tests. However, since the control file is a python program itself, you can do anything you would do inside a python program. No, really, anything. As most of the time we want to do simple stuff, a simple control file is really simple. In this case:
```python
AUTHOR = "Autotest Team"
NAME = "Sleeptest"
TIME = "SHORT"
TEST_CATEGORY = "Functional"
TEST_CLASS = "General"
TEST_TYPE = "client"
DOC = """
This test simply sleeps for 1 second by default. It's a good way to test
profilers and double check that autotest is working.
The seconds argument can also be modified to make the machine sleep for
as long as needed.
"""

job.run_test('sleeptest', seconds=1)
```
Apart from the documentation variables, what this control file does is summed up by `job.run_test('sleeptest', seconds=1)`. This tells autotest: 'please run the test defined in the module sleeptest, and pass the parameter seconds with a value of 1 to it'. Really simple.
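Conceptually, the dispatch that `job.run_test` performs can be sketched like this. This is a simplified illustration only, not autotest's actual implementation (the real `job.run_test` also sets up result directories, profilers and logging):

```python
import time


# Simplified sketch: look up the test class by name and call its
# run_once() method with the keyword arguments from the control file.
class test(object):
    def run(self, **kwargs):
        self.run_once(**kwargs)


class sleeptest(test):
    version = 1

    def run_once(self, seconds=1):
        time.sleep(seconds)


def run_test(test_class, **kwargs):
    instance = test_class()
    instance.run(**kwargs)


run_test(sleeptest, seconds=0.05)  # sleeps for 0.05s, then returns
```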
The kvm test, which is pretty much what we call kvm autotest, is the biggest test and does a lot more; most of its functionality is implemented as autotest libraries, so the code can be reused by other virtualization technologies. Writing it pushed the autotest APIs forward and improved the autotest core. Therefore, it is indeed a lot bigger and more complicated than sleeptest. However, while it can do far more, it follows the same rules as sleeptest: it has a test class and a control file, just bigger and more complicated ones.

When writing the kvm test, we had the following problem: qemu has a lot of command line options, kvm supports a lot of different guest OSes, and tests might specify different NICs, disk layouts, buses and so on. Also, we needed to test most of the combinations of these moving parts. A way to express all these combinations concisely was necessary.

That is how the Cartesian config file format was born. You can refer to its documentation to understand how it works in more detail, but one thing is sufficient to say here: the config parser generates a list of dictionaries, and each dictionary represents the parameters being passed to an instance of the kvm test. In the end, what happens inside the kvm control file is code that runs something similar to:
```python
for p in list_params:
    job.run_test('kvm', params=p)
```
Of course, the actual code is slightly different, but bear with me.
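To build intuition for what the parser produces, here is a rough sketch of the expansion. The real Cartesian config parser also handles variants, filters, dependencies and value overrides; conceptually, though, it expands independent axes of variation into a list of parameter dicts (the axis names and values below are made-up examples):

```python
import itertools

# Made-up axes of variation for illustration purposes only.
axes = {
    "image_format": ["qcow2", "raw"],
    "nic_model": ["rtl8139", "virtio"],
}

# Cartesian product of all axis values -> one dict per combination.
list_params = [dict(zip(axes.keys(), combo))
               for combo in itertools.product(*axes.values())]

for p in list_params:
    print(p)  # 4 dictionaries, one per combination
```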
The kvm test class, `kvm.py`, is just a loader of its subtests, which are located in 2 directories:
- `$AUTOTEST_ROOT/client/virt/tests` - These tests are written in a fairly virt technology agnostic way, so they can be reused to test other virt technologies. More specifically, they do not use the vm monitor.

```
[lmr@freedom tests]$ ls
autotest.py        ethtool.py         guest_s4.py       __init__.py
iozone_windows.py  linux_s3.py        multicast.py      nic_promisc.py
shutdown.py        vlan.py            whql_submission.py
boot.py            file_transfer.py   guest_test.py     iofuzz.py
jumbo.py           lvm.py             netperf.py        ping.py
softlockup.py.bak  watchdog.py        yum_update.py
clock_getres.py    fillup_disk.py     image_copy.py     ioquit.py
kdump.py           mac_change.py      nicdriver_unload.py
pxe.py             stress_boot.py     whql_client_install.py
```
- `$AUTOTEST_ROOT/client/tests/kvm/tests` - These are tests that do use kvm specific infrastructure, notably the qemu monitor. Other virt technologies can't use that infrastructure, so such tests go here.

```
[lmr@freedom tests]$ cd $AUTOTEST_ROOT/client/tests/kvm/tests
[lmr@freedom tests]$ ls
balloon_check.py    floppy.py                    migration_with_file_transfer.py
nic_hotplug.py      qmp_basic.py                 steps.py
timedrift_with_reboot.py  unattended_install.py  boot_savevm.py
__init__.py         migration_with_reboot.py     nmi_watchdog.py
qmp_basic_rhel6.py  stop_continue.py             timedrift_with_stop.py
unittest_kvmctl.py  build.py                     ksm_overcommit.py
module_probe.py     pci_hotplug.py               set_link.py
system_reset_bootable.py  trans_hugepage_defrag.py  unittest.py
cdrom.py            migration_multi_host.py      multi_disk.py
physical_resources_check.py  smbios_table.py     timedrift.py
trans_hugepage.py   virtio_console.py            enospc.py
migration.py        nic_bonding.py               qemu_img.py
stepmaker.py        timedrift_with_migration.py  trans_hugepage_swapping.py
vmstop.py
```
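The loading mechanism can be sketched roughly as follows. This is a conceptual illustration only, not `kvm.py`'s actual code (the real loader also searches both test directories and handles errors); the fake `uptime` module is registered just so the sketch is runnable on its own:

```python
import sys
import types


def run_subtest(test, params, env):
    # The idea: read the 'type' parameter, import the module with that
    # name, and call its run_<type> function with the standard 3 args.
    t_type = params["type"]
    module = __import__(t_type)
    run_func = getattr(module, "run_%s" % t_type)
    return run_func(test, params, env)


# Fake 'uptime' module, registered only to make this sketch runnable:
fake = types.ModuleType("uptime")
fake.run_uptime = lambda test, params, env: "uptime subtest ran"
sys.modules["uptime"] = fake

print(run_subtest(None, {"type": "uptime"}, None))  # uptime subtest ran
```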
Looking at the simplest test, client/virt/tests/boot.py, the boot test:
```python
import time


def run_boot(test, params, env):
    """
    KVM reboot test:
    1) Log into a guest
    2) Send a reboot command or a system_reset monitor command (optional)
    3) Wait until the guest is up again
    4) Log into the guest to verify it's up again

    @param test: kvm test object
    @param params: Dictionary with the test parameters
    @param env: Dictionary with test environment.
    """
    vm = env.get_vm(params["main_vm"])
    vm.verify_alive()
    timeout = float(params.get("login_timeout", 240))
    session = vm.wait_for_login(timeout=timeout)
    if params.get("reboot_method"):
        if params["reboot_method"] == "system_reset":
            time.sleep(int(params.get("sleep_before_reset", 10)))
        session = vm.reboot(session, params["reboot_method"], 0, timeout)
    session.close()
```
This is a simple test that:

- Picks a vm from the main env file, if there's one
- Verifies whether its process is alive
- Logs into it
- Reboots the vm and waits until it's back up, by trying to log into the machine over a remote network connection within a given timeout
If you check the base tests config file, you can identify the test variant for the boot test. We'll explain the meaning of the entries in this variant in a minute.
```
- boot:         install setup image_copy unattended_install.cdrom
    type = boot
    restart_vm = yes
    kill_vm_on_error = yes
    login_timeout = 240
```
So, the rules for a new test in kvm autotest are:

- Choose a name. In the case above, it is boot.
- The test code has to be located in one of the 2 locations specified above, and named after, well, the test name. Example: `boot.py`.
- The file `boot.py` has to implement at least one python function named `run_[test_name]` (example: `run_boot`), which takes exactly 3 arguments, the same for every single test:
  - test: KVM test object
  - params: Dict with the test parameters
  - env: The env file that stores vms and installers, used by the kvm test to pick and store objects.
- A variant with your test name has to exist in `tests_base.cfg.sample`. This variant has to be located in the tests section of `tests_base.cfg`; just look for the boot variant and you'll find out where it is. Note that the variant name/label does not need to follow the name of your test; it can be something different, such as `ponies`, although most of the time you want them both to be the same to simplify things. You normally break this assumption only if a namespace clash would happen (example: the cdrom test had a namespace clash with unattended_install.cdrom, so we named the variant cdrom_test). The only important thing is that the variable `type` must have the name of your test. `type` is effectively what allows the kvm test to locate your test code and run it; the name of the variant itself is just an arbitrary label.
Now, let's go and write our uptime test, whose only purpose in life is to pick up a living guest, connect to it via ssh, and return its uptime.
- Git clone autotest to a convenient location, say $HOME/Code/autotest. See the download source documentation. Please do use git and clone the repo to the location mentioned.
- Execute the `get_started.py` script (see [[the get started documentation | KVMAutotest-RunGetStartedScript]]). Whether you download the F15 DVD and winutils.iso depends on whether you want kvm autotest to take care of the guest installation for you. In case you don't, refer to our [[documentation on how to run tests using a custom made guest image | KVMAutotest-RunTestsExistingGuest]]; then you do not need to download any isos.

- Our uptime test won't need any kvm specific feature. Thinking about it, we only need a vm object and to establish an ssh session to it, so we can run the command. So we can store our brand new test under `$AUTOTEST_ROOT/client/virt/tests`. At the autotest root location:

```
[lmr@freedom autotest]$ touch client/virt/tests/uptime.py
[lmr@freedom autotest]$ git add client/virt/tests/uptime.py
```
- Ok, so that's a start. We have at least to implement a function `run_uptime`. Let's start with it and just put the keyword pass, which is a no op. Our test will be like:

```python
def run_uptime(test, params, env):
    """
    Docstring describing uptime.
    """
    pass
```
- Now, what is the API we need to grab a VM from our test environment? Our env object has a method, `get_vm`, that will pick up a given vm name stored in our environment. Some vms have aliases; `main_vm` contains the name of the main vm present in the environment, which is, most of the time, `vm1`. `env.get_vm` returns a vm object, which we'll store in the variable vm. It'll be like this:

```python
def run_uptime(test, params, env):
    """
    Docstring describing uptime.
    """
    vm = env.get_vm(params["main_vm"])
```
- Good, now we have grabbed a vm object. A vm object has lots of interesting methods, which we plan on documenting more thoroughly, but for now, we want to ensure that this VM is alive and functional, at least from a qemu process standpoint. So we'll call the method `verify_alive()`, which verifies whether the qemu process is functional and whether the monitors, if any exist, are functional. If any of these conditions are not satisfied due to any problem, an exception will be thrown and the test will fail. This check matters because sometimes, due to a bug, the vm process might be dead in the water, or the monitors might not be responding.

```python
def run_uptime(test, params, env):
    """
    Docstring describing uptime.
    """
    vm = env.get_vm(params["main_vm"])
    vm.verify_alive()
```
- Next step, we want to log into the vm. The vm method that returns a remote session object is called `wait_for_login()`; one of its parameters lets you adjust the timeout, that is, the time we want to wait to see if we can grab an ssh prompt. We have a top level variable `login_timeout`, and it is good practice to retrieve it and pass its value to `wait_for_login()`, so that if for some reason we're running on a slower host, increasing one variable will affect all tests. Note that it is completely OK to just override this value, or pass nothing to `wait_for_login()`, since this method has a default timeout value. Back to business, picking up the login timeout from our dict of parameters:

```python
def run_uptime(test, params, env):
    """
    Docstring describing uptime.
    """
    vm = env.get_vm(params["main_vm"])
    vm.verify_alive()
    timeout = float(params.get("login_timeout", 240))
```
- Now we'll call `wait_for_login()` and pass the timeout to it, storing the resulting session object in a variable named session.

```python
def run_uptime(test, params, env):
    """
    Docstring describing uptime.
    """
    vm = env.get_vm(params["main_vm"])
    vm.verify_alive()
    timeout = float(params.get("login_timeout", 240))
    session = vm.wait_for_login(timeout=timeout)
```
- The kvm test will do its best to grab this session; if it can't, due to a timeout or another reason, it'll throw a failure, failing the test. Assuming that things went well, you now have a session object, which allows you to type in commands on your guest and retrieve their outputs. Most of the time, we can get the output of these commands through the method `cmd()`. It will type in the command, grab its output and return it so you can store it in a variable; if the exit code of the command is != 0, it'll raise an aexpect.ShellError. So getting the output of the unix command uptime is as simple as calling `cmd()` with 'uptime' as a parameter and storing the result in a variable called uptime:

```python
def run_uptime(test, params, env):
    """
    Docstring describing uptime.
    """
    vm = env.get_vm(params["main_vm"])
    vm.verify_alive()
    timeout = float(params.get("login_timeout", 240))
    session = vm.wait_for_login(timeout=timeout)
    uptime = session.cmd('uptime')
```
- If you want to print this value so it can be seen in the test logs, just log the value of uptime using the logging library. Since that is all we want to do, we may then close the remote connection with the method `close()`, to avoid ssh/rsh sessions lying around your test machine. Now, note that all failures that might happen here are implicitly handled by the methods called. If a test goes from its beginning to its end without unhandled exceptions, autotest automatically assumes the test PASSed; there is no need to mark a test as explicitly passed. If you have explicit points of failure, for more complex tests, you might want to add some exception raising. But let's leave this for a later installment of kvm autotest tips.

```python
def run_uptime(test, params, env):
    """
    Docstring describing uptime.
    """
    vm = env.get_vm(params["main_vm"])
    vm.verify_alive()
    timeout = float(params.get("login_timeout", 240))
    session = vm.wait_for_login(timeout=timeout)
    uptime = session.cmd("uptime")
    logging.info("Guest uptime result is: %s", uptime)
    session.close()
```
- Now, I deliberately introduced a bug in this code just to show you how to use some tools to find and remove trivial bugs from your code. I strongly encourage you to check your code with the script run_pylint.py, located in the utils directory at the top of your $AUTOTEST_ROOT. This tool internally calls the python tool pylint to catch bugs in autotest code. I use it so much that the utils dir of my devel autotest tree is on my $PATH. So, to check our new uptime code, we can call it like this (important: if you don't have pylint, install it with `yum install pylint` or the equivalent for your distro):

```
[lmr@freedom autotest]$ cd $AUTOTEST_ROOT
[lmr@freedom autotest]$ utils/run_pylint.py client/virt/tests/uptime.py
************* Module client.virt.tests.uptime
E0602: 10:run_uptime: Undefined variable 'logging'
```
- Ouch. So there's an undefined variable called logging on line 10 of the code. That's because I forgot to import the logging library (a python library to handle info, debug and warning messages) in the first place. Fixing it, the code becomes:

```python
import logging


def run_uptime(test, params, env):
    """
    Docstring describing uptime.
    """
    vm = env.get_vm(params["main_vm"])
    vm.verify_alive()
    timeout = float(params.get("login_timeout", 240))
    session = vm.wait_for_login(timeout=timeout)
    uptime = session.cmd("uptime")
    logging.info("Guest uptime result is: %s", uptime)
    session.close()
```
- Let's re-run `run_pylint.py` to see if it's happy with the generated code:

```
[lmr@freedom autotest]$ utils/run_pylint.py client/virt/tests/uptime.py
[lmr@freedom autotest]$
```
- So we're good. Nice! Now, as good indentation matters in python, we have another small utility called reindent.py, which will fix indentation problems and cut trailing whitespace in your code. Very nice for tidying up your test before submission.

```
[lmr@freedom autotest]$ utils/reindent.py client/virt/tests/uptime.py
```
- I also run pylint itself to catch small things such as wrong spacing around operators and other subtle issues that go against PEP 8 and the coding style document. Please take pylint's output with a grain of salt; you don't need to fix each and every issue it finds. I use it to find unused imports and other minor things.

```
[lmr@freedom autotest]$ pylint client/virt/tests/uptime.py
... lots of output ...
```
- Now, we need a variant in `subtests.cfg` to inform kvm autotest where the fancy new test is located. Actually, to contribute it upstream, the variant needs to be written to `subtests.cfg.sample`; just remember to mirror the file to `subtests.cfg` to make sure your own instance of kvm autotest will be able to run it. Editing `subtests.cfg.sample` and adding the new variant (I chose to put it after the trans_hugepage variant, which is immediately before the destructive tests):

```
- uptime:       install setup image_copy unattended_install.cdrom
    type = uptime
```
- All this means: the uptime variant depends on having a functional, installed guest, which makes it dependent on any one of ``install setup image_copy unattended_install.cdrom`` having run previously, and the test type of uptime is ``uptime``. This will yield a functional git commit that you can send to the mailing list.
- Make sure your changes are reflected in the actual config file so you don't get caught by surprise:

```
[lmr@freedom autotest]$ cp client/tests/kvm/subtests.cfg.sample client/tests/kvm/subtests.cfg
```
- Now you can test your code for real. First, you will use `tests.cfg` to inform kvm autotest that you want to run your new test. Say you still don't have an installed F15 guest and you want kvm autotest to install it for you and then run the uptime test. A good test set that does this for you is very similar to `qemu_kvm_f15_quick`. Let's call it `qemu_kvm_f15_uptime` and make it look like:

```
- @qemu_kvm_f15_uptime:
    qemu_binary = /usr/bin/qemu-kvm
    qemu_img_binary = /usr/bin/qemu-img
    only qcow2
    only rtl8139
    only ide
    only smp2
    only no_pci_assignable
    only smallpages
    only Fedora.15.64
    only unattended_install.cdrom, uptime
```
- Make sure the last line of tests.cfg is `only qemu_kvm_f15_uptime`, please. If you want to use your own custom guest, then your uptime test set will look like:

```
- @qemu_kvm_custom_uptime:
    qemu_binary = /usr/bin/qemu-kvm
    qemu_img_binary = /usr/bin/qemu-img
    only qcow2
    only rtl8139
    only ide
    only smp2
    only no_pci_assignable
    only smallpages
    only CustomGuestLinux
    only uptime
```
- Again, the last line of the file has to be `only qemu_kvm_custom_uptime`. In the latter case, running the config system will show you one and only one test:

```
[lmr@freedom autotest]$ client/common_lib/cartesian_config.py client/tests/kvm/tests.cfg
dict    1:  smp2.CustomGuestLinux.uptime
```
- Now you can run your test to see if everything goes well. If you don't want to see a lot of debug messages, you don't need to pass the --verbose flag to autotest; the debug logs are going to be written to disk anyway. So, let's do this:

```
[lmr@freedom autotest]$ sudo $AUTOTEST_ROOT/client/bin/autotest $AUTOTEST_ROOT/client/tests/kvm/control
16:33:55 INFO | Writing results to $AUTOTEST_ROOT/client/results/default
16:34:10 INFO | START  ----  ----  timestamp=1312313650  localtime=Aug 02 16:34:10
16:34:11 INFO | Test 1:  smp2.CustomGuestLinux.uptime
16:34:13 INFO |   START  kvm.smp2.CustomGuestLinux.uptime  kvm.smp2.CustomGuestLinux.uptime  timestamp=1312313653  localtime=Aug 02 16:34:13
16:34:32 INFO | Running qemu command: /usr/bin/qemu-kvm -name 'vm1' -nodefaults -vga std -monitor unix:'/tmp/monitor-humanmonitor1-20110802-163431-yxB6',server,nowait -serial unix:'/tmp/serial-20110802-163431-yxB6',server,nowait -drive file='/tmp/kvm_autotest_root/images/custom_image_linux.qcow2',index=0,if=ide,cache=none -device rtl8139,netdev=idS1DK4v,mac='9a:46:5b:6d:86:38',id='idrp8fu8' -netdev tap,id=idS1DK4v,fd=21 -m 1024 -smp 2 -vnc :0
16:35:53 INFO | Guest uptime result is:  15:35:51 up 1 min,  1 user,  load average: 4.92, 1.37, 0.47
16:36:06 INFO |   GOOD  kvm.smp2.CustomGuestLinux.uptime  kvm.smp2.CustomGuestLinux.uptime  timestamp=1312313766  localtime=Aug 02 16:36:06  completed successfully
16:36:06 INFO |   END GOOD  kvm.smp2.CustomGuestLinux.uptime  kvm.smp2.CustomGuestLinux.uptime  timestamp=1312313766  localtime=Aug 02 16:36:06
16:36:13 INFO | END GOOD  ----  ----  timestamp=1312313773  localtime=Aug 02 16:36:13
```
- You can check a nice html job report by pointing your browser to the `client/results/default/job_report.html` file produced by autotest:

```
[lmr@freedom autotest]$ firefox client/results/default/job_report.html &
```
- Ok, so now we have something that can be git committed and sent to the mailing list:

```diff
[lmr@freedom autotest-git]$ git diff
diff --git a/client/tests/kvm/tests_base.cfg.sample b/client/tests/kvm/tests_base.cfg.sample
index 5bf8762..2c72a45 100644
--- a/client/tests/kvm/tests_base.cfg.sample
+++ b/client/tests/kvm/tests_base.cfg.sample
@@ -1074,6 +1074,9 @@ variants:
             dd_timeout = 900
             check_cmd_timeout = 900
 
+    - uptime:  install setup image_copy unattended_install.cdrom
+        type = uptime
+
     # system_powerdown, system_reset and shutdown *must* be the last ones
     # defined (in this order), since the effect of such tests can leave
     # the VM on a bad state.
diff --git a/client/virt/tests/uptime.py b/client/virt/tests/uptime.py
index e69de29..6366190 100644
--- a/client/virt/tests/uptime.py
+++ b/client/virt/tests/uptime.py
@@ -0,0 +1,14 @@
+import logging
+
+
+def run_uptime(test, params, env):
+    """
+    Docstring describing uptime.
+    """
+    vm = env.get_vm(params["main_vm"])
+    vm.verify_alive()
+    timeout = float(params.get("login_timeout", 240))
+    session = vm.wait_for_login(timeout=timeout)
+    uptime = session.cmd("uptime")
+    logging.info("Guest uptime result is: %s", uptime)
+    session.close()
```
- Oh, we forgot to add a decent docstring description. Let's fix that:

```python
import logging


def run_uptime(test, params, env):
    """
    Uptime test for virt guests:

    1) Boot up a VM.
    2) Establish a remote connection to it.
    3) Run the 'uptime' command and log its results.

    @param test: KVM test object.
    @param params: Dictionary with the test parameters.
    @param env: Dictionary with test environment.
    """
    vm = env.get_vm(params["main_vm"])
    vm.verify_alive()
    timeout = float(params.get("login_timeout", 240))
    session = vm.wait_for_login(timeout=timeout)
    uptime = session.cmd("uptime")
    logging.info("Guest uptime result is: %s", uptime)
    session.close()
```
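If we later wanted the test to assert something about the output instead of just logging it, a small helper could parse the load average out of the uptime string. This is a hypothetical extension, not part of the test above; the sample string mimics typical `uptime` output:

```python
import re


# Hypothetical helper, not part of the actual uptime test: extract the
# 1-minute load average from a typical 'uptime' output line.
def parse_load_average(uptime_output):
    match = re.search(r"load averages?: ([\d.]+)", uptime_output)
    if match is None:
        raise ValueError("Unexpected uptime output: %r" % uptime_output)
    return float(match.group(1))


sample = " 15:35:51 up 1 min,  1 user,  load average: 4.92, 1.37, 0.47"
print(parse_load_average(sample))  # 4.92
```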
- git commit the change with a Signed-off-by line, write a proper description, then send it with git send-email. Profit!
Next week we're going to cover other subjects requested by our community members. Stay tuned!
Please give us feedback on whether this procedure was helpful - email me at lmr AT redhat DOT com.