Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Tests: basic implementation for testing against vfkit #427

Merged
merged 2 commits into from
Jan 9, 2025

Conversation

lstocchi
Copy link
Collaborator

This is the first patch to start adding support to run tests against a vfkit machine.

There is a lot of code that can be reused when testing on different virtualization frameworks and they were moved to a shared folder (test-utils). The test folder still contains the original qemu tests, while the new test-vfkit folder contains the tests dedicated to vfkit.

There is still some improvement to do as many other functions can be shared among those implementations but I wanted a feedback about this organization - if this is correct in golang or we want the code to be organized differently.

cc @cfergeau @evidolob

"github.com/onsi/gomega"
)

var _ = ginkgo.Describe("connectivity with vfkit", func() {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ideally we would be able to share all of these tests with the qemu ones, we start the hypervisor and once it's started we run the same tests against it, independently of whether it is vfkit or qemu.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Came back to this now... I think your comment is not valid anymore after having agreed to using different dirs? Because if we have different BeforeSuite on each hypervisor directory I don't think we can use the same basic_test file right? or you have an idea how to achieve it?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with you it would be great to remove all that duplicate code if we find a way

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Found a way to share the basic tests. So basically I put the It clauses inside a func in a shared file and then they are called inside the Describe callback on the actual hypervisor folder where the tests has to run.

If you have a better idea how to do this, I'm all ears 👍

@@ -1,6 +1,7 @@
package e2e
package e2e_utils
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NB: there's a lint failure here about the package naming. Do we need to have test-qemu, test-vfkit and test-utils.
Maybe we can have all in the same directory? If you want to separate shared code in multiple directories, I think I would go with subdirectories, something like:
test/ -> main testing code, can either start vfkit or qemu, and then run the DNS / ... tests against it
test/qemu -> the qemu-specific code
test/vfkit -> vfkit code
test/fcos -> fcos download, uncompression, ...
test/utils -> code which does not fit anywhere else

Can also be just test/utils, in which case it's similar to your e2e_utils package. Or all in the test/ dir

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't add everything on same directory because I thought that this way if someone wants to test just one platform he/she can only focus on a directory . Each folder contains a custom BeforeSuite where we download a specific fcos. If we have multiple platforms could be expensive in terms of time/space restart everything every time. So if you focus just on vfkit, you only run the vfkit folder having the custom beforeSuite and that's it. Does it make sense or in golang it can be achieved differently?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah ok, yes your goal makes sense as they are expensive. Let's keep it this way, we can always rework this if we realize later we want to use this differently. go test has this -run option to restrict the tests being run, but it's not necessarily really convenient for your use case:

        -run regexp
            Run only those tests, examples, and fuzz tests matching the regular
            expression. For tests, the regular expression is split by unbracketed
            slash (/) characters into a sequence of regular expressions, and each
            part of a test's identifier must match the corresponding element in
            the sequence, if any. Note that possible parents of matches are
            run too, so that -run=X/Y matches and runs and reports the result
            of all tests matching X, even those without sub-tests matching Y,
            because it must run them to look for those sub-tests.
            See also -skip.

Or we could imagine having an environment variable defining which hypervisors we want to run the tests on.
Separate dirs will be good enough for now :)

@lstocchi lstocchi force-pushed the test-vfkit branch 5 times, most recently from a669c5e to 404b44e Compare November 27, 2024 15:12
@lstocchi lstocchi requested a review from cfergeau November 27, 2024 15:28
@lstocchi lstocchi marked this pull request as ready for review November 27, 2024 15:28
@lstocchi
Copy link
Collaborator Author

So, this adds a first implementation of tests using vfkit. The only limitations is that, because the macOS GH runners do not support nested virtualization, we can just run the tests locally or using a custom runner. Better than nothing.
I'm gonna add further tests in following PRs. Also fixing the win tests in another PR if nobody is already looking at them.

@evidolob
Copy link
Contributor

I have tested this PR on my intel mac, tests run fine and much faster, comparing to qemu.
One thing to improve, changes uses vfkit, but it requires specific version( at least v0.6.0) to be able to run test. It would be nice if you add checks that vfkit is present and fail test if not)

@lstocchi
Copy link
Collaborator Author

I have tested this PR on my intel mac, tests run fine and much faster, comparing to qemu. One thing to improve, changes uses vfkit, but it requires specific version( at least v0.6.0) to be able to run test. It would be nice if you add checks that vfkit is present and fail test if not)

Right, good point. I'll update it 👍

@lstocchi
Copy link
Collaborator Author

@evidolob check added 👍

@evidolob
Copy link
Contributor

@lstocchi I face another issue, when I run tests two consecutive times, second run is failing(due to timeout). It seems that test(not sure tests or vfkit) is not clean up ignition.sock:

 time="2024-12-18T10:05:25+02:00" level=error msg="listen unix /var/folders/p3/pnj31p35351bnyj6vj8zhq1w0000gn/T/ignition.sock: bind: address already in use"

If I delete that manually tests are passed.

@lstocchi
Copy link
Collaborator Author

@lstocchi I face another issue, when I run tests two consecutive times, second run is failing(due to timeout). It seems that test(not sure tests or vfkit) is not clean up ignition.sock:

 time="2024-12-18T10:05:25+02:00" level=error msg="listen unix /var/folders/p3/pnj31p35351bnyj6vj8zhq1w0000gn/T/ignition.sock: bind: address already in use"

If I delete that manually tests are passed.

yes, it is a known issue. I open crc-org/vfkit#230 to handle this from vfkit as I thought it should be it to clean its mess. Once it is merged and vfkit updated it should work smoothly. Maybe I can update this PR to add this part for the moment.

@lstocchi
Copy link
Collaborator Author

@evidolob added a temporary code to clean the ignition socket file. We can remove this part once vfkit do it itself -> 70372bf#diff-8d257e4ad0b1e194ab60d25e2f3b6601e9cc54914ad425e14b42c26a11a8efaeR201-R204

@lstocchi lstocchi self-assigned this Dec 18, 2024
@lstocchi lstocchi force-pushed the test-vfkit branch 2 times, most recently from 5915906 to 30b9b58 Compare December 18, 2024 14:09
@evidolob
Copy link
Contributor

evidolob commented Dec 20, 2024

@lstocchi The last, minor thing, about this PR, I a bit worried about unarchiving VM image on every tests run, is that is really necessarily? Some times unarchiving takes longer then actual tests run, on my intel mac.
AFAIK before, we unpack only when it needed(when get new image version) and it seems that all works fine

@lstocchi
Copy link
Collaborator Author

@lstocchi The last, minor thing, about this PR, I a bit worried about unarchiving VM image on every tests run, is that is really necessarily? Some times unarchiving takes longer then actual tests run, on my intel mac. AFAIK before, we unpack only when it needed(when get new image version) and it seems that all works fine

Ok, now I recall why I didn't do that. Bc the image has already been modified and we cannot be sure it is ok for tests.

To verify that an image is ok we compare the calculated SHA256 with the one we expect it to be. The fact is that the SHA256 from the fedora website will be different from the one we calculate, because we already run the image in previous tests and we modified it (e.g using ignition). And also, for this reason, running a new set of tests on an old image could compromise the results (??).

So far, what we do is to maintain the compressed file in the tmp folder and uncompressed it every time. I could change it by deleting the compressed file and keep the original uncompressed file.... and copy/paste it to be used for every new tests run. Would it be ok? Or you have another idea? We would waste some disk space but we would be safe as tests will always run on a clean image and instead of uncompressing we will just copy it (should be faster??)

@evidolob
Copy link
Contributor

I don’t know what better solution could be. I think that current approach is OK as is, any time we can improve VM image handling.

@cfergeau
Copy link
Collaborator

cfergeau commented Jan 8, 2025

I could change it by deleting the compressed file and keep the original uncompressed file.... and copy/paste it to be used for every new tests run. Would it be ok? Or you have another idea? We would waste some disk space but we would be safe as tests will always run on a clean image and instead of uncompressing we will just copy it (should be faster??)

On macos, you can use unix.Clonefile with the uncompressed disk image, and throw away that cloned file at the end of the test. This should be fast and only use disk space for changed blocks.

@@ -59,6 +59,36 @@ var _ = ginkgo.Describe("port forwarding", func() {
gomega.Expect(string(out)).To(gomega.ContainSubstring("Hello from the host"))
})

ginkgo.It("should reach a http server on the host", func() {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The first commit apparently does a bit more than what is in the commit log, this test seems to be new, and the commit also adds support for uncompressing gz files (in addition to xz files)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated. I moved the decompression func to the second commit when it is actually used. Regarding this test, it was a duplicate so I just removed it. I think I copy/paste the one above to create a new one starting from that and then I forgot.

}

func IsHostPortAvailable(host string, port int) bool {
listener, err := net.Listen("tcp", fmt.Sprintf("%s:%d", host, port))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can use net.JoinHostPort here.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, updated 👍

@cfergeau
Copy link
Collaborator

cfergeau commented Jan 8, 2025

Apart from the minor comment about net.JoinHostPort, looks good to me.

@vyasgun
Copy link
Member

vyasgun commented Jan 9, 2025

Looks good to me. I ran the tests and wrote some of my own on top of this and everything works as expected.

@lstocchi
Copy link
Collaborator Author

lstocchi commented Jan 9, 2025

On macos, you can use unix.Clonefile with the uncompressed disk image, and throw away that cloned file at the end of the test. This should be fast and only use disk space for changed blocks.

I'll work on it in a separate issue so I don't block others PRs. Thanks for the suggestions 👍

Tests that are currently on the main branch only runs against a qemu VM. We have other use cases that needs to be tested like running against a vfkit VM.
This commit reorganizes the tests code a bit by moving the files that can be shared to support different implementation in their own folder.
The reasoning behind this is that every hypervisor should have its own beforeSuite func to download/run a specific VM image. By moving the utils files we can reuse the same code.

For the same reason the code targeting qemu is moved to the test-qemu folder. By doing so, we can run the tests within the test-qemu folder on the ubuntu workflow and, in future, when the nested virt will be enabled on github runners, the vfkit tests on macOS.

Signed-off-by: Luca Stocchi <[email protected]>
Adds a basic implementation for testing against a vfkit VM. Tests are based on the existing qemu version. It just changes the way the VM gets created/started.

It also adds a func to decompress .gz files as the fcos release for apple hypervisor is raw.gz format

Signed-off-by: Luca Stocchi <[email protected]>
@cfergeau
Copy link
Collaborator

cfergeau commented Jan 9, 2025

/lgtm
/approve

Copy link
Contributor

openshift-ci bot commented Jan 9, 2025

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: cfergeau, evidolob, lstocchi

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved label Jan 9, 2025
@openshift-merge-bot openshift-merge-bot bot merged commit fe2f2d9 into containers:main Jan 9, 2025
7 checks passed
go test -timeout 20m -v ./...
.PHONY: test-linux
test-linux: gvproxy test-companion
go test -timeout 20m -v ./test-qemu
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

test-linux is not running as many tests as what test was running, go test ./... was running all of these:

pkg/services/dns/dns_test.go
pkg/services/dns/hosts_file_test.go
pkg/tap/ip_pool_test.go
test-win-sshproxy/basic_test.go
test-win-sshproxy/suite_test.go
test/basic_test.go
test/efi_darwin_arm64_test.go
test/port_forwarding_test.go
test/suite_test.go

Imo we can keep a test target which runs go test ./... but add build tags to the vfkit tests so that they are only run on macos and/or use the -skip parameter together with go test ./... to have more control over what is being run.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants