e2e test for pod completion and next pod start #458

tardieu · 2025-02-24T17:40:19Z

We test that we do not ungate more pods than fit on the available GPUs by launching 8 long-running 1g pods and checking that exactly 7 are running (in a single-GPU setup). We should extend such a test to:

confirm that when one of the running pods completes, the pending pod starts running;
verify the transition latency, i.e., that the pending pod starts running without delay.

asm582 · 2025-02-24T17:42:29Z

Thanks. This test case on KinD does what the first sub-bullet asks, across two GPUs; one pod remains scheduling-gated:

instaslice-operator/test/e2e/e2e_test.go

Line 692 in ce1f522

It("should verify all 1g profiles of GPUs are consumed", func() {

tardieu · 2025-02-24T17:46:02Z

AFAIK this test only addresses point 0, i.e., one pod remains gated, not point 1, i.e., the gated pod eventually runs.

harche self-assigned this Feb 24, 2025

harche added the tech-preview label Feb 24, 2025
