scheduler: Ingore instance's consumed resources on resize #203

Open
wants to merge 3 commits into
base: stable/rocky-m3

Conversation

joker-at-work

When getting an allocation candidate from placement, we don't want to
double-spend the resources on the same host, because the VM will only
run once. Therefore, we pass "ignore_consumer" with the current
instance_uuid to placement's /allocation_candidates endpoint. This lets
us resize on the same host even if it only has enough free resources to
fit the VM once, so we don't have to tell customers that a resize is
not possible for lack of resources.

Change-Id: Ie068a4a22edc37aa1e1173be9b2e7823fd3c5890

nova/scheduler/manager.py (review thread, outdated and resolved)
mariusleu previously approved these changes Jun 3, 2021
nova/scheduler/manager.py (review thread, outdated and resolved)
nova/scheduler/utils.py (review thread, outdated and resolved)
@joker-at-work joker-at-work force-pushed the ignore-consumer-on-resize branch from b9d477a to 4dacb86 Compare June 3, 2021 10:19
@joker-at-work joker-at-work requested a review from grandchild June 3, 2021 10:19
grandchild previously approved these changes Jun 3, 2021
@joker-at-work joker-at-work marked this pull request as draft June 4, 2021 06:34
@joker-at-work
Author

I found out that the patch this depends on, the one using request_is_resize(), doesn't work, and that we need to ignore the migration's UUID instead. With that change the placement query behaves as expected, but the resize still doesn't work the way we want, because on a full host we would still double-spend.

Instead of passing the "_nova_check_type" scheduler hint in the
dictionary containing the "filter_properties", we set it in the
"scheduler_hints" attribute of the request-spec object. This is
necessary because we later inspect this object with
nova.scheduler.utils.request_is_resize, and it is in line with how a
rebuild is marked in Nova.

Passing "_nova_check_type" through the "scheduler_hint" parameter
doesn't work, because the conductor later uses only certain values from
this dictionary, so we wouldn't recognize a resize at all.

Change-Id: If4aa0c29ef04e7b4b9a9baef40206b4a553ef415
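A minimal sketch of the marking described in this commit message, using a stand-in class rather than Nova's real RequestSpec object. The `["resize"]` hint value and the body of `request_is_resize` are assumptions modelled on how a rebuild is marked via `_nova_check_type`; the actual patch may differ.

```python
class FakeRequestSpec:
    """Stand-in for nova.objects.RequestSpec (illustration only)."""

    def __init__(self, scheduler_hints=None):
        self.scheduler_hints = scheduler_hints


def mark_as_resize(spec):
    # Set the hint on the request-spec object itself,
    # not in the filter_properties dictionary.
    if spec.scheduler_hints is None:
        spec.scheduler_hints = {}
    spec.scheduler_hints["_nova_check_type"] = ["resize"]  # assumed value
    return spec


def request_is_resize(spec):
    # Assumed logic: look only at the hint stored on the request spec.
    if spec is None or not spec.scheduler_hints:
        return False
    return spec.scheduler_hints.get("_nova_check_type") == ["resize"]
```

Because the check reads only the request spec, a hint that is passed some other way and never copied onto the object would go unnoticed, which is the failure mode the message describes.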

This function is similar to set_and_clear_allocations in that it is
meant to update the instance's allocations at the same time as the
migration's allocations, making it an atomic update. The difference is
that this function can update the migration's allocations with the
supplied data, which is meant for same-host resizes, where we don't
have to reserve all the resources twice.

Change-Id: I85ed3611392e6ed9521465dadb96605220863cda
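For illustration, an atomic update of two consumers could take the shape of a single multi-consumer request body, which placement's `POST /allocations` accepts (microversion 1.13 and later; newer microversions require additional fields that are omitted here). The helper name `build_swap_payload` and all identifiers passed to it are hypothetical; this is not the function added by the commit.

```python
def build_swap_payload(instance_uuid, migration_uuid, rp_uuid,
                       instance_resources, migration_resources,
                       project_id, user_id):
    """Build one request body that updates both consumers together."""

    def consumer(resources):
        # An empty "allocations" dict removes that consumer's allocations.
        allocations = {rp_uuid: {"resources": resources}} if resources else {}
        return {
            "allocations": allocations,
            "project_id": project_id,
            "user_id": user_id,
        }

    return {
        instance_uuid: consumer(instance_resources),
        migration_uuid: consumer(migration_resources),
    }
```

Sending both consumers in one body means placement either accepts both changes or rejects both, so no intermediate state is left in which the resources are counted twice or not at all.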
@joker-at-work joker-at-work force-pushed the ignore-consumer-on-resize branch from 4dacb86 to 521b5e8 Compare June 9, 2021 10:18
@joker-at-work joker-at-work marked this pull request as ready for review June 9, 2021 10:19
@joker-at-work
Author

This is a major extension of the previous code, as testing showed that the previous code only went about halfway. The commit message was extended, too, and should (hopefully) explain everything. If not, please ask.

@joker-at-work joker-at-work dismissed stale reviews from grandchild and mariusleu June 14, 2021 07:40

outdated

@grandchild

Just quickly: Typo in commit message: scheduler: Ingore instance's consumed resources on resize -- should be Ignore.

When getting an allocation-candidate from placement, we want to ignore
the currently-used resources of the VM we're resizing. Ignoring the
resources makes it possible to resize on the same host, even if the VM
would only fit there once.

Since the resource allocations have already moved to the Migration
object, we use its UUID in the "ignore_consumer" parameter when calling
/allocation_candidates in placement.
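As a rough sketch, the query could be assembled like this. The `resources` format (`CLASS:amount` pairs separated by commas) is placement's documented one; `ignore_consumer` is the parameter named in this commit message and is assumed here to exist only in the patched placement this PR targets. The helper name is hypothetical.

```python
from urllib.parse import urlencode


def allocation_candidates_url(resources, migration_uuid):
    # placement expects resources as "CLASS:amount,CLASS:amount".
    res = ",".join(
        f"{rc}:{amount}" for rc, amount in sorted(resources.items())
    )
    # ignore_consumer: parameter from this PR, not a generic placement one.
    query = urlencode({"resources": res, "ignore_consumer": migration_uuid})
    return f"/allocation_candidates?{query}"
```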

But since the resources are claimed by the Migration now, the
allocation requests returned by placement may still not fit on the
host. Therefore, we not only claim those resources, but also update the
Migration's allocations at the same time.

We don't remove all resource allocations from the Migration, because we
need to keep resources reserved for a possible revert by the customer.
Therefore, we update the Migration's allocations so that it keeps only
the difference between its resources and the instance's resources. One
exception is DISK_GB, which stays fully allocated, because the disk
will be copied and thus allocated twice.
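The arithmetic described here might look like the following sketch. The function name is hypothetical, and clamping at zero when the new flavor is larger is an assumption; the commit message only says that the difference is kept.

```python
def remaining_migration_resources(old, new, keep_full=("DISK_GB",)):
    """Resources the Migration keeps once the instance claims `new`."""
    remaining = {}
    for rc, old_amount in old.items():
        if rc in keep_full:
            # The disk is copied, so old and new exist side by side.
            amount = old_amount
        else:
            # Assumption: never go below zero if the new flavor is larger.
            amount = max(old_amount - new.get(rc, 0), 0)
        if amount > 0:
            remaining[rc] = amount
    return remaining
```

With an old flavor of 8 VCPU, 16384 MEMORY_MB and 100 DISK_GB and a new flavor of 4 VCPU, 32768 MEMORY_MB and 100 DISK_GB, this yields `{"VCPU": 4, "DISK_GB": 100}`. The instance's 4 VCPU plus the Migration's 4 still cover the old flavor, while MEMORY_MB no longer appears on the Migration at all.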

On reverting, we have lost some information that we didn't put into the
Migration's allocations, because the instance had more. Therefore, we
use the method of "nova-manage placement heal_allocations" to re-create
the allocations for the original flavor. This might be problematic if
the instance had allocations against multiple providers.
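To illustrate what re-creating allocations from a flavor involves, here is a simplified sketch. The function name and signature are hypothetical, and it deliberately ignores volume-backed instances and any extra-spec resource overrides. It also returns one flat set of resources, which fits a single resource provider only and is why multiple providers are a concern.

```python
import math


def flavor_to_resources(vcpus, memory_mb, root_gb, ephemeral_gb=0, swap_mb=0):
    """Derive standard resource amounts from flavor values (simplified)."""
    # Swap is given in MB; round it up to whole GB before adding it.
    disk_gb = root_gb + ephemeral_gb + math.ceil(swap_mb / 1024)
    resources = {
        "VCPU": vcpus,
        "MEMORY_MB": memory_mb,
        "DISK_GB": disk_gb,
    }
    # Zero amounts are dropped rather than requested.
    return {rc: amount for rc, amount in resources.items() if amount > 0}
```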

Since we're ignoring the whole Migration consumer in placement, the
double-spending on DISK_GB is not accounted for, which might make the
resize fail if the host is low on DISK_GB.

Additionally, the resize might fail if the host is overprovisioned, even
if we patch placement to allow swapping resources of consumers. We might
be able to accommodate that in a future patch, though.

Change-Id: Ie068a4a22edc37aa1e1173be9b2e7823fd3c5890
@joker-at-work joker-at-work force-pushed the ignore-consumer-on-resize branch from 521b5e8 to 844369c Compare July 6, 2021 08:03
3 participants