Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

KAMS - 4: Adds Recommendation updater and MIG based recommendation mechanism #1465

Merged
merged 9 commits into from
Feb 18, 2025

Conversation

bharathappali
Copy link
Member

@bharathappali bharathappali commented Jan 20, 2025

Description

KAMS - Kruize Accelerator MIG-tuning Service

The Recommendation Updater is added to update the workloads which has MIG partitions created by Instaslice. We set the updater value at the time of recommendation generation and updater changes the resources once it's triggered

Type of change

  • Bug fix
  • New feature
  • Docs update
  • Breaking change (What changes might users need to make in their application due to this PR?)
  • Requires DB changes

How has this been tested?

Tests are pending due to unavailability of accelerator hardware

  • New Test X
  • Functional testsuite

Test Configuration

  • Kubernetes clusters tested on:

Checklist 🎯

  • Followed coding guidelines
  • Comments added
  • Dependent changes merged
  • Documentation updated
  • Tests added or updated

@bharathappali bharathappali self-assigned this Jan 20, 2025
@bharathappali bharathappali requested a review from dinogun January 20, 2025 08:21
@dinogun dinogun added this to the Kruize 0.4 Release milestone Jan 31, 2025
@dinogun
Copy link
Contributor

dinogun commented Feb 5, 2025

@bharathappali Please fix the conflicts

@dinogun
Copy link
Contributor

dinogun commented Feb 6, 2025

@bharathappali New conflicts, please fix

@@ -194,7 +194,7 @@ public static void markAcceleratorDeviceStatusToContainer(ContainerData containe
if (null == resultArray || resultArray.isEmpty()) {
// Need to alert that container max duration is not detected
// Ignoring it here, as we take care of it at generate recommendations
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks like we need to update the comment here

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel we need that comment as these functions markAcceleratorPartitionDeviceStatusToContainer & markAcceleratorDeviceStatusToContainer try to extract the accelerator details from prometheus and while checking the code we might add alerts (In future) thinking that it's missing here.

This comment helps in understanding that it's not required to alert the user twice about the same thing. Please correct me if I'm wrong.

Copy link
Contributor

@dinogun dinogun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM for alpha level instaslice integration

@dinogun dinogun merged commit 8d41db9 into kruize:mvp_demo Feb 18, 2025
3 of 4 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

3 participants