Releases: koordinator-sh/koordinator
v0.6.1
Changelog
- 54ed9a5 Add pod uid to pod meta when failover (#344)
- 1328009 Use the structure as the key of the map instead of string. (#349)
- f81c89c [koord-runtime-proxy]: fix panic when no hook registered (#355)
- 42d695f add PodMigrationJob CRD proposal (#358)
- d1fb8c5 add descheduler framework proposal (#371)
- 7d46fad add fine-grained device scheduling proposal (#322)
- 82dc2ac add koord-descheduler (#425)
- 37a3aec add logs for proxy server (#329)
- 05a8c11 add pod annotations and labels to container request and cache (#362)
- 827bd6b add reservation plugin (#353)
- 78a4ebb add schedule gang md (#333)
- 993fc21 add scheduling framework extender (#365)
- 1cf37d0 add xiaohongshu as koordinator adopter (#424)
- c9cf1a4 api: add PodMigrationJob API (#375)
- 91cacc4 api: add device crd in scheduling group (#376)
- dab5a92 api: add device info into NodeMetric CRD (#378)
- 47e7189 api: update PodMigrationJob and Reservation CRD (#399)
- 74de8bd api: update reservation api (#384)
- bb3065a apis: add Gang api definition (#409)
- 0faf65e bugfix: always need to reset cpuset when cpu supress (#403)
- f0daee1 bugfix: avoid pod terminating in docker (#445)
- 1c44a0a bugfix: skip when pod sandbox not found (#444)
- fbf4d97 change qos func name for old format adaption reason (#418)
- 5b1ce9d clear cpuset of BE container to avoid conflict with kubelet static policy, using the value of besteffort dir (#412)
- 6e0d88f cri-runtime-proxy: fix containerErr error when failOver pods and containers (#414)
- 6918290 feat(deps): bump github.com/stretchr/testify from 1.7.5 to 1.8.0 (#326)
- 3fce836 feat(deps): bump google.golang.org/protobuf from 1.28.0 to 1.28.1 (#419)
- d763879 feat(deps): bump gorm.io/driver/sqlite from 1.3.4 to 1.3.6 (#347)
- f32a0ba feat(deps): bump gorm.io/gorm from 1.23.6 to 1.23.8 (#351)
- bed2191 feat(deps): bump sigs.k8s.io/yaml from 1.2.0 to 1.3.0 (#427)
- 5b320c0 feat: add gpu metrics to crd (#397)
- 4301cc9 feat: collect gpu metrics (#361)
- 488f8d5 feature: report pod alloc of Guaranteed pod and cpu manager policy (#386)
- b54bb0c fix auditor test in MacOS (#379)
- 5bcb7a7 fix koord-descheduler initialize profile error (#432)
- ecead7c fix reservation on mutil-scheduler (#431)
- 9e8fc01 fix reservation on pod patch failed (#428)
- b2fcc22 fix the loss of new updated resources from UpdateContainerResources request (#363)
- 0523d60 fix: consider lse/lsr when cpu suppress (#234) (#372)
- bf308ed fix: remove inline tag for corev1.ResourceList to fix #390 (#391)
- 6ac04d4 improve koordlet log verbosity (#338)
- a89cd98 koord-descheduler: implement PodMigrationJob controller (#404)
- 78afa0a koord-descheduler: implement descheduling configuration (#422)
- 49fa42c koord-descheduler: implement descheduling framework (#423)
- 3ed131c koord-descheduler: release Reservation when PodMigrationJob completes or is deleted (#438)
- 9eb7b7d koord-scheduler: compatible with Pods using kubelet static CPU manager policy (#433)
- c9ad604 koord-scheduler: improve reservation validation (#442)
- b78243b koord-scheduler: support CPU exclusive policy (#359)
- 8179245 koord-scheduler: support Node CPU orchestration API (#360)
- 1ab5c99 koord-scheduler: support default preferredCPUBindPolicy for LSE/LSR Pod if not specified (#354)
- 1e77f1f koord-scheduler: support kubelet cpu manager policy (#434)
- 171ad3e koordlet: define GPU metric struct (#343)
- 7442bc5 koordlet: fix build error on macOS caused by GPU (#413)
- 779ac80 koordlet: introduce Accelerators feature gate for GPU related features (#393)
- 91d2a4b koordlet: optimize auditor UT with httptest.Server (#382)
- 283c883 koordlet: refine initJiffies with default value (#367)
- 7510a3a make slo configmap name configurable (#415)
- b8dd567 rename resourceQoS to resourceQOS (#339)
- 0d9d9d4 style: unify the command parameter style of koordlet (#348)
- d0194b2 turn on pleg (#394)
v0.6.0
What's Changed
- add logs for proxy server by @zwzhang0107 in #329
- chore: remove useless feature-gates by @saintube in #336
- ci: enable CGO when GoReleaser compiles binaries by @jasonliu747 in #334
- rename resourceQoS to resourceQOS by @zwzhang0107 in #339
- improve koordlet log verbosity by @saintube in #338
- Add pod uid to pod meta when failover by @cheimu in #344
- cleanup: Use the structure as the key of the map instead of string by @novahe in #349
- koordlet: define GPU metric struct by @jasonliu747 in #343
- koord-scheduler: support default preferredCPUBindPolicy for LSE/LSR P… by @eahydra in #354
- style: unify the command parameter style of koordlet by @jasonliu747 in #348
- add fine-grained device scheduling proposal by @buptcozy in #322
- [koord-runtime-proxy]: fix panic when no hook registered by @cheimu in #355
- koord-scheduler: support CPU exclusive policy by @eahydra in #359
- [koord-runtime-proxy] Add pod annotations and labels to container request and cache by @cheimu in #362
- [koord-runtime-proxy] fix the loss of new updated resources from UpdateContainerResources request by @cheimu in #363
- add scheduling framework extender by @saintube in #365
- koordlet: refine initJiffies with default value by @jasonliu747 in #367
- add PodMigrationJob CRD proposal by @eahydra in #358
- add proposal for gang scheduling by @buptcozy in #333
- Support node cpu orchestration api by @eahydra in #360
- chore: update dockerfile for each module by @jasonliu747 in #364
- feat(deps): bump github.com/stretchr/testify from 1.7.5 to 1.8.0 by @dependabot in #326
- feat(deps): bump gorm.io/driver/sqlite from 1.3.4 to 1.3.6 by @dependabot in #347
- chore: supply UT for pkg/util and pkg/util/system by @ZiMengSheng in #374
- api: add PodMigrationJob API by @eahydra in #375
- docs: remove redundant field in Device CRD by @jasonliu747 in #377
- api: add device CRD in scheduling group by @jasonliu747 in #376
- fix auditor test in MacOS by @hormes in #379
- koordlet: optimize auditor UT with httptest.Server by @ZiMengSheng in #382
- docs: add chinese version readme.md by @ZiMengSheng in #380
- fix: consider lse/lsr when cpu suppress (#234) by @ZYecho in #372
- api: add device info into NodeMetric CRD by @jasonliu747 in #378
- koordlet: support collecting GPU metrics from node/pod/container by @LambdaHJ in #361
- chore: cleanup resmanager by @saintube in #383
- api: update reservation api by @saintube in #384
- add descheduler framework proposal by @eahydra in #371
- feat(deps): bump gorm.io/gorm from 1.23.6 to 1.23.8 by @dependabot in #351
- fix: remove inline tag for corev1.ResourceList to fix #390 by @jasonliu747 in #391
- koordlet: Turn on pleg by @cheimu in #394
- feat: update GPU metrics in NodeMetric CRD by @LambdaHJ in #397
- bugfix: always need to reset cpuset when cpu supress by @ZYecho in #403
- feature: report pod alloc of Guaranteed pod and cpu manager policy by @ZYecho in #386
- api: update PodMigrationJob and Reservation CRD by @eahydra in #399
- koordlet: introduce Accelerators feature gate for GPU related features by @jasonliu747 in #393
- koordlet: fix build error caused by GPU by @eahydra in #413
- cri-runtime-proxy: fix containerErr error when failOver pods and cont… by @lx1036 in #414
- make slo configmap name configurable by @zwzhang0107 in #415
- clear cpuset of BE container to avoid conflict with kubelet static po… by @zwzhang0107 in #412
- change qos func name for old format adaption reason by @zwzhang0107 in #418
- docs: add ADOPTERS.md of Koordinator by @jasonliu747 in #392
- koord-descheduler: implement descheduling configuration by @eahydra in #422
- chore: execute staticcheck instead of github action by running golang… by @eahydra in #421
- koord-scheduler: add reservation plugin by @saintube in #353
- koord-descheduler: implement descheduling framework by @eahydra in #423
- [adopter] add xiaohongshu as koordinator adopter by @cheimu in #424
- add koord-descheduler by @eahydra in #425
- fix reservation on pod patch failed by @saintube in #428
- koord-descheduler: implement PodMigrationJob controller by @eahydra in #404
- fix reservation on mutil-scheduler by @saintube in #431
- fix koord-descheduler initialize profile error by @eahydra in #432
- api: add Gang api by @Wenshiqi222 in #409
- koord-scheduler: compatible with Pods using kubelet static CPU manager policy by @eahydra in #433
- koord-scheduler: support kubelet cpu manager policy by @eahydra in #434
- docs: add maturity level in adopters.md by @jasonliu747 in #426
- feat(deps): bump google.golang.org/protobuf from 1.28.0 to 1.28.1 by @dependabot in #419
- feat(deps): bump sigs.k8s.io/yaml from 1.2.0 to 1.3.0 by @dependabot in #427
- koord-descheduler: release Reservation when PodMigrationJob completes or is deleted by @eahydra in #438
- koord-scheduler: improve reservation validation by @saintube in #442
New Contributors
- @buptcozy made their first contribution in #322
- @ZiMengSheng made their first contribution in #374
- @lx1036 made their first contribution in #414
- @Wenshiqi222 made their first contribution in #409
Full Changelog: v0.5.0...v0.6.0
v0.5.0
Changelog
- c4e2272 ci: use matrix and cache to speed up the build (#282) @jasonliu747
- 5a69a22 Add PreCreateContainerHook and PostStopSandboxHook interfaces and update their parameters (#231) @cheimu
- 1220b23 Add container id to ContainerResourceHookRequest (#243) @cheimu
- 1fa6ec8 Fix wrong cgroup path for PLEG (#325) @cheimu
- 108ad50 Implement cri scenario PreCreateContainerHook and PostStopPodSandboxHook (#239) @cheimu
- c063d0c Proposal QoS Manager (#262) @stormgbs
- 5d624d7 add container id to container info (#251) @cheimu
- a6b005f add cpuset allocator (#324) @zwzhang0107
- 8910d29 add defer os.Remove (#247) @cheimu
- 927483a add koordlet running mode design doc (#306) @zwzhang0107
- 1035de1 add more tests (#272) @hormes
- 178f086 add more tests for docker-proxy (#287) @ZYecho
- a49ab45 add reconciler for runtime hook standalone work mode (#319) @zwzhang0107
- 1f07f09 add resource reservation proposal (#241) @saintube
- 1caf1cd api: add reservation API (#276) @saintube
- 8289720 apis: update cpu scheduling plugin args and apis (#308) @eahydra
- 4bf748b bugfix: fix noderesource-controller, reporter reconcile on node deletion (#309) @saintube
- bff21a8 change cpuacct.stat to cpuacct.usage (#248) @j4ckstraw
- 8b99047 chore(deps): bump goreleaser/goreleaser-action from 2 to 3 (#299)
- d84b497 defines CPU orchestration APIs (#263) @eahydra
- 493a1ec docker-proxy: ensure cgroup parent for docker is valid (#281) @cheimu
- cef1028 docker-proxy: support createContainer and stopSandbox hook (#236) @ZYecho
- fec29f9 feat(deps): bump github.com/docker/docker (#223)
- 00b1657 feat(deps): bump github.com/fsnotify/fsnotify from 1.4.9 to 1.5.4 (#183)
- c5ee5fa feat(deps): bump github.com/google/uuid from 1.2.0 to 1.3.0 (#256)
- 852df5f feat(deps): bump github.com/prometheus/client_golang (#184)
- 74b8f03 feat(deps): bump github.com/stretchr/testify from 1.7.0 to 1.7.4 (#293)
- 6e097f9 feat(deps): bump github.com/stretchr/testify from 1.7.4 to 1.7.5 (#313)
- 2e35be6 feat(deps): bump go.uber.org/atomic from 1.7.0 to 1.9.0 (#278)
- 4ff7b4f feat(deps): bump google.golang.org/protobuf from 1.26.0 to 1.28.0 (#284)
- 46fd995 feat(deps): bump gorm.io/driver/sqlite from 1.3.1 to 1.3.4 (#257)
- 444d598 feat(deps): bump gorm.io/gorm from 1.23.3 to 1.23.6 (#245)
- 360fc88 feat(deps): bump k8s.io/klog/v2 from 2.9.0 to 2.10.0 (#291)
- 444b82d feat(deps): bump sigs.k8s.io/controller-runtime from 0.10.2 to 0.10.3 (#290)
- c6bc07f feat: add accessing kubelet with http option for koordlet (#304) @jasonliu747
- b21ea49 feat: add read only port support for koordlet (#320) @LambdaHJ
- 86097cc feature: report cpu info to noderesoucetopology (#312) @ZYecho
- 975bb93 fix: incorrect coversion of an integer (#242) @jasonliu747
- 92a862b fixed setup kubelet docs, which caused kubelet startup exception (#274) @JasonRD
- 5da05b3 koord-scheduler: compatible with Kubernetes v1.18 ~ v1.20 (#315) @eahydra
- ea82b20 koord-scheduler: implement NodeNUMAResource plugin with CPUSet scheduling (#289) @eahydra
- 503cd9d koord-scheduler: loadAwareScheduling skip node without NodeMetric in filter/score phase (#317) @eahydra
- d331a10 koord-scheduler: refactor framework extender init function (#307) @eahydra
- 9ae9b77 proposal: design fine-grained CPU orchestration (#209) @eahydra
- 5834b42 refactor scheduling config code path (#275) @eahydra
- 9287683 refactor: move runtime module under util (#266) @jasonliu747
- 4fa8400 refactor: prune socket before launching runtime-proxy (#250) @jasonliu747
- 690618e refactor: remove useless field in NodeSLOReconciler (#259) @jasonliu747
- d139458 refactor: replace reinvented wheel with Get() in standard library (#244) @jasonliu747
- 3362924 refactor: slo-controller use cache manage configmap (#305) @chzhj
- 9cbe6a9 regenerate runtime api to fix typo (#238) @cheimu
- 5a0c927 remove macos tests (#269) @hormes
- 8a5fee9 rename runtime-hook working mode (Bypass->Standalone) (#318) @novahe
- 4d7aa91 update podInfo and containerInfo with hook resp cgroupParent (#252) @cheimu
- 2d23602 use unified internal protocol for running hooks plugin (#283) @zwzhang0107
New Contributors
- @j4ckstraw made their first contribution in #248
- @dependabot made their first contribution in #183
- @JasonRD made their first contribution in #274
- @chzhj made their first contribution in #305
- @novahe made their first contribution in #318
Full Changelog: v0.4.1...v0.5.0
v0.4.1
What's Changed
- chore: update codecov.yaml by @saintube in #208
- fix: change directory of generate-runtime.sh by @jasonliu747 in #215
- Fix bug where runtime proxy cannot decode annotations in Docker config by @cheimu in #220
- feat: add kubelet http2 support by @LambdaHJ in #180
New Contributors
Full Changelog: v0.4.0...v0.4.1
v0.4.0
✨ Features and improvements:
- Introduce main for runtime-manager by @honpey in #171
- feature: support docker proxy by @ZYecho in #128
- feat(koordlet): support memoryEvictLowerPercent in memory evict by @shinytang6 in #132
- proposal load-aware scheduling plugin by @eahydra in #135
- koordlet: support cpu evict feature by @jasonliu747 in #169
- add group identity plugin by @zwzhang0107 in #166
🐛 Fixed bugs:
- fix(koordlet): fix be container memory request by @shinytang6 in #129
⏫ Merged pull requests:
- chore: add cache for staticcheck by @jasonliu747 in #130
- add koordlet runtime design by @zwzhang0107 in #123
- 🌱 add validation for CRD by @jasonliu747 in #133
- test(controller): add unit test for resource_calculator by @jasonliu747 in #137
- Modify memqos wmark ratio doc desc by @tianzichenone in #142
- test(controller): add unit test for config/config.go by @jasonliu747 in #134
- test(controller): add unit test for noderesource by @jasonliu747 in #138
- add scaffold of runtime hooks by @zwzhang0107 in #122
- test: use T.TempDir to create temporary test directory by @Juneezee in #151
- update LoadAwareScheduling proposal by @eahydra in #155
- chore: fix test tempdir generation by @saintube in #156
- koordlet: support NodeMetricCollectPolicy by @eahydra in #157
- add cpu qos and mv nodeslo informer to states informer by @zwzhang0107 in #153
- update codecov configuration by @saintube in #131
- koordlet: support collect BE CPU metric by @jasonliu747 in #158
- apis: introduce cpu evict fields in NodeSLO by @jasonliu747 in #161
- Add pod annotations/labels for container level hook by @honpey in #165
- fix build errors by @hormes in #160
- ci: support running unit test on multiple os by @jasonliu747 in #162
- style: format header to fix ci errors by @jasonliu747 in #167
- Introduce image service proxy under cri scenario by @honpey in #168
- runtime-manager: refactor codes about store and resource-exectutor by @honpey in #170
- Support load aware scheduling by @eahydra in #159
- koord-scheduler: update scheduler apis groupName by @eahydra in #173
- test: add ut for configmap_event_handler by @jasonliu747 in #176
- refactor tests in nodemetric package by @hormes in #175
- vendor: goodbye vendor by @jasonliu747 in #149
- test: add ut for node_event_handler by @jasonliu747 in #177
- fix: add CPU Evict check in isFeatureDisabled by @jasonliu747 in #179
- Add the runtime-manager design doc by @honpey in #178
- chore: introduce dependabot by @jasonliu747 in #181
- add more tests by @hormes in #182
- chore: remove additional cache in golangci-lint by @jasonliu747 in #192
- api: remove deprecated field in NodeSLO by @jasonliu747 in #191
- Rename runtime-manager to koord-runtime-proxy by @honpey in #195
- add more tests by @hormes in #194
- koord-runtime-proxy: add installation manual by @honpey in #198
- add ChangeLog for v0.4.0 by @eahydra in #200
🎉 New Contributors:
- @shinytang6 made their first contribution in #129
- @tianzichenone made their first contribution in #142
- @Juneezee made their first contribution in #151
- @ZYecho made their first contribution in #128
Full Changelog: v0.3.1...0.4.0
v0.3.1
v0.3.0
✨ Features and improvements :
- Support CPU burst strategy #52
- Support Memory QoS strategy #55
- Support LLC and MBA isolation strategy #56
- Protocol design between runtime-manager and hook server #62
- Improve overall code coverage from 39% to 56% #69
🐛 Fixed bugs:
- when deploy on ACK 1.18.1 koord-manager pod always crash #49
- Handle unexpected CPU info in case of koordlet panic #90
⏫ Merged pull requests:
- New feature: cpu burst strategy #73 (stormgbs)
- Introduce protocol between RuntimeManager and RuntimeHookServer #76 (honpey)
- Improve readme #88 (hormes)
- update image file format #92 (zwzhang0107)
- 🌱 add expire cache #93 (jasonliu747)
- ✨ support LLC & MBA isolation #94 (jasonliu747)
- fix cpuinfo panic on arm64 #97 (saintube)
- 📖 fix typo in docs #98 (jasonliu747)
- Introduce HookServer config loading from /etc/runtime/hookserver.d/ #100 (honpey)
- add memory qos strategy #101 (saintube)
- add an issue template and rename feature request to proposal #108 (hormes)
- Introduce cri request parsing/generate-hook-request/checkpoing logic #110 (honpey)
- 🌱 add unit test for resmanager #111 (jasonliu747)
- Add cpu suppress test and revise memory qos #112 (saintube)
- ✨ Remove deprecated go get from Makefile #116 (jasonliu747)
- 🌱 add license checker in workflow #117 (jasonliu747)
- Support cpu burst strategy #118 (stormgbs)
- 🌱 add unit test for memory evict feature #119 (jasonliu747)
- add UTs for runtime handler #125 (saintube)
- 📖 add changelog for v0.3 #126 (jasonliu747)
🎉 New Contributors
v0.2.0
Isolate resources for best-effort workloads
In Koordinator v0.2.0, we refined the ability to isolate resources for best-effort workloads.
koordlet sets the cgroup parameters according to the resources described in the Pod spec. It currently supports setting the CPU Request/Limit and the Memory Limit.
For CPU resources, only the case of request == limit is supported; support for request <= limit is planned for the next version.
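As a rough illustration of what "setting cgroup parameters from the Pod spec" can involve, the Go sketch below converts a CPU request/limit (in millicores) and a memory limit (in bytes) into the cgroup v1 values that Kubernetes conventionally uses. The type and function names are hypothetical, and this is not koordlet's actual implementation; it only shows the common arithmetic.

```go
package main

import "fmt"

const (
	cfsPeriodUs  = 100000 // default CFS period: 100ms, in microseconds
	sharesPerCPU = 1024   // cpu.shares granted per full CPU
	milliPerCPU  = 1000   // millicores per CPU
	minShares    = 2      // smallest cpu.shares value the kernel accepts
)

// cgroupParams is a hypothetical holder for the computed cgroup v1 values.
type cgroupParams struct {
	CPUShares        int64 // cpu.shares
	CFSQuotaUs       int64 // cpu.cfs_quota_us (-1 means unlimited)
	MemoryLimitBytes int64 // memory.limit_in_bytes (-1 means unlimited)
}

// toCgroupParams converts a CPU request/limit (millicores) and a memory
// limit (bytes) into cgroup values. A non-positive limit means "unset".
func toCgroupParams(cpuRequestMilli, cpuLimitMilli, memLimitBytes int64) cgroupParams {
	shares := cpuRequestMilli * sharesPerCPU / milliPerCPU
	if shares < minShares {
		shares = minShares
	}
	quota := int64(-1)
	if cpuLimitMilli > 0 {
		quota = cpuLimitMilli * cfsPeriodUs / milliPerCPU
	}
	mem := int64(-1)
	if memLimitBytes > 0 {
		mem = memLimitBytes
	}
	return cgroupParams{CPUShares: shares, CFSQuotaUs: quota, MemoryLimitBytes: mem}
}

func main() {
	// 500m CPU with request == limit, and a 1 GiB memory limit.
	p := toCgroupParams(500, 500, 1<<30)
	fmt.Printf("cpu.shares=%d cpu.cfs_quota_us=%d memory.limit_in_bytes=%d\n",
		p.CPUShares, p.CFSQuotaUs, p.MemoryLimitBytes)
}
```

With request == limit, the shares and the quota describe the same amount of CPU, which is the only case the release notes describe as supported in this version.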
Active eviction mechanism based on memory safety thresholds
When latency-sensitive applications are serving, memory usage may increase due to bursty traffic. Best-effort workloads can behave similarly, for example when the current computing load exceeds the expected resource Request/Limit.
These scenarios increase the overall memory usage of the node, which has an unpredictable impact on node-side runtime stability. For example, the quality of service of latency-sensitive applications may degrade, or the applications may even become unavailable. This is especially challenging in a co-location environment.
We implemented an active eviction mechanism based on memory safety thresholds in Koordinator.
koordlet regularly checks the recent memory usage of the node and its Pods to see whether the safety threshold is exceeded. If it is, koordlet evicts some best-effort Pods to release memory. This mechanism better protects the stability of the node and of latency-sensitive applications.
koordlet currently evicts only best-effort Pods, sorted by the Priority specified in the Pod spec: the lower the priority, the earlier the Pod is evicted. Pods with the same priority are sorted by memory usage (RSS): the higher the memory usage, the earlier the Pod is evicted. This eviction selection algorithm is not fixed; more dimensions will be considered in the future, with more refined implementations for more scenarios to achieve more reasonable evictions.
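The eviction order described above (lowest priority first, then highest memory usage among equal priorities) can be sketched as a two-key sort. This is a minimal Go sketch; the podInfo type and the sortForEviction function are hypothetical names chosen for illustration, not koordlet's code.

```go
package main

import (
	"fmt"
	"sort"
)

// podInfo is a hypothetical view of a best-effort Pod for eviction ranking.
type podInfo struct {
	Name             string
	Priority         int32 // Priority from the Pod spec
	MemoryUsageBytes int64 // recent memory usage (RSS)
}

// sortForEviction orders pods so that the first element is evicted first:
// lower priority comes first, and among equal priorities the pod with the
// higher memory usage comes first.
func sortForEviction(pods []podInfo) {
	sort.SliceStable(pods, func(i, j int) bool {
		if pods[i].Priority != pods[j].Priority {
			return pods[i].Priority < pods[j].Priority
		}
		return pods[i].MemoryUsageBytes > pods[j].MemoryUsageBytes
	})
}

func main() {
	pods := []podInfo{
		{Name: "a", Priority: 10, MemoryUsageBytes: 100},
		{Name: "b", Priority: 5, MemoryUsageBytes: 50},
		{Name: "c", Priority: 5, MemoryUsageBytes: 200},
	}
	sortForEviction(pods)
	for _, p := range pods {
		fmt.Println(p.Name, p.Priority, p.MemoryUsageBytes)
	}
}
```

In this example the order is c, b, a: both priority-5 Pods precede the priority-10 Pod, and c precedes b because it uses more memory.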
The default memory utilization safety threshold is 70%. You can modify memoryEvictThresholdPercent in the ConfigMap slo-controller-config according to your actual situation:
apiVersion: v1
kind: ConfigMap
metadata:
  name: slo-controller-config
  namespace: koordinator-system
data:
  colocation-config: |
    {
      "enable": true
    }
  resource-threshold-config: |
    {
      "clusterStrategy": {
        "enable": true,
        "memoryEvictThresholdPercent": 70
      }
    }
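To make the threshold check concrete, here is a minimal Go sketch that parses the resource-threshold-config JSON from the ConfigMap above and compares node memory utilization against memoryEvictThresholdPercent. The JSON field names come from the ConfigMap; the struct and function names, the fallback to 70 when the field is absent, and the strict "greater than" comparison are assumptions made for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// thresholdConfig mirrors the JSON shape of resource-threshold-config.
type thresholdConfig struct {
	ClusterStrategy struct {
		Enable                      bool  `json:"enable"`
		MemoryEvictThresholdPercent int64 `json:"memoryEvictThresholdPercent"`
	} `json:"clusterStrategy"`
}

// parseThresholdConfig decodes the JSON, keeping 70 as the threshold when
// the field is not present (assumed to match the documented default).
func parseThresholdConfig(raw string) (thresholdConfig, error) {
	var cfg thresholdConfig
	cfg.ClusterStrategy.MemoryEvictThresholdPercent = 70
	err := json.Unmarshal([]byte(raw), &cfg)
	return cfg, err
}

// exceedsThreshold reports whether memory utilization is above the
// configured percentage. It returns false when the strategy is disabled.
func exceedsThreshold(cfg thresholdConfig, usedBytes, totalBytes int64) bool {
	if !cfg.ClusterStrategy.Enable || totalBytes <= 0 {
		return false
	}
	usedPercent := float64(usedBytes) / float64(totalBytes) * 100
	return usedPercent > float64(cfg.ClusterStrategy.MemoryEvictThresholdPercent)
}

func main() {
	raw := `{"clusterStrategy": {"enable": true, "memoryEvictThresholdPercent": 70}}`
	cfg, err := parseThresholdConfig(raw)
	if err != nil {
		fmt.Println("invalid config:", err)
		return
	}
	const gib = int64(1) << 30
	fmt.Println(exceedsThreshold(cfg, 24*gib, 32*gib)) // 75% used
	fmt.Println(exceedsThreshold(cfg, 16*gib, 32*gib)) // 50% used
}
```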
v0.1.0
Node Metrics
Koordinator defines the NodeMetric CRD, which records the resource utilization of a single node and of all Pods on that node. koordlet regularly reports and updates NodeMetric. You can view a NodeMetric with the following command.
$ kubectl get nodemetrics node-1 -o yaml
apiVersion: slo.koordinator.sh/v1alpha1
kind: NodeMetric
metadata:
  creationTimestamp: "2022-03-30T11:50:17Z"
  generation: 1
  name: node-1
  resourceVersion: "2687986"
  uid: 1567bb4b-87a7-4273-a8fd-f44125c62b80
spec: {}
status:
  nodeMetric:
    nodeUsage:
      resources:
        cpu: 138m
        memory: "1815637738"
  podsMetric:
  - name: storage-service-6c7c59f868-k72r5
    namespace: default
    podUsage:
      resources:
        cpu: "300m"
        memory: 17828Ki
Colocation Resources
After Koordinator is deployed in the K8s cluster, it calculates the CPU and Memory resources that have been allocated but not used, based on the NodeMetric data. These resources are published on the Node as extended resources.
koordinator.sh/batch-cpu represents the CPU resources for Best Effort workloads, and koordinator.sh/batch-memory represents the Memory resources for Best Effort workloads.
You can view these resources with the following command.
$ kubectl describe node node-1
Name:               node-1
....
Capacity:
  cpu:                          8
  ephemeral-storage:            103080204Ki
  koordinator.sh/batch-cpu:     4541
  koordinator.sh/batch-memory:  17236565027
  memory:                       32611012Ki
  pods:                         64
Allocatable:
  cpu:                          7800m
  ephemeral-storage:            94998715850
  koordinator.sh/batch-cpu:     4541
  koordinator.sh/batch-memory:  17236565027
  memory:                       28629700Ki
  pods:                         64
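One simplified reading of "allocated but not used" is the sum, over all Pods on a node, of requested CPU minus measured CPU usage. The Go sketch below computes that value in millicores. It is only an illustration under that assumption: the type and function names are hypothetical, and the formula Koordinator actually applies may also account for reservations, system usage, or safety margins that these notes do not describe.

```go
package main

import "fmt"

// podUsage is a hypothetical pairing of a Pod's CPU request with the
// usage reported for it, both in millicores.
type podUsage struct {
	Name            string
	RequestMilliCPU int64
	UsedMilliCPU    int64
}

// reclaimableMilliCPU sums the CPU that Pods requested but are not using.
// Pods that use more than they requested contribute nothing.
func reclaimableMilliCPU(pods []podUsage) int64 {
	var total int64
	for _, p := range pods {
		if unused := p.RequestMilliCPU - p.UsedMilliCPU; unused > 0 {
			total += unused
		}
	}
	return total
}

func main() {
	pods := []podUsage{
		{Name: "storage-service", RequestMilliCPU: 2000, UsedMilliCPU: 300},
		{Name: "bursty-service", RequestMilliCPU: 1000, UsedMilliCPU: 1200},
	}
	fmt.Println(reclaimableMilliCPU(pods))
}
```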
Cluster-level Colocation Profile
To make it easier to use Koordinator to co-locate different workloads, we defined ClusterColocationProfile to help workloads adopt co-location resources gradually (gray release). A ClusterColocationProfile is a CRD like the one below. Edit each parameter to fit your own use case.
apiVersion: config.koordinator.sh/v1alpha1
kind: ClusterColocationProfile
metadata:
  name: colocation-profile-example
spec:
  namespaceSelector:
    matchLabels:
      koordinator.sh/enable-colocation: "true"
  selector:
    matchLabels:
      sparkoperator.k8s.io/launched-by-spark-operator: "true"
  qosClass: BE
  priorityClassName: koord-batch
  koordinatorPriority: 1000
  schedulerName: koord-scheduler
  labels:
    koordinator.sh/mutated: "true"
  annotations:
    koordinator.sh/intercepted: "true"
  patch:
    spec:
      terminationGracePeriodSeconds: 30
Koordinator components ensure scheduling and runtime quality through the labels koordinator.sh/qosClass and koordinator.sh/priority, together with the Kubernetes native priority.
With the mutating webhook mechanism provided by Kubernetes, koord-manager modifies Pod resource requirements to use co-located resources, and injects the QoS and Priority defined by Koordinator into the Pod.
Taking the above Profile as an example, when the Spark Operator creates a new Pod in a namespace with the koordinator.sh/enable-colocation=true label, the Koordinator QoS label koordinator.sh/qosClass is injected into the Pod. The Pod's PriorityClassName and the corresponding Priority value are set according to the priorityClassName defined in the Profile. Users can also set the Koordinator Priority to achieve more fine-grained priority management, so the Koordinator Priority label koordinator.sh/priority is also injected into the Pod. Koordinator provides an enhanced scheduler, koord-scheduler, so the Profile also sets the Pod's scheduler name to koord-scheduler.
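The matching and injection steps described above can be sketched in Go as follows. The label keys come from the text; the profile and pod types, the function names, and the simplified equality-only label matching are assumptions for illustration. The sketch leaves out the extra labels, annotations, and patch fields of the Profile, as well as the rewriting of resource requirements, and it is not koord-manager's webhook code.

```go
package main

import (
	"fmt"
	"strconv"
)

const (
	labelQoSClass = "koordinator.sh/qosClass"
	labelPriority = "koordinator.sh/priority"
)

// profile is a hypothetical, reduced form of a ClusterColocationProfile spec.
type profile struct {
	NamespaceSelector   map[string]string
	PodSelector         map[string]string
	QoSClass            string
	PriorityClassName   string
	KoordinatorPriority int32
	SchedulerName       string
}

// pod is a hypothetical, reduced form of a Pod.
type pod struct {
	Labels            map[string]string
	PriorityClassName string
	SchedulerName     string
}

// selectorMatches reports whether every key/value in selector is present in labels.
func selectorMatches(selector, labels map[string]string) bool {
	for k, want := range selector {
		if got, ok := labels[k]; !ok || got != want {
			return false
		}
	}
	return true
}

// applyProfile mutates p when both the namespace and the Pod match the
// profile, and reports whether a mutation happened.
func applyProfile(pr profile, namespaceLabels map[string]string, p *pod) bool {
	if !selectorMatches(pr.NamespaceSelector, namespaceLabels) ||
		!selectorMatches(pr.PodSelector, p.Labels) {
		return false
	}
	if p.Labels == nil {
		p.Labels = map[string]string{}
	}
	p.Labels[labelQoSClass] = pr.QoSClass
	p.Labels[labelPriority] = strconv.FormatInt(int64(pr.KoordinatorPriority), 10)
	p.PriorityClassName = pr.PriorityClassName
	p.SchedulerName = pr.SchedulerName
	return true
}

func main() {
	pr := profile{
		NamespaceSelector:   map[string]string{"koordinator.sh/enable-colocation": "true"},
		PodSelector:         map[string]string{"sparkoperator.k8s.io/launched-by-spark-operator": "true"},
		QoSClass:            "BE",
		PriorityClassName:   "koord-batch",
		KoordinatorPriority: 1000,
		SchedulerName:       "koord-scheduler",
	}
	ns := map[string]string{"koordinator.sh/enable-colocation": "true"}
	p := &pod{Labels: map[string]string{"sparkoperator.k8s.io/launched-by-spark-operator": "true"}}
	fmt.Println(applyProfile(pr, ns, p), p.Labels[labelQoSClass], p.Labels[labelPriority], p.SchedulerName)
}
```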
If you expect to integrate Koordinator into your own system, please learn more about the core concepts.
CPU Suppress
To ensure the runtime quality of different workloads in co-located scenarios, Koordinator uses the CPU Suppress mechanism provided by koordlet on the node side to suppress Best Effort workloads when the load increases, and to increase the resource quota for Best Effort workloads when the load decreases.
When installing through the helm chart, the ConfigMap slo-controller-config is created in the koordinator-system namespace, and the CPU Suppress mechanism is enabled by default. To disable it, modify the resource-threshold-config section as shown in the configuration below.
apiVersion: v1
kind: ConfigMap
metadata:
  name: slo-controller-config
  namespace: {{ .Values.installation.namespace }}
data:
  ...
  resource-threshold-config: |
    {
      "clusterStrategy": {
        "enable": false
      }
    }
Colocation Resources Balance
Koordinator currently uses a scheduling strategy for node co-location resources that prefers machines with more co-location resources remaining, to avoid Best Effort workloads crowding together. Richer scheduling capabilities are on the way.
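The "prefer the node with the most remaining co-location resources" idea can be sketched as a simple selection over nodes. The Go code below is a minimal illustration for batch CPU only; the nodeBatch type and the pickNode function are hypothetical, and the real koord-scheduler scoring is not described in these notes and may differ.

```go
package main

import "fmt"

// nodeBatch is a hypothetical summary of a node's batch CPU, in millicores.
type nodeBatch struct {
	Name                string
	AllocatableMilliCPU int64 // e.g. the node's koordinator.sh/batch-cpu
	RequestedMilliCPU   int64 // batch CPU already requested by Pods
}

// remaining returns the batch CPU still free on the node, never negative.
func remaining(n nodeBatch) int64 {
	r := n.AllocatableMilliCPU - n.RequestedMilliCPU
	if r < 0 {
		return 0
	}
	return r
}

// pickNode returns the node with the most remaining batch CPU among those
// that can fit the Pod. The boolean is false when no node fits.
func pickNode(nodes []nodeBatch, podMilliCPU int64) (string, bool) {
	best, bestRemaining, found := "", int64(-1), false
	for _, n := range nodes {
		r := remaining(n)
		if r < podMilliCPU {
			continue // the Pod does not fit on this node
		}
		if r > bestRemaining {
			best, bestRemaining, found = n.Name, r, true
		}
	}
	return best, found
}

func main() {
	nodes := []nodeBatch{
		{Name: "node-1", AllocatableMilliCPU: 4541, RequestedMilliCPU: 1000},
		{Name: "node-2", AllocatableMilliCPU: 4000, RequestedMilliCPU: 3000},
		{Name: "node-3", AllocatableMilliCPU: 2000, RequestedMilliCPU: 0},
	}
	name, ok := pickNode(nodes, 500)
	fmt.Println(name, ok)
}
```

Choosing the node with the most free batch CPU spreads Best Effort Pods out, which matches the stated goal of avoiding crowding.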