Updating default value for kube-api-qps, kube-api-burst, apply-concurrency and wait-concurrency for kapp #1410

Closed

Member

@prembhaskal prembhaskal commented Nov 22, 2023

What this PR does / why we need it:

This PR changes the default values of the kapp parameters used when kapp is invoked from kapp-controller. The table below lists the current defaults, the proposed defaults, and the proposed values for larger packages.

| Parameter | Current default | Proposed default | Values for large packages |
|---|---|---|---|
| kube-api-qps | 1000 | 50 | 200 |
| kube-api-burst | 1000 | 100 | 400 |
| apply-concurrency | 5 | 10 | 15 |
| wait-concurrency | 5 | 10 | 15 |
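
As a rough illustration, the three columns of the table can be rendered as kapp command-line flags. This is a minimal sketch: it assumes each parameter maps to a kapp flag of the same name, and `PROFILES` and `kapp_flags` are hypothetical names introduced only for this example. How the flags are handed to kapp-controller is not shown here.

```python
# Values copied from the table above.
# Assumption: each parameter corresponds to a kapp flag spelled "--<name>".
PROFILES = {
    "current": {
        "kube-api-qps": 1000,
        "kube-api-burst": 1000,
        "apply-concurrency": 5,
        "wait-concurrency": 5,
    },
    "proposed": {
        "kube-api-qps": 50,
        "kube-api-burst": 100,
        "apply-concurrency": 10,
        "wait-concurrency": 10,
    },
    "large-package": {
        "kube-api-qps": 200,
        "kube-api-burst": 400,
        "apply-concurrency": 15,
        "wait-concurrency": 15,
    },
}


def kapp_flags(profile: str) -> list[str]:
    """Render one column of the table as a list of '--name=value' flags."""
    return [f"--{name}={value}" for name, value in PROFILES[profile].items()]


if __name__ == "__main__":
    for profile in PROFILES:
        print(profile, " ".join(kapp_flags(profile)))
```

Keeping the values in one mapping makes it easy to compare the columns side by side or to add another profile for a package of a different size.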

Reasoning behind changes:

  • For kube-api-qps and kube-api-burst:

    • The kapp client is used in kapp-controller for App reconciliation. kapp-controller has a default App CR concurrency of 10, so the total API QPS can reach 10k (1000 QPS x 10 concurrent App CRs), and the total burst can also reach 10k. Such a high QPS can overload the apiserver and slow down the cluster, not just for kapp-controller but for every other user of the apiserver.
    • With lower defaults of 50 for QPS and 100 for burst, the totals are limited to 500 QPS and 1000 burst when App CR concurrency is 10.
    • For any small to medium App (fewer than 100 resources), a QPS of 50 and a burst of 100 are sufficient and do not slow down reconciliation.
    • For any large Package/App (more than 100 resources), the default values can be changed in the Package itself.
  • For apply-concurrency and wait-concurrency:

    • With concurrency increased to 10, reconcile times are reduced.
    • With a large app, this effect is clearly visible (see the results below).
    • With a small to medium app, the effect is barely noticeable.
  • Test results with different apps.

    • Test with varying qps

| QPS vs reconcile time | qps=5, burst=5 | qps=50, burst=100 | qps=1000, burst=1000 |
|---|---|---|---|
| 500 instances of a simple App with 1 job sleeping for 30 s | 32 min | 30 min 30 s | 30 min 15 s |
| 500 instances of kapp-controller deployed on another cluster | 10 min 45 s | 9 min 45 s | 9 min 45 s |
| 1 instance of an App with 500 each of namespace, configmap, secrets, roles, rolebind, serviceaccount | -- | 6 min 30 s | 2 min |

  • Test with varying qps and concurrency

  • Test with the kapp-controller app deployed on another cluster.

App: kapp-controller, 500 instances
pkgs in global ns: 600 (500 + 100)

| apply-concurrency | wait-concurrency | kube-api-qps | kube-api-burst | reconcile time (avg) |
|---|---|---|---|---|
| 5 | 5 | 1000 | 1000 | 9 min |
| 5 | 5 | 50 | 100 | 9 min |
| 10 | 10 | 50 | 100 | 9 min 15 s |
| 10 | 10 | 1000 | 1000 | 9 min |

  • Test with the kube-prometheus-stack helm chart deployed as an app (medium-sized app)

App: kube-prometheus-stack

| apply-concurrency | wait-concurrency | kube-api-qps | kube-api-burst | reconcile time (avg) | run1 | run2 | run3 | run4 |
|---|---|---|---|---|---|---|---|---|
| 5 | 5 | 1000 | 1000 | 29.5 s | 28 s | 33 s | 24 s | 33 s |
| 10 | 10 | 50 | 100 | 30.7 s | 24 s | 34 s | 36 s | 29 s |
| 10 | 10 | 100 | 200 | 25.2 s | 23 s | 33 s | 23 s | 22 s |
| 10 | 10 | 1000 | 1000 | 26.0 s | 32 s | 21 s | 30 s | 21 s |

  • Test with a large app deploying 4k resources

App: kapp-meta-app
App contents: 500 of each: ns, sa, role, rolebind, 3 secrets, 1 configmap (around 4k resources in total)

| apply-concurrency | wait-concurrency | kube-api-qps | kube-api-burst | reconcile time (avg) |
|---|---|---|---|---|
| 5 (default) | 5 (default) | 1000 (default) | 1000 (default) | 2 min |
| 5 (default) | 5 (default) | 50 | 100 | 6 min 30 s |
| 10 | 10 | 50 | 100 | 6 min 40 s |
| 10 | 10 | 1000 (default) | 1000 (default) | 1 min 30 s |
| 10 | 10 | 100 | 200 | 3 min 10 s |
| 10 | 10 | 200 | 400 | 1 min 30 s |
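
The arithmetic behind the QPS reasoning can be sketched in a few lines. The first function reproduces the worst-case totals quoted in the description (per-client limit times App CR concurrency). The second gives a rough lower bound on how long a number of API calls takes through a client-side token bucket. Both function names are hypothetical, and the second rests on assumptions not stated in the PR: the limiter behaves like a plain token bucket that starts full, and the request count is known. The 4000-request figure in the usage below is only "one call per resource" for the large test app; a real deploy issues more calls per resource, so measured reconcile times are expected to be well above this bound.

```python
def aggregate_limits(qps, burst, app_concurrency=10):
    """Worst-case combined limits when every concurrent App reconcile
    runs its own kapp client with the given per-client qps and burst.

    app_concurrency=10 is the kapp-controller default quoted in the PR.
    """
    return qps * app_concurrency, burst * app_concurrency


def min_seconds(requests, qps, burst):
    """Lower bound on wall time for `requests` API calls.

    Assumption: a token bucket that starts full, so the first `burst`
    calls go out immediately and the rest drain at `qps` per second.
    """
    return max(0.0, (requests - burst) / qps)


if __name__ == "__main__":
    print("current defaults :", aggregate_limits(1000, 1000))
    print("proposed defaults:", aggregate_limits(50, 100))

    # Hypothetical request count: one call per resource of the 4k-resource app.
    for qps, burst in [(50, 100), (200, 400), (1000, 1000)]:
        print(f"qps={qps} burst={burst}: at least {min_seconds(4000, qps, burst)} s")
```

Under these assumptions the bound grows in inverse proportion to the QPS, which is consistent with the large app being the only test where lowering the QPS to 50 visibly increased the reconcile time.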

Which issue(s) this PR fixes:

Fixes #

Does this PR introduce a user-facing change?

None

Additional Notes for your reviewer:

Review Checklist:
  • Follows the developer guidelines
  • Relevant tests are added or updated
  • Relevant docs in this repo added or updated
  • Relevant carvel.dev docs added or updated in a separate PR and there's
    a link to that PR
  • Code is at least as readable and maintainable as it was before this
    change

Additional documentation e.g., Proposal, usage docs, etc.:


Signed-off-by: Premkumar Bhaskal <[email protected]>
@prembhaskal prembhaskal marked this pull request as ready for review November 23, 2023 04:16
@prembhaskal prembhaskal changed the title Updating default value for kube-api-qps and kube-api-burst for kapp Updating default value for kube-api-qps, kube-api-burst, apply-concurrency and wait-concurrency for kapp Nov 27, 2023
@prembhaskal prembhaskal marked this pull request as draft December 19, 2023 08:28
@prembhaskal
Member Author

As discussed with @praveenrewar, this is not needed right now because most apps are small and do not hit the QPS limit. I will raise a separate PR to reduce the apply timeout.

@prembhaskal prembhaskal closed this Jan 6, 2024