Widely adopted Azure Functions (e.g. StartStop VM V2) that run on a timer are being deployed to customer subscriptions with the default timer values. Once enough customers have adopted a function, every deployment executes at the same moment dictated by the default timer value, and the combined traffic is enough to overwhelm the services that function calls.
We need a way to add a jitter value to the timer configuration. This would smooth out the execution times of functions that share the same timer settings and prevent large call spikes to downstream services.
Repro steps
Provide the steps required to reproduce the problem
Schedule a function app to execute at the same time across a large number of customer subscriptions, for example with the schedule 0 0 0 * * *
Observe that the function executes simultaneously across all customer subscriptions (at midnight in the case of 0 0 0 * * *)
Expected behavior
With a jitter value added to the timer config, the calls would be semi-randomly spread throughout the jitter window across all subscriptions and functions. Calls to downstream services would then be spread out, and those services would be better able to handle the load as adoption increases.
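To illustrate the expected effect, here is a minimal simulation sketch, not an Azure Functions API. The function name `peak_calls_per_second`, the uniform distribution of offsets, and the one-second bucket size are all assumptions made for illustration; the real jitter setting requested above does not exist yet.

```python
import random
from collections import Counter


def peak_calls_per_second(num_functions: int, jitter_window_s: int, seed: int = 0) -> int:
    """Return the largest number of calls landing in any one-second bucket.

    Hypothetical model: every function shares the same schedule and each one
    delays its start by a uniformly random whole number of seconds in
    [0, jitter_window_s].
    """
    rng = random.Random(seed)  # seeded so the simulation is repeatable
    offsets = [rng.randint(0, jitter_window_s) for _ in range(num_functions)]
    return max(Counter(offsets).values())


if __name__ == "__main__":
    # No jitter: every call hits the downstream service in the same second.
    print(peak_calls_per_second(10_000, 0))
    # A 10-minute jitter window spreads the same calls over 601 buckets.
    print(peak_calls_per_second(10_000, 600))
```

With a window of zero the peak equals the number of functions; with any non-trivial window the peak drops roughly in proportion to the window length.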
Actual behavior
Execution happens simultaneously across all customer subscriptions (at midnight in the case of 0 0 0 * * *), causing brownouts or blackouts of downstream services.
Known workarounds
Take the machine name, which should be unique at execution time
Hash the machine name to get an integer value
Set the execution minute to the hash modulo 1440 for a daily schedule, or modulo 360 for a 6-hour schedule
With the function triggered every minute, take the current minute of the day (0-1439; reduce it modulo 360 for the 6-hour case) and compare it with the execution minute. If they are equal, execute; otherwise no-op.
This algorithm should be consistent, because the machine names of a function that runs every minute should be stable and rarely change. It also spreads the load throughout the day. To help further, we can add a small random jitter via a Sleep to spread the load within each minute as well.
Related information
N/A
Package version
Links to source