Thoughts on concurrency

Andreas Frisch edited this page May 6, 2013 · 4 revisions

Multiple build-slaves

Jenkins supports multiple build-slaves, which introduces some concurrency issues when doing pre-tested commits. Each build merges a given changeset with the current company truth, so a build depends on the builds before it, as these update the company truth if they succeed. When multiple builds start on multiple slaves, however, some of them will use a potentially outdated company truth, because their sibling jobs may or may not update the company truth on completion.

This concurrency problem is only relevant when a previous build succeeds, though. If the company truth has been updated since a given job started building, we need to re-merge and re-build, which means the computing power spent on that job's first build was wasted. Conversely, if the company truth is unchanged, we can push as expected. This also means that concurrent builds that fail can safely be ignored, since they never update the company truth.
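The decision described above can be sketched roughly as follows. This is only an illustration, not the plugin's actual code: the names `Action` and `decide` are hypothetical, the company truth is represented as an opaque revision identifier, and the handling of a failed build (discarding it) is an assumption.

```python
from enum import Enum


class Action(Enum):
    PUSH = "push"
    REBUILD = "re-merge and re-build"
    DISCARD = "discard"


def decide(truth_at_start: str, truth_now: str, build_succeeded: bool) -> Action:
    """Decide what to do with a finished build (hypothetical helper).

    truth_at_start: revision of the company truth when the build started.
    truth_now:      revision of the company truth when the build finished.
    """
    if not build_succeeded:
        # Assumption: a failed build never touches the company truth.
        return Action.DISCARD
    if truth_now != truth_at_start:
        # A sibling build succeeded in the meantime, so our merge is outdated.
        return Action.REBUILD
    # Nothing changed underneath us; the tested merge is still valid.
    return Action.PUSH
```

Note that the check does not need to know how many slaves exist; it only compares two revisions, so it would behave the same on a single-slave setup.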

This check for updates to the company truth can be done whether or not we use multiple build-slaves, so it integrates trivially with an existing Jenkins setup. However, using this functionality should be optional, as wasting computing power on a heavily loaded machine is silly when (and if) we expect most builds to succeed. On the other hand, if we expect many errors and conflicts and care only about speed-ups, not about computing power consumption, this option is highly desirable.

As such, handling multiple slaves should be a (de)selectable option in the general settings of this plugin. However, this is a special use-case which is not imperative to handle at this point in development.

Small push-interval

When a user (or multiple users) pushes multiple times within a short time interval, Jenkins only registers one (or a few) of these pushes, as Jenkins apparently disregards new build orders during its quiet-period. This quiet-period is a Jenkins thing, not a PRTECO-thing. However, although this problem also exists in classic Jenkins, we cannot simply ignore it, as the PRTECO-workflow is markedly different.

If we cannot overrule the quiet-period, we must notify users when their tasks have been canceled.

We must test for the smallest window of opportunity for posting multiple jobs. This must be done automatically, as we need to know the exact amount of request-down-time following a given task. Depending on the size of this window, we must decide whether it is worth looking into further.
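One way such an automatic test could work is a bisection over the gap between two build orders. The sketch below is only a rough idea under several assumptions: `probe` is a hypothetical callable that posts two orders `gap` seconds apart and reports whether Jenkins registered both, and we assume that once a gap is large enough, every larger gap is too.

```python
from typing import Callable


def smallest_safe_gap(
    probe: Callable[[float], bool],
    low: float,
    high: float,
    tolerance: float = 0.1,
) -> float:
    """Bisect for the smallest gap (in seconds) at which probe() succeeds.

    probe(gap) is a hypothetical test hook: it should post two build
    orders `gap` seconds apart and return True if both were registered.
    """
    if not probe(high):
        raise ValueError("even the largest gap loses build orders")
    while high - low > tolerance:
        mid = (low + high) / 2
        if probe(mid):
            high = mid  # mid is safe, try something smaller
        else:
            low = mid   # mid still loses orders, go larger
    return high
```

In a real measurement each probe would take at least `gap` seconds and should probably be repeated a few times, since timing on a loaded machine is noisy.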

Potentially, we can delay the build order from changegrouphook.py by keeping a small queue that separates the incoming requests on JR by an amount of time slightly larger than the measured window.
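A minimal sketch of the spacing logic such a queue might use is shown below. It is not taken from changegrouphook.py; the function name, the `margin` parameter, and the use of plain timestamps in seconds are all assumptions made for illustration. It only computes when each order could be forwarded, leaving the actual waiting and posting to the caller.

```python
from typing import List, Sequence


def schedule_dispatch_times(
    arrival_times: Sequence[float],
    window: float,
    margin: float = 0.5,
) -> List[float]:
    """Return the earliest time each build order may be forwarded.

    Orders are spaced at least `window + margin` seconds apart, where
    `window` is the measured request-down-time and `margin` is a
    hypothetical safety margin ("slightly larger than the window").
    """
    gap = window + margin
    dispatch: List[float] = []
    for arrival in sorted(arrival_times):
        if dispatch:
            # Never forward before the previous order's quiet window is over.
            earliest = max(arrival, dispatch[-1] + gap)
        else:
            earliest = arrival
        dispatch.append(earliest)
    return dispatch
```

For example, with a 5 second window, three pushes arriving at 0, 1 and 2 seconds would be forwarded at 0, 5.5 and 11 seconds, while a later push at 30 seconds would go through immediately.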