Back-end configuration for clusters and batch systems #106

benkrikler · 2019-11-08T05:29:28Z

We will likely need a way to configure back-ends for the cluster being run on:

@benkrikler Given that all clusters are a bit different and you have to tweak settings, you'll probably need some configuration boilerplate in your yaml file where the workflow is defined.

Perhaps something like --mode coffea:local and --mode coffea:cluster; if it's coffea:cluster, it looks for a cluster config in the yaml file and sets up the right call to coffea.

Originally posted by @lgray in #88 (comment)

Relates to #55, although the original scope of that was smaller.
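
To make the quoted suggestion concrete, here is a minimal sketch of how a "backend:variant" mode string could be split and used to look up a cluster section in the workflow config. The function names, the "cluster" key and the layout of the config are assumptions made for illustration only; nothing here is existing fast-carpenter or coffea API, and the config is taken as an already-parsed dictionary.

```python
def parse_mode(mode):
    """Split a 'backend:variant' string; the variant defaults to 'local'."""
    backend, _, variant = mode.partition(":")
    if not backend:
        raise ValueError(f"no backend given in mode {mode!r}")
    return backend, variant or "local"


def cluster_settings(mode, workflow_cfg):
    """Return the cluster settings for this mode, or None when running locally.

    workflow_cfg is the already-parsed YAML workflow file (a dict). The
    'cluster' top-level key is a hypothetical layout, not an agreed format.
    """
    backend, variant = parse_mode(mode)
    if variant != "cluster":
        return None
    try:
        return workflow_cfg["cluster"][backend]
    except KeyError:
        raise KeyError(
            f"mode {mode!r} needs a 'cluster' section with a {backend!r} entry"
        ) from None
```

With this shape, --mode coffea:local never touches the cluster section, while --mode coffea:cluster fails early with a clear message if the section is missing.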


benkrikler · 2019-11-08T05:36:24Z

I've been thinking about this too. I can think of two ways to support this:

1. Total user flexibility through some cluster configuration mechanism, be that a Python module or a YAML config, either included in the processing config or in a new file, or
2. A built-in configuration system which identifies which cluster it is on and uses the matching configuration.

Option 1 is more general and flexible, and less "clever", so there is less chance of strange, unexpected bugs than with option 2. Option 2, however, reduces the amount of code and config a user has to write, which should give them a nicer package, and it also increases how much code is shared between users on the same site.

I think we can probably try to do both 1 and 2: we build a mechanism to configure sites on a user-by-user basis, but by default fill this in "automatically" based on some cluster discovery service...
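
As a rough sketch of combining the two options, the snippet below starts from defaults discovered for the current site and lets any user-supplied settings override them. The site table, the hostname-suffix matching and all setting names are hypothetical; a real discovery service could work quite differently.

```python
import socket

# Hypothetical built-in site defaults, keyed by hostname suffix.
SITE_DEFAULTS = {
    ".example-site.org": {"scheduler": "htcondor", "max_jobs": 100},
}

# Used when the host does not match any known site.
FALLBACK = {"scheduler": "local", "max_jobs": 1}


def discover_site(hostname=None):
    """Guess the site settings from the machine's fully qualified hostname."""
    hostname = hostname or socket.getfqdn()
    for suffix, settings in SITE_DEFAULTS.items():
        if hostname.endswith(suffix):
            return dict(settings)
    return dict(FALLBACK)


def resolve_cluster_config(user_cfg=None, hostname=None):
    """Discovered defaults first, then user settings take precedence."""
    config = discover_site(hostname)
    config.update(user_cfg or {})
    return config
```

Keeping the user settings as the last layer means the automatic part can never silently override something a user asked for explicitly.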

One thing I don't want to do, however, is mix the description of the cluster with the description of the analysis itself (as interpreted by fast-flow). This will mean passing in a third config file, which will need to be parsed before the others in order to provide the correct back-ends to them.

benkrikler · 2020-08-07T01:13:42Z

As of PR #129 we now have an additional command-line option to provide a config file to configure the backend. The exact format of this file is left up to the backend that has been selected, which is currently chosen using the --mode option. With this in place, I think we have enough to close this off for now, although we will probably return to this in the future as we explore what is needed from this config, what is standard, etc.
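
For readers who want a picture of the general shape, here is a minimal sketch of a command line that takes a mode plus a separate backend config file and reads that file before anything else. The option name --backend-config, the positional argument and the use of JSON are all assumptions for this illustration; they are not the names or format introduced by PR #129, where the format is left to the selected backend.

```python
import argparse
import json


def build_parser():
    parser = argparse.ArgumentParser(description="Sketch of backend selection")
    parser.add_argument("sequence_cfg", help="analysis description file")
    parser.add_argument("--mode", default="local", help="backend to run with")
    # Hypothetical option name, used only for this sketch.
    parser.add_argument("--backend-config", default=None,
                        help="file with settings for the chosen backend")
    return parser


def load_backend_config(path):
    """Read the backend file; JSON is used here only to stay in the stdlib."""
    if path is None:
        return {}
    with open(path) as handle:
        return json.load(handle)


def main(argv=None):
    args = build_parser().parse_args(argv)
    # The backend settings are loaded first, so the backend can be set up
    # before the analysis description is interpreted.
    backend_cfg = load_backend_config(args.backend_config)
    return args.mode, backend_cfg, args.sequence_cfg
```

Because the backend file is separate, the analysis description stays free of any site-specific settings.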

benkrikler added the enhancement (New feature or request) and Backend (Relates to a processing system) labels on Nov 8, 2019

benkrikler closed this as completed Aug 7, 2020
