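Servers resources for lustre, when filled in by Flux, can look something like this. Below is a minimal sketch, written as the Python structure Flux might hand to the Kubernetes API rather than as the manifest itself; the allocationSets/storage layout is an assumed shape, and only allocationSize, allocationCount, and the elcap1/elcap2 entries come from the discussion that follows:

```python
TiB = 2**40

# Rough sketch of the "Servers" spec Flux fills in for a Lustre filesystem.
# The allocationSets/storage layout is assumed for illustration; allocationSize
# and allocationCount are the fields discussed below.
servers_spec = {
    "allocationSets": [
        {
            "label": "ost",
            "allocationSize": 1 * TiB,  # size of every individual OST allocation
            "storage": [
                {"name": "elcap1", "allocationCount": 3},  # 3 OSTs carved from elcap1
                {"name": "elcap2", "allocationCount": 1},  # 1 OST carved from elcap2
            ],
        },
    ],
}
```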
What this gives us is 3 OSTs on elcap1 and 1 on elcap2. However, as @behlendorf noted,
"For Lustre we realistically wouldn't want to ever create more than one OST or MDT per rabbit per workflow. It's good that HPE's software supports it since it'd be nice to experiment with, but it would be an odd configuration."
However, I think there is a disconnect between the way Flux allocates storage and the way Servers asks for the storage to be represented. At the moment Flux does not have any kind of policy to allocate equal amounts of storage from each rabbit. Flux may allocate a huge chunk of storage (say N bytes) from elcap1 and a much smaller amount (M bytes) on elcap2 (as in the example above), while still wanting a single OST (and perhaps MDT) on each despite the size difference. But there is no good way for us to represent that in Servers without doing something like the above: take the greatest common divisor of N and M, make that the allocationSize, and then set the allocationCount for each node to N / GCD(N, M) and M / GCD(N, M) respectively.
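As a concrete illustration of that workaround (a sketch only; the helper below is not Flux or Servers code), assuming for example N = 3 TiB and M = 1 TiB:

```python
from math import gcd

TiB = 2**40

def gcd_workaround(n_bytes, m_bytes):
    """Represent an n-byte allocation on elcap1 and an m-byte allocation on
    elcap2 with a single allocationSize plus per-node allocationCounts."""
    size = gcd(n_bytes, m_bytes)  # becomes the allocationSize
    return {
        "allocationSize": size,
        "storage": [
            {"name": "elcap1", "allocationCount": n_bytes // size},
            {"name": "elcap2", "allocationCount": m_bytes // size},
        ],
    }

# N = 3 TiB, M = 1 TiB -> allocationSize = 1 TiB, counts of 3 and 1,
# i.e. the 3 OSTs on elcap1 and 1 OST on elcap2 from the example above.
print(gcd_workaround(3 * TiB, 1 * TiB))
```

Note that if N and M happen to share only a small common factor, the GCD (and therefore allocationSize) can end up tiny and the allocationCounts correspondingly huge, which is part of what makes this representation awkward.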
@behlendorf also noted that imbalanced allocations may not be desirable, since
"Lustre will do a better job of balancing their usage if [OSTs are] all close to the same capacity."
So Flux may need to work on a policy to equalize the amount of storage on each rabbit node. However, it might be nice if we could do something like the following:
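Purely as an illustration of the kind of flexibility meant here (the per-entry allocationSize shown below is a hypothetical field, not something the Servers schema currently offers), a representation that lets each storage entry carry its own size would allow exactly one OST per rabbit regardless of how imbalanced the allocations are:

```python
TiB = 2**40

# Hypothetical alternative: allocationSize specified per storage entry instead
# of per allocation set, so each rabbit gets a single OST of whatever size Flux
# allocated there.  This field layout is invented purely for illustration.
hypothetical_spec = {
    "allocationSets": [
        {
            "label": "ost",
            "storage": [
                {"name": "elcap1", "allocationSize": 3 * TiB, "allocationCount": 1},
                {"name": "elcap2", "allocationSize": 1 * TiB, "allocationCount": 1},
            ],
        },
    ],
}
```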
Brian has indicated to me that OSTs of unequal sizes may be devastating to performance. So perhaps the problem is really in Flux's scheduling policy. I've opened flux-framework/flux-coral2#175.
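One very rough sketch of what an equalizing policy could look like (illustrative only, and not what flux-framework/flux-coral2#175 actually implements): split the requested capacity evenly across the selected rabbits so every OST comes out the same size.

```python
TiB = 2**40

def equalized_allocations(total_bytes, rabbits):
    """Sketch of an equalizing policy: divide the requested capacity evenly
    across the selected rabbits so all OSTs end up the same size."""
    per_rabbit = total_bytes // len(rabbits)
    return {name: per_rabbit for name in rabbits}

# e.g. 4 TiB split across elcap1 and elcap2 -> 2 TiB (one equal-sized OST) on each
print(equalized_allocations(4 * TiB, ["elcap1", "elcap2"]))
```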