Merge release v0.1.8

Release v0.1.8

roehrich-hpe authored Sep 16, 2024
2 parents d7bdba4 + 135112a commit f3d7b7a
Showing 2 changed files with 41 additions and 84 deletions.
123 changes: 40 additions & 83 deletions docs/guides/data-movement/readme.md
@@ -3,115 +3,72 @@ authors: Blake Devcich <[email protected]>
categories: provisioning
---

# Data Movement Overview

## Configuration

Data Movement can be configured in multiple ways:

1. Server side (`NnfDataMovementProfile`)
2. Per Copy Offload API request arguments

The first method is a "global" configuration - it affects all data movement operations that use a
particular `NnfDataMovementProfile` (or the default). The second is done per the Copy Offload API,
which allows for some configuration on a per-case basis, but is limited in scope. Both methods are
meant to work in tandem.

## Server Side ConfigMap

The server side configuration is done via the `nnf-dm-config` config map:

```bash
kubectl -n nnf-dm-system get configmap nnf-dm-config
```

The config map allows you to configure the following:

|Setting|Description|
|-------|-----------|
|slots|The number of slots specified in the MPI hostfile. A value less than 1 disables the use of slots in the hostfile.|
|maxSlots|The number of max_slots specified in the MPI hostfile. A value less than 1 disables the use of max_slots in the hostfile.|
|command|The full command to execute data movement. More detail in the following section.|
|progressIntervalSeconds|Interval, in seconds, to collect the progress data from the `dcp` command.|
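
For orientation, here is a minimal sketch of what the `nnf-dm-config` config map might contain.
The data key name and the setting values are illustrative assumptions; inspect the live config map
on your system for the authoritative layout and defaults:

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: nnf-dm-config
  namespace: nnf-dm-system
data:
  # The key name below is an assumption for illustration only.
  nnf-dm-config.yaml: |
    progressIntervalSeconds: 5
    profiles:
      default:
        slots: 8
        maxSlots: 0
        command: mpirun --allow-run-as-root --hostfile $HOSTFILE dcp --progress 1 --uid $UID --gid $GID $SRC $DEST
```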

### `command`

The full data movement `command` can be set here. By default, Data Movement uses `mpirun` to run
`dcp` to perform the data movement. Changing the `command` is useful for tweaking `mpirun` or
`dcp` options, or for replacing the command with something that can aid in debugging (e.g. `hostname`).

`mpirun` uses hostfiles to list the hosts to launch `dcp` on. This hostfile is created for each Data
Movement operation, and it uses the config map to set the `slots` and `maxSlots` for each host (i.e. NNF
node) in the hostfile. The number of `slots`/`maxSlots` is the same for every host in the hostfile.
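
As an illustrative sketch, a generated hostfile for two NNF nodes with `slots` set to 8 and
`maxSlots` set to 8 might look like the following (the node names are hypothetical):

```text
nnf-node-1 slots=8 max_slots=8
nnf-node-2 slots=8 max_slots=8
```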

Additionally, Data Movement uses substitution to fill in dynamic information for each Data Movement
operation. Each of these **must** be present in the command for Data Movement to work properly when
using `mpirun` and `dcp`:

|VAR|Description|
|---|-----------|
|`$HOSTFILE`|hostfile that is created and used for mpirun.|
|`$UID`|User ID that is inherited from the Workflow.|
|`$GID`|Group ID that is inherited from the Workflow.|
|`$SRC`|source for the data movement.|
|`$DEST`|destination for the data movement.|
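
For example, after substitution, a single Data Movement operation might launch a command like the
following, where the hostfile path, UID/GID, and file paths are purely illustrative:

```bash
mpirun --allow-run-as-root --hostfile /tmp/hostfile-abc123 dcp --progress 1 --uid 1050 --gid 1051 /mnt/nnf/job/src-dir /lus/global/dest-dir
```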
By default, the command will look something like the following. Please see the config map itself
for the most up-to-date default command:

```bash
mpirun --allow-run-as-root --hostfile $HOSTFILE dcp --progress 1 --uid $UID --gid $GID $SRC $DEST
```

### Data Movement Profiles

The server side configuration can also be controlled by creating `NnfDataMovementProfile`
resources in Kubernetes. These work similarly to `NnfStorageProfiles`. See
[here](../storage-profiles/readme.md) for an explanation of how to use profiles, set a default, etc.

For an in-depth understanding of the capabilities offered by Data Movement profiles, we recommend
referring to the following resources:
- [Type definition](https://github.com/NearNodeFlash/nnf-sos/blob/master/api/v1alpha1/nnfdatamovementprofile_types.go#L27) for `NnfDataMovementProfile`
- [Sample](https://github.com/NearNodeFlash/nnf-sos/blob/master/config/samples/nnf_v1alpha1_nnfdatamovementprofile.yaml) for `NnfDataMovementProfile`
- [Online Examples](https://github.com/NearNodeFlash/nnf-sos/blob/master/config/examples/nnf_v1alpha1_nnfdatamovementprofile.yaml) for `NnfDataMovementProfile`
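
To see which profiles exist on a system, an inspection along these lines works (listing across all
namespaces, since the namespace varies by site):

```bash
kubectl get nnfdatamovementprofiles.nnf.cray.hpe.com --all-namespaces
```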

Profiles can also be specified in the `nnf-dm-config` config map. Users are able to select a
profile using #DW directives (e.g. `copy_in profile=my-dm-profile`) and the Copy Offload API. If no
profile is specified, the `default` profile is used. This default profile must exist in the config
map.
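
For example, a `copy_in` directive that selects a profile might look like the following, where the
profile name and both paths are hypothetical:

```bash
#DW copy_in source=/lus/global/user/data.in destination=$DW_JOB_my_workflow/data.in profile=my-dm-profile
```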

`slots`, `maxSlots`, and `command` can be stored in Data Movement profiles, making it easy to
switch quickly between different settings for a particular workflow.

Example profiles:

```yaml
profiles:
  default:
    slots: 8
    maxSlots: 0
    command: mpirun --allow-run-as-root --hostfile $HOSTFILE dcp --progress 1 --uid $UID --gid $GID $SRC $DEST
  no-xattrs:
    slots: 8
    maxSlots: 0
    command: mpirun --allow-run-as-root --hostfile $HOSTFILE dcp --progress 1 --xattrs none --uid $UID --gid $GID $SRC $DEST
```
## Copy Offload API Daemon

The `CreateRequest` API call that is used to create Data Movement with the Copy Offload API has
options that allow a user to adjust that particular Data Movement operation. These settings are on
a per-request basis, and they supplement the configuration in the `NnfDataMovementProfile`.

The Copy Offload API requires the `nnf-dm` daemon to be running on the compute node. This daemon
may be configured to run full-time, or it may be left in a disabled state if the WLM is expected to
run it only when a user requests it. See [Compute Daemons](../compute-daemons/readme.md) for the
systemd service configuration of the daemon. See `RequiredDaemons` in
[Directive Breakdown](../directive-breakdown/readme.md) for a description of how the user may
request the daemon in the case where the WLM will run it only on demand.

If the WLM is running the `nnf-dm` daemon only on demand, then the user can request that the daemon be running for their job by specifying `requires=copy-offload` in their `DW` directive. The following is an example:

```bash
#DW jobdw type=xfs capacity=1GB name=stg1 requires=copy-offload
```

See the [DataMovementCreateRequest API](copy-offload-api.html#datamovement.DataMovementCreateRequest)
definition for what can be configured.

## SELinux and Data Movement

Care must be taken when enabling SELinux on compute nodes. Doing so will result in
SELinux Extended File Attributes (xattrs) being placed on files created by applications running on
the compute node, which may not be supported by the destination file system (e.g. Lustre).

Depending on the configuration of `dcp`, there may be an attempt to copy these xattrs. You may need
to disable this by using `dcp --xattrs none` to avoid errors. For example, the `command` in the
`nnf-dm-config` config map or `NnfDataMovementProfile`, or `dcpOptions` in the
[DataMovementCreateRequest API](copy-offload-api.html#datamovement.DataMovementCreateRequest),
could be used to set this option.

See the [`dcp` documentation](https://mpifileutils.readthedocs.io/en/latest/dcp.1.html) for more
information.

### `sshd` Configuration for Data Movement Workers

The `nnf-dm-worker-*` pods run `sshd` to listen for `mpirun` connections that perform data movement.
The number of simultaneous connections is limited via the `sshd` configuration (i.e. `MaxStartups`).
**If you see error messages in Data Movement where `mpirun` cannot communicate with target nodes,
and you have ruled out any networking issues, this may be due to the `sshd` configuration.** `sshd`
will start rejecting connections once the limit is reached.

The `sshd_config` is stored in the `nnf-dm-worker-config` `ConfigMap` so that it can be changed on
a running system without needing to roll new images. This also enables site-specific configuration.
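
As a minimal sketch of such a site-specific change, assuming the `ConfigMap` carries the
configuration under an `sshd_config` key and lives in the same namespace as the workers (both
assumptions; inspect the `ConfigMap` on your system), raising `MaxStartups` might look like:

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: nnf-dm-worker-config
  namespace: nnf-dm-system  # assumption; use the namespace where the nnf-dm workers run
data:
  sshd_config: |  # key name is an assumption for illustration
    # MaxStartups format is start:rate:full - sshd probabilistically drops new
    # unauthenticated connections beyond "start" and refuses them all at "full".
    MaxStartups 100:30:200
```
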
2 changes: 1 addition & 1 deletion external/nnf-dm
Submodule nnf-dm updated 70 files
+17 −2 PROJECT
+9 −9 cmd/main.go
+2 −1 config/manager/kustomization.yaml
+9 −1 config/manager/manager.yaml
+1 −1 config/manager/manager_imagepullsecret_patch.yaml
+17 −0 config/manager/worker-sshd-config.yaml
+10 −0 crd-bumper.yaml
+43 −43 daemons/compute/server/servers/server_default.go
+4 −4 go.mod
+8 −8 go.sum
+49 −49 internal/controller/datamovement_controller.go
+86 −87 internal/controller/datamovement_controller_test.go
+17 −17 internal/controller/datamovementmanager_controller.go
+8 −8 internal/controller/datamovementmanager_controller_test.go
+3 −3 internal/controller/suite_test.go
+11 −0 vendor/github.com/DataWorkflowServices/dws/api/v1alpha2/storage_types.go
+3 −0 vendor/github.com/DataWorkflowServices/dws/api/v1alpha2/systemconfiguration_types.go
+2 −1 vendor/github.com/DataWorkflowServices/dws/api/v1alpha2/zz_generated.deepcopy.go
+51 −0 vendor/github.com/NearNodeFlash/nnf-sos/api/v1alpha2/conversion.go
+4 −4 vendor/github.com/NearNodeFlash/nnf-sos/api/v1alpha2/groupversion_info.go
+20 −1 vendor/github.com/NearNodeFlash/nnf-sos/api/v1alpha2/nnf_resource_condition_types.go
+2 −2 vendor/github.com/NearNodeFlash/nnf-sos/api/v1alpha2/nnf_resource_health_type.go
+2 −2 vendor/github.com/NearNodeFlash/nnf-sos/api/v1alpha2/nnf_resource_state_type.go
+2 −2 vendor/github.com/NearNodeFlash/nnf-sos/api/v1alpha2/nnf_resource_status_type.go
+2 −2 vendor/github.com/NearNodeFlash/nnf-sos/api/v1alpha2/nnf_resource_type.go
+3 −2 vendor/github.com/NearNodeFlash/nnf-sos/api/v1alpha2/nnfaccess_types.go
+37 −0 vendor/github.com/NearNodeFlash/nnf-sos/api/v1alpha2/nnfaccess_webhook.go
+4 −2 vendor/github.com/NearNodeFlash/nnf-sos/api/v1alpha2/nnfcontainerprofile_types.go
+2 −2 vendor/github.com/NearNodeFlash/nnf-sos/api/v1alpha2/nnfcontainerprofile_webhook.go
+3 −2 vendor/github.com/NearNodeFlash/nnf-sos/api/v1alpha2/nnfdatamovement_types.go
+37 −0 vendor/github.com/NearNodeFlash/nnf-sos/api/v1alpha2/nnfdatamovement_webhook.go
+3 −2 vendor/github.com/NearNodeFlash/nnf-sos/api/v1alpha2/nnfdatamovementmanager_types.go
+37 −0 vendor/github.com/NearNodeFlash/nnf-sos/api/v1alpha2/nnfdatamovementmanager_webhook.go
+3 −1 vendor/github.com/NearNodeFlash/nnf-sos/api/v1alpha2/nnfdatamovementprofile_types.go
+2 −2 vendor/github.com/NearNodeFlash/nnf-sos/api/v1alpha2/nnfdatamovementprofile_webhook.go
+2 −1 vendor/github.com/NearNodeFlash/nnf-sos/api/v1alpha2/nnflustremgt_types.go
+37 −0 vendor/github.com/NearNodeFlash/nnf-sos/api/v1alpha2/nnflustremgt_webhook.go
+3 −2 vendor/github.com/NearNodeFlash/nnf-sos/api/v1alpha2/nnfnode_types.go
+37 −0 vendor/github.com/NearNodeFlash/nnf-sos/api/v1alpha2/nnfnode_webhook.go
+3 −2 vendor/github.com/NearNodeFlash/nnf-sos/api/v1alpha2/nnfnodeblockstorage_types.go
+37 −0 vendor/github.com/NearNodeFlash/nnf-sos/api/v1alpha2/nnfnodeblockstorage_webhook.go
+3 −2 vendor/github.com/NearNodeFlash/nnf-sos/api/v1alpha2/nnfnodeecdata_types.go
+37 −0 vendor/github.com/NearNodeFlash/nnf-sos/api/v1alpha2/nnfnodeecdata_webhook.go
+4 −3 vendor/github.com/NearNodeFlash/nnf-sos/api/v1alpha2/nnfnodestorage_types.go
+37 −0 vendor/github.com/NearNodeFlash/nnf-sos/api/v1alpha2/nnfnodestorage_webhook.go
+3 −2 vendor/github.com/NearNodeFlash/nnf-sos/api/v1alpha2/nnfportmanager_types.go
+37 −0 vendor/github.com/NearNodeFlash/nnf-sos/api/v1alpha2/nnfportmanager_webhook.go
+4 −3 vendor/github.com/NearNodeFlash/nnf-sos/api/v1alpha2/nnfstorage_types.go
+37 −0 vendor/github.com/NearNodeFlash/nnf-sos/api/v1alpha2/nnfstorage_webhook.go
+4 −2 vendor/github.com/NearNodeFlash/nnf-sos/api/v1alpha2/nnfstorageprofile_types.go
+2 −34 vendor/github.com/NearNodeFlash/nnf-sos/api/v1alpha2/nnfstorageprofile_webhook.go
+9 −3 vendor/github.com/NearNodeFlash/nnf-sos/api/v1alpha2/nnfsystemstorage_types.go
+37 −0 vendor/github.com/NearNodeFlash/nnf-sos/api/v1alpha2/nnfsystemstorage_webhook.go
+1 −1 vendor/github.com/NearNodeFlash/nnf-sos/api/v1alpha2/workflow_helpers.go
+1 −23 vendor/github.com/NearNodeFlash/nnf-sos/api/v1alpha2/zz_generated.deepcopy.go
+250 −0 vendor/github.com/NearNodeFlash/nnf-sos/config/crd/bases/nnf.cray.hpe.com_nnfaccesses.yaml
+14,845 −0 vendor/github.com/NearNodeFlash/nnf-sos/config/crd/bases/nnf.cray.hpe.com_nnfcontainerprofiles.yaml
+7,378 −0 vendor/github.com/NearNodeFlash/nnf-sos/config/crd/bases/nnf.cray.hpe.com_nnfdatamovementmanagers.yaml
+126 −0 vendor/github.com/NearNodeFlash/nnf-sos/config/crd/bases/nnf.cray.hpe.com_nnfdatamovementprofiles.yaml
+406 −0 vendor/github.com/NearNodeFlash/nnf-sos/config/crd/bases/nnf.cray.hpe.com_nnfdatamovements.yaml
+271 −0 vendor/github.com/NearNodeFlash/nnf-sos/config/crd/bases/nnf.cray.hpe.com_nnflustremgts.yaml
+163 −0 vendor/github.com/NearNodeFlash/nnf-sos/config/crd/bases/nnf.cray.hpe.com_nnfnodeblockstorages.yaml
+40 −0 vendor/github.com/NearNodeFlash/nnf-sos/config/crd/bases/nnf.cray.hpe.com_nnfnodeecdata.yaml
+160 −0 vendor/github.com/NearNodeFlash/nnf-sos/config/crd/bases/nnf.cray.hpe.com_nnfnodes.yaml
+219 −1 vendor/github.com/NearNodeFlash/nnf-sos/config/crd/bases/nnf.cray.hpe.com_nnfnodestorages.yaml
+237 −0 vendor/github.com/NearNodeFlash/nnf-sos/config/crd/bases/nnf.cray.hpe.com_nnfportmanagers.yaml
+581 −0 vendor/github.com/NearNodeFlash/nnf-sos/config/crd/bases/nnf.cray.hpe.com_nnfstorageprofiles.yaml
+295 −1 vendor/github.com/NearNodeFlash/nnf-sos/config/crd/bases/nnf.cray.hpe.com_nnfstorages.yaml
+240 −2 vendor/github.com/NearNodeFlash/nnf-sos/config/crd/bases/nnf.cray.hpe.com_nnfsystemstorages.yaml
+5 −5 vendor/modules.txt
