Distributed Workloads

Examples

Fine-Tune Llama 2 Models with Ray and DeepSpeed on OpenShift AI

Integration Tests

Prerequisites

Admin access to an OpenShift cluster (CRC is fine)
OpenDataHub or RHOAI installed, with all Distributed Workloads components enabled
Go 1.21 installed

Common environment variables

CODEFLARE_TEST_OUTPUT_DIR - Output directory for test logs
CODEFLARE_TEST_TIMEOUT_SHORT - Timeout duration for short tasks
CODEFLARE_TEST_TIMEOUT_MEDIUM - Timeout duration for medium tasks
CODEFLARE_TEST_TIMEOUT_LONG - Timeout duration for long tasks
CODEFLARE_TEST_RAY_IMAGE (Optional) - Ray image used for the RayCluster configuration

NOTE: quay.io/rhoai/ray:2.23.0-py39-cu121 is the default community image used when creating a RayCluster resource. To use a custom Ray image instead, specify it in the CODEFLARE_TEST_RAY_IMAGE environment variable.
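
The common variables above might be exported in a shell session as in the following sketch. The output path and the timeout values are illustrative choices, not documented defaults, and the duration format (Go-style strings such as 2m) is an assumption that should be checked against the test support code.

```shell
# Directory for test logs; this path is only an example.
export CODEFLARE_TEST_OUTPUT_DIR="${TMPDIR:-/tmp}/codeflare-test-output"
mkdir -p "$CODEFLARE_TEST_OUTPUT_DIR"

# Timeouts. ASSUMPTION: values are Go duration strings; the numbers are
# illustrative, not the defaults used by the test suite.
export CODEFLARE_TEST_TIMEOUT_SHORT="2m"
export CODEFLARE_TEST_TIMEOUT_MEDIUM="10m"
export CODEFLARE_TEST_TIMEOUT_LONG="30m"

# Optional. Shown here with the default image named in the NOTE above;
# replace it with your own image if needed.
export CODEFLARE_TEST_RAY_IMAGE="quay.io/rhoai/ray:2.23.0-py39-cu121"
```

Leaving CODEFLARE_TEST_RAY_IMAGE unset should have the same effect as the last line, since that image is the stated default.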

Environment variables for Training operator test suite

FMS_HF_TUNING_IMAGE - Image tag used in PyTorchJob CR for model training

Environment variables for ODH integration test suite

ODH_NAMESPACE - Namespace where ODH components are installed
NOTEBOOK_USER_NAME - Username of user used for running Workbench
NOTEBOOK_USER_TOKEN - Login token of user used for running Workbench
NOTEBOOK_IMAGE - Image used for running Workbench
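
As a rough sketch, the Workbench-related variables could be set as follows. The namespace opendatahub is an assumption (use whichever namespace your installation uses), and the angle-bracket values are placeholders to replace with your own.

```shell
# ASSUMPTION: ODH components live in the "opendatahub" namespace.
export ODH_NAMESPACE="opendatahub"

# Placeholders: replace with the user that will run the Workbench.
export NOTEBOOK_USER_NAME="<username>"
export NOTEBOOK_USER_TOKEN="<login-token>"

# Placeholder: replace with the Workbench image reference.
export NOTEBOOK_IMAGE="<workbench-image>"
```

If you are logged in as that user with the oc CLI, `oc whoami -t` prints the current session token, which can serve as the value of NOTEBOOK_USER_TOKEN.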

To download the MNIST datasets for the training script from S3-compatible storage, set the following environment variables:

AWS_DEFAULT_ENDPOINT - Storage bucket endpoint from which to download MNIST datasets
AWS_ACCESS_KEY_ID - Storage bucket access key
AWS_SECRET_ACCESS_KEY - Storage bucket secret key
AWS_STORAGE_BUCKET - Storage bucket name
AWS_STORAGE_BUCKET_MNIST_DIR - Storage bucket directory from which to download MNIST datasets
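
The storage variables could be exported like this. Every value is a placeholder, and whether the endpoint should include a scheme such as https:// is not stated here, so check what your storage provider and the tests expect.

```shell
# All values below are placeholders; substitute your own storage details.
export AWS_DEFAULT_ENDPOINT="<s3-endpoint>"
export AWS_ACCESS_KEY_ID="<access-key>"
export AWS_SECRET_ACCESS_KEY="<secret-key>"
export AWS_STORAGE_BUCKET="<bucket-name>"
export AWS_STORAGE_BUCKET_MNIST_DIR="<mnist-directory>"
```

Avoid committing real credentials to scripts; exporting them only in the current session is usually safer.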

Running Tests

Run the tests the same way as standard Go unit tests, for example:

go test -timeout 60m ./tests/kfto/
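
Because these runs can take a long time, it may help to confirm that the variables a suite needs are set before starting. The sketch below assumes a POSIX shell; missing_vars is a hypothetical helper, not part of this repository, and the variables to check depend on the suite you run.

```shell
# Hypothetical helper (not part of this repository): print the name of each
# listed variable that is unset or empty, one per line.
missing_vars() {
  for name in "$@"; do
    # Look up the variable whose name is stored in $name.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "$name"
    fi
  done
}

# Example: check the variable used by the Training operator test suite.
missing=$(missing_vars FMS_HF_TUNING_IMAGE)
if [ -n "$missing" ]; then
  echo "Set these variables before running the tests: $missing"
fi
```

Once the check prints nothing, run the go test command above. The standard go test flags -v and -run can be added to that command to print per-test output and to limit the run to tests whose names match a pattern.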


ChristianZaccaria/distributed-workloads
