diff --git a/README.md b/README.md index a523d5dcc8..ecab20bd54 100755 --- a/README.md +++ b/README.md @@ -17,13 +17,14 @@ ### About -Collective Mind (CM) is a [community project](CONTRIBUTING.md) to develop +**Collective Mind (CM)** is a [community project](https://github.com/mlcommons/ck/blob/master/CONTRIBUTING.md) to develop a [collection of portable, extensible, technology-agnostic and ready-to-use automation recipes with a human-friendly interface (aka CM scripts)](https://github.com/mlcommons/ck/tree/master/docs/list_of_scripts.md) that automate all the manual steps required to build, run, benchmark and optimize complex ML/AI applications on any platform with any software and hardware. -CM scripts are being developed based on the feedback from [MLCommons engineers and researchers](docs/taskforce.md) +CM scripts are being developed based on the feedback from +[MLCommons engineers and researchers](https://github.com/mlcommons/ck/blob/master/docs/taskforce.md) to help them assemble, run, benchmark and optimize complex AI/ML applications across diverse and continuously changing models, data sets, software and hardware from Nvidia, Intel, AMD, Google, Qualcomm, Amazon and other vendors. @@ -31,12 +32,18 @@ They require Python 3.7+ with minimal dependencies and can run natively on Ubunt and any other operating system, in a cloud or inside automatically generated containers. Some key requirements for the CM design are: -* must be non-intrusive and easy to debug, require zero changes to existing projects and must complement, reuse, wrap and interconnect all existing automation scripts and tools (such as cmake, ML workflows, python poetry and containers) rather than substituting them; +* must be non-intrusive and easy to debug, require zero changes to existing projects and must complement, + reuse, wrap and interconnect all existing automation scripts and tools (such as cmake, ML workflows, + python poetry and containers) rather than substituting them; * must have a very simple and human-friendly command line with a Python API and minimal dependencies; -* must require minimal or zero learning curve by using plain Python, native scripts, environment variables and simple JSON/YAML descriptions instead of inventing new languages; -* must run in a native environment with Ubuntu, Debian, RHEL, Amazon Linux, MacOS, Windows and any other operating system while automatically generating container snapshots with CM recipes for repeatability and reproducibility; - -Below you can find a few examples of this collaborative engineering effort sponsored by [MLCommons (non-profit organization with 125+ organizations)](https://mlcommons.org) - +* must require minimal or zero learning curve by using plain Python, native scripts, environment variables + and simple JSON/YAML descriptions instead of inventing new languages; +* must run in a native environment with Ubuntu, Debian, RHEL, Amazon Linux, MacOS, Windows + and any other operating system while automatically generating container snapshots + with CM recipes for repeatability and reproducibility. 
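For context on the "human-friendly command line with a Python API" described above, here is a minimal usage sketch (editorial illustration, not part of this patch); it assumes the `mlcommons@ck` recipe repository has been pulled and reuses the image-classification example referenced later in this README:

```bash
pip install cmind
cm pull repo mlcommons@ck

# Run an automation recipe by its human-readable tags;
# extra flags are mapped to environment variables via the script's meta description.
cm run script "python app image-classification onnx" --input=computer_mouse.jpg
```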
+ +Below you can find a few examples of this collaborative engineering effort sponsored +by [MLCommons (non-profit organization with 125+ organizations)](https://mlcommons.org) - a few most-commonly used [automation recipes](https://github.com/mlcommons/ck/tree/master/docs/list_of_scripts.md) that can be chained into more complex automation workflows [using simple JSON or YAML](https://github.com/mlcommons/ck/blob/master/cm-mlops/script/app-image-classification-onnx-py/_cm.yaml). @@ -167,7 +174,7 @@ to modularize, run and benchmark other software projects and make it easier to rerun, reproduce and reuse [research projects from published papers at Systems and ML conferences]( https://cTuning.org/ae/micro2023.html ). -Please check the [**Getting Started Guide**](docs/getting-started.md) +Please check the [**Getting Started Guide**](https://github.com/mlcommons/ck/blob/master/docs/getting-started.md) to understand how CM automation recipes work, how to use them to automate your own projects, and how to implement and share new automations in your public or private projects. @@ -185,7 +192,7 @@ and how to implement and share new automations in your public or private project * ACM REP'23 keynote about MLCommons CM: [slides](https://doi.org/10.5281/zenodo.8105339) * ACM TechTalk'21 about automating research projects: [YouTube](https://www.youtube.com/watch?v=7zpeIVwICa4) -* MLPerf inference submitter orientation: [v3.1 slides](https://doi.org/10.5281/zenodo.10605079), [v3.0 slides](https://doi.org/10.5281/zenodo.8144274) +* MLPerf inference submitter orientation: [v4.0 slides](https://doi.org/10.5281/zenodo.10605079), [v3.1 slides](https://doi.org/10.5281/zenodo.8144274) ### Get in touch diff --git a/cm-mlops/automation/script/_cm.json b/cm-mlops/automation/script/_cm.json index 023a8b2bbf..140662bfa1 100644 --- a/cm-mlops/automation/script/_cm.json +++ b/cm-mlops/automation/script/_cm.json @@ -7,7 +7,7 @@ }, "desc": "Making native scripts more portable, interoperable and deterministic", "developers": "[Arjun Suresh](https://www.linkedin.com/in/arjunsuresh), [Grigori Fursin](https://cKnowledge.org/gfursin)", - "actions_with_help":["run"], + "actions_with_help":["run", "docker"], "sort": 1000, "tags": [ "automation" diff --git a/cm-mlops/automation/script/module.py b/cm-mlops/automation/script/module.py index fc1e13d59f..2ae9222d9e 100644 --- a/cm-mlops/automation/script/module.py +++ b/cm-mlops/automation/script/module.py @@ -652,46 +652,7 @@ def run(self, i): # Check if has --help if i.get('help',False): - print ('') - print ('Help for this CM script (automation recipe):') - - variations = meta.get('variations',{}) - if len(variations)>0: - print ('') - print ('Available variations:') - print ('') - for v in sorted(variations): - print (' _'+v) - - input_mapping = meta.get('input_mapping', {}) - if len(input_mapping)>0: - print ('') - print ('Available flags mapped to environment variables:') - print ('') - for k in sorted(input_mapping): - v = input_mapping[k] - - print (' --{} -> --env.{}'.format(k,v)) - - input_description = meta.get('input_description', {}) - if len(input_description)>0: - print ('') - print ('Available flags (Python API dict keys):') - print ('') - for k in sorted(input_description): - v = input_description[k] - n = v.get('desc','') - - x = ' --'+k - if n!='': x+=' ({})'.format(n) - - print (x) - - - print ('') - input ('Press Enter to see common flags for all scripts') - - return {'return':0} + return utils.call_internal_module(self, __file__, 'module_help', 'print_help', 
{'meta':meta, 'path':path})

        deps = meta.get('deps',[])

diff --git a/cm-mlops/automation/script/module_help.py b/cm-mlops/automation/script/module_help.py
new file mode 100644
index 0000000000..45e5059975
--- /dev/null
+++ b/cm-mlops/automation/script/module_help.py
@@ -0,0 +1,51 @@
+import os
+from cmind import utils
+
+# Print help about script
+def print_help(i):
+
+    meta = i['meta']
+    path = i['path']
+
+    print ('')
+    print ('Help for this CM script ({},{}):'.format(meta.get('alias',''), meta.get('uid','')))
+
+    print ('')
+    print ('Path to this automation recipe: {}'.format(path))
+
+    variations = meta.get('variations',{})
+    if len(variations)>0:
+        print ('')
+        print ('Available variations:')
+        print ('')
+        for v in sorted(variations):
+            print (' _'+v)
+
+    input_mapping = meta.get('input_mapping', {})
+    if len(input_mapping)>0:
+        print ('')
+        print ('Available flags mapped to environment variables:')
+        print ('')
+        for k in sorted(input_mapping):
+            v = input_mapping[k]
+
+            print (' --{} -> --env.{}'.format(k,v))
+
+    input_description = meta.get('input_description', {})
+    if len(input_description)>0:
+        print ('')
+        print ('Available flags (Python API dict keys):')
+        print ('')
+        for k in sorted(input_description):
+            v = input_description[k]
+            n = v.get('desc','')
+
+            x = ' --'+k
+            if n!='': x+=' ({})'.format(n)
+
+            print (x)
+
+    print ('')
+    input ('Press Enter to see common flags for all scripts')
+
+    return {'return':0}
diff --git a/cm-mlops/automation/script/module_misc.py b/cm-mlops/automation/script/module_misc.py
index ae6bc5e03b..e345fe2e21 100644
--- a/cm-mlops/automation/script/module_misc.py
+++ b/cm-mlops/automation/script/module_misc.py
@@ -1553,6 +1553,9 @@ def docker(i):

     meta = artifact.meta

+    if i.get('help',False):
+        return utils.call_internal_module(self_module, __file__, 'module_help', 'print_help', {'meta':meta, 'path':artifact.path})
+
     script_path = artifact.path

     tags = meta.get("tags", [])
@@ -1691,6 +1694,11 @@ def docker(i):

     port_maps = i.get('docker_port_maps', docker_settings.get('port_maps', []))

+    if detached == '':
+        detached = docker_settings.get('detached', '')
+
+    if interactive == '':
+        interactive = docker_settings.get('interactive', '')

 #    # Regenerate run_cmd
 #    if i.get('cmd'):
diff --git a/cm-mlops/script/build-mlperf-inference-server-nvidia/_cm.yaml b/cm-mlops/script/build-mlperf-inference-server-nvidia/_cm.yaml
index 88f2c9489e..bd485784a7 100644
--- a/cm-mlops/script/build-mlperf-inference-server-nvidia/_cm.yaml
+++ b/cm-mlops/script/build-mlperf-inference-server-nvidia/_cm.yaml
@@ -77,7 +77,7 @@ deps:

   # Detect CMake
   - tags: get,cmake
-    version_min: "3.18"
+    version_min: "3.25"

   # Detect Google Logger
   - tags: get,generic,sys-util,_glog-dev
@@ -106,6 +106,8 @@ deps:
     names:
     - nvidia-inference-common-code

+  - tags: get,generic-python-lib,_package.pybind11
+
   # Detect pycuda
   - tags: get,generic-python-lib,_pycuda
     skip_if_env:
@@ -125,6 +127,7 @@ deps:
     names:
     - nvidia-scratch-space

+
 post_deps:
   # Detect nvidia system
   - tags: add,custom,system,nvidia
@@ -185,25 +188,40 @@ versions:
     add_deps_recursive:
       nvidia-inference-common-code:
         version: r2.1
+      nvidia-scratch-space:
+        tags: version.2_1

   r3.0:
     add_deps_recursive:
       nvidia-inference-common-code:
         version: r3.0
+      nvidia-scratch-space:
+        tags: version.3_0
+
   r3.1:
     add_deps_recursive:
       nvidia-inference-common-code:
         version: r3.1
+      nvidia-scratch-space:
+        tags: version.3_1
+    deps:
+      - tags: install,nccl,libs,_cuda
+      - tags: install,pytorch,from.src,_for-nvidia-mlperf-inference-v3.1-gptj
+        names:
+        - pytorch

 docker:
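As a usage note for the help refactoring above (the new `module_help.print_help` and the `docker` entry in `actions_with_help`): per this patch, script-specific help is reachable from both the `run` and `docker` actions. A hedged sketch; the tags are only an example:

```bash
# Show variations and input flags for a recipe
cm run script --tags=build,nvidia,inference,server --help

# With this patch, the same help is printed before generating a container for the script
cm docker script --tags=build,nvidia,inference,server --help
```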
skip_run_cmd: 'no' all_gpus: 'yes' docker_os: ubuntu - docker_real_run: True + docker_real_run: False + interactive: True docker_os_version: '20.04' base_image: nvcr.io/nvidia/mlperf/mlperf-inference:mlpinf-v3.1-cuda12.2-cudnn8.9-x86_64-ubuntu20.04-public docker_input_mapping: imagenet_path: IMAGENET_PATH + gptj_checkpoint_path: GPTJ_CHECKPOINT_PATH + criteo_preprocessed_path: CRITEO_PREPROCESSED_PATH results_dir: RESULTS_DIR submission_dir: SUBMISSION_DIR cudnn_tar_file_path: CM_CUDNN_TAR_FILE_PATH diff --git a/cm-mlops/script/build-mlperf-inference-server-nvidia/run.sh b/cm-mlops/script/build-mlperf-inference-server-nvidia/run.sh index 6a55156a57..e03aaa72b8 100644 --- a/cm-mlops/script/build-mlperf-inference-server-nvidia/run.sh +++ b/cm-mlops/script/build-mlperf-inference-server-nvidia/run.sh @@ -11,6 +11,6 @@ if [[ ${CM_MLPERF_DEVICE} == "inferentia" ]]; then make prebuild fi -make ${CM_MAKE_BUILD_COMMAND} +SKIP_DRIVER_CHECK=1 make ${CM_MAKE_BUILD_COMMAND} test $? -eq 0 || exit $? diff --git a/cm-mlops/script/download-and-extract/_cm.json b/cm-mlops/script/download-and-extract/_cm.json index 4d99379bf6..a8d6cccf20 100644 --- a/cm-mlops/script/download-and-extract/_cm.json +++ b/cm-mlops/script/download-and-extract/_cm.json @@ -96,6 +96,14 @@ "CM_DAE_EXTRACT_DOWNLOADED": "yes" } }, + "rclone": { + "add_deps_recursive": { + "download-script": { + "tags": "_rclone" + } + }, + "group": "download-tool" + }, "gdown": { "add_deps_recursive": { "download-script": { diff --git a/cm-mlops/script/download-file/_cm.json b/cm-mlops/script/download-file/_cm.json index 8f538c2711..56b9d92856 100644 --- a/cm-mlops/script/download-file/_cm.json +++ b/cm-mlops/script/download-file/_cm.json @@ -63,6 +63,17 @@ }, "group": "download-tool" }, + "rclone": { + "deps": [ + { + "tags": "get,rclone" + } + ], + "env": { + "CM_DOWNLOAD_TOOL": "rclone" + }, + "group": "download-tool" + }, "url.#": { "env": { "CM_DOWNLOAD_URL": "#" diff --git a/cm-mlops/script/download-file/customize.py b/cm-mlops/script/download-file/customize.py index f860c47517..9e6812422d 100644 --- a/cm-mlops/script/download-file/customize.py +++ b/cm-mlops/script/download-file/customize.py @@ -65,6 +65,8 @@ def preprocess(i): if j>0: urltail=urltail[:j] env['CM_DOWNLOAD_FILENAME'] = urltail + elif env.get('CM_DOWNLOAD_TOOL', '') == "rclone": + env['CM_DOWNLOAD_FILENAME'] = urltail else: env['CM_DOWNLOAD_FILENAME'] = "index.html" @@ -104,6 +106,11 @@ def preprocess(i): elif tool == "gdown": env['CM_DOWNLOAD_CMD'] = f"gdown {extra_download_options} {url}" + elif tool == "rclone": + if env.get('CM_RCLONE_CONFIG_CMD', '') != '': + env['CM_DOWNLOAD_CONFIG_CMD'] = env['CM_RCLONE_CONFIG_CMD'] + env['CM_DOWNLOAD_CMD'] = f"rclone copy {url} ./{env['CM_DOWNLOAD_FILENAME']} -P" + filename = env['CM_DOWNLOAD_FILENAME'] env['CM_DOWNLOAD_DOWNLOADED_FILENAME'] = filename diff --git a/cm-mlops/script/download-file/run.sh b/cm-mlops/script/download-file/run.sh index 2d105fce2c..ed85f72a01 100644 --- a/cm-mlops/script/download-file/run.sh +++ b/cm-mlops/script/download-file/run.sh @@ -1,5 +1,11 @@ #!/bin/bash +if [[ -n ${CM_DOWNLOAD_CONFIG_CMD} ]]; then + echo "" + echo "${CM_DOWNLOAD_CONFIG_CMD}" + eval "${CM_DOWNLOAD_CONFIG_CMD}" +fi + if [ -e ${CM_DOWNLOAD_DOWNLOADED_PATH} ]; then if [[ "${CM_DOWNLOAD_CHECKSUM_CMD}" != "" ]]; then echo "" diff --git a/cm-mlops/script/get-lib-armnn/_cm.json b/cm-mlops/script/get-lib-armnn/_cm.json index 5f8f6feab8..3fa7dce5bb 100644 --- a/cm-mlops/script/get-lib-armnn/_cm.json +++ b/cm-mlops/script/get-lib-armnn/_cm.json @@ 
-4,7 +4,7 @@
   "automation_uid": "5b4e0237da074764",
   "cache": true,
   "category": "Detection or installation of tools and artifacts",
-  "default_version": "23.05",
+  "default_version": "23.11",
   "deps": [
     {
       "tags": "detect,os"
@@ -36,6 +36,12 @@
   ],
   "uid": "9603a2e90fd44587",
   "versions": {
+    "23.11": {
+      "env": {
+        "CM_LIB_ARMNN_VERSION": "v23.11",
+        "CM_TMP_GIT_BRANCH_NAME": "branches/armnn_23_11"
+      }
+    },
     "23.05": {
       "env": {
         "CM_LIB_ARMNN_VERSION": "v23.05",
diff --git a/cm-mlops/script/get-ml-model-gptj/_cm.json b/cm-mlops/script/get-ml-model-gptj/_cm.json
index 51f6cf3bb1..acf405c5e1 100644
--- a/cm-mlops/script/get-ml-model-gptj/_cm.json
+++ b/cm-mlops/script/get-ml-model-gptj/_cm.json
@@ -22,10 +22,10 @@
         "CM_EXTRACT_FINAL_ENV_NAME": "GPTJ_CHECKPOINT_PATH",
         "CM_EXTRACT_TO_FOLDER": "gpt-j"
       },
-      "tags": "download-and-extract,_wget",
+      "tags": "download-and-extract",
       "update_tags_from_env_with_prefix": {
         "_url.": [
-          "CM_PACKAGE_URL"
+          "CM_DOWNLOAD_URL"
        ]
      },
      "enable_if_env": {
@@ -34,7 +34,8 @@
       "force_cache": true,
       "names": [
         "dae"
-      ]
+      ],
+      "extra_cache_tags": "gptj,model"
     }
   ],
   "tags": [
@@ -83,7 +84,9 @@
         "CM_DOWNLOAD_FILENAME": "checkpoint.zip",
         "CM_UNZIP": "yes",
         "CM_DOWNLOAD_CHECKSUM_NOT_USED": "e677e28aaf03da84584bb3073b7ee315",
-        "CM_PACKAGE_URL": "https://cloud.mlcommons.org/index.php/s/QAZ2oM94MkFtbQx/download"
+        "CM_PACKAGE_URL": "https://cloud.mlcommons.org/index.php/s/QAZ2oM94MkFtbQx/download",
+        "CM_RCLONE_CONFIG": "rclone config create mlc-inference s3 provider=LyveCloud access_key_id=0LITLNQMHZALM5AK secret_access_key=YQKYTMBY23TMZHLOYFJKL5CHHS0CWYUC endpoint=s3.us-east-1.lyvecloud.seagate.com",
+        "CM_RCLONE_URL": "mlc-inference:mlcommons-inference-wg-s3/gpt-j"
       },
       "add_deps_recursive": {
         "dae": {
@@ -174,6 +177,29 @@
         "CM_GPTJ_INTEL_MODEL": "yes"
       }
     },
+    "wget": {
+      "group": "download-tool",
+      "default": true,
+      "add_deps_recursive": {
+        "dae": {
+          "tags": "_wget"
+        }
+      },
+      "env": {
+        "CM_DOWNLOAD_URL": "<<<CM_PACKAGE_URL>>>"
+      }
+    },
+    "rclone": {
+      "group": "download-tool",
+      "add_deps_recursive": {
+        "dae": {
+          "tags": "_rclone"
+        }
+      },
+      "env": {
+        "CM_DOWNLOAD_URL": "<<<CM_RCLONE_URL>>>"
+      }
+    },
     "pytorch,int8,intel": {
     },
     "pytorch,int4,intel": {
diff --git a/cm-mlops/script/get-ml-model-huggingface-zoo/_cm.json b/cm-mlops/script/get-ml-model-huggingface-zoo/_cm.json
index 45ed6a9f0d..391689ad01 100644
--- a/cm-mlops/script/get-ml-model-huggingface-zoo/_cm.json
+++ b/cm-mlops/script/get-ml-model-huggingface-zoo/_cm.json
@@ -17,6 +17,7 @@
     }
   ],
   "input_mapping": {
+    "download_path": "CM_DOWNLOAD_PATH",
     "model_filename": "CM_MODEL_ZOO_FILENAME",
     "env_key": "CM_MODEL_ZOO_ENV_KEY"
   },
diff --git a/cm-mlops/script/get-ml-model-huggingface-zoo/customize.py b/cm-mlops/script/get-ml-model-huggingface-zoo/customize.py
index e673037494..8770e5bcb4 100644
--- a/cm-mlops/script/get-ml-model-huggingface-zoo/customize.py
+++ b/cm-mlops/script/get-ml-model-huggingface-zoo/customize.py
@@ -13,7 +13,9 @@ def preprocess(i):

     script_path = i['run_script_input']['path']

-    path = os.getcwd()
+    path = env.get('CM_DOWNLOAD_PATH', '')
+    if path == '':
+        path = os.getcwd()

     if env.get('CM_GIT_CLONE_REPO', '') != 'yes':
         run_cmd = env.get('CM_PYTHON_BIN_WITH_PATH') + " " + os.path.join(script_path, 'download_model.py')
diff --git a/cm-mlops/script/get-ml-model-stable-diffusion/_cm.json b/cm-mlops/script/get-ml-model-stable-diffusion/_cm.json
index 91fe961891..0944739b72 100644
--- a/cm-mlops/script/get-ml-model-stable-diffusion/_cm.json
+++ b/cm-mlops/script/get-ml-model-stable-diffusion/_cm.json
@@ -10,7 +10,8 @@
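To illustrate the rclone download path wired up above: `CM_RCLONE_CONFIG` is executed once as the download config command, and `download-file` then builds `rclone copy <url> ./<filename> -P`. For the GPT-J entry this roughly expands to the following (a sketch assembled from the values in this patch; the credentials are the public read-only ones already listed above):

```bash
# One-time remote definition for the public MLCommons inference bucket
rclone config create mlc-inference s3 provider=LyveCloud \
  access_key_id=0LITLNQMHZALM5AK \
  secret_access_key=YQKYTMBY23TMZHLOYFJKL5CHHS0CWYUC \
  endpoint=s3.us-east-1.lyvecloud.seagate.com

# Fetch the GPT-J checkpoint with progress reporting
rclone copy mlc-inference:mlcommons-inference-wg-s3/gpt-j ./gpt-j -P
```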
"CM_ML_MODEL_WEIGHT_TRANSFORMATIONS": "no" }, "input_mapping": { - "checkpoint": "SDXL_CHECKPOINT_PATH" + "checkpoint": "SDXL_CHECKPOINT_PATH", + "download_path": "CM_DOWNLOAD_PATH" }, "new_env_keys": [ "CM_ML_MODEL_*", @@ -21,6 +22,9 @@ "enable_if_env": { "CM_TMP_REQUIRE_DOWNLOAD": [ "yes" + ], + "CM_DOWNLOAD_TOOL": [ + "git" ] }, "env": { @@ -34,6 +38,28 @@ "hf-zoo" ], "tags": "get,ml-model,huggingface,zoo,_clone-repo,_model-stub.stabilityai/stable-diffusion-xl-base-1.0" + }, + { + "env": { + "CM_DOWNLOAD_FINAL_ENV_NAME": "CM_ML_MODEL_PATH" + }, + "tags": "download-and-extract", + "update_tags_from_env_with_prefix": { + "_url.": [ + "CM_DOWNLOAD_URL" + ] + }, + "enable_if_env": { + "CM_TMP_REQUIRE_DOWNLOAD": [ "yes" ], + "CM_DOWNLOAD_TOOL": [ + "rclone" + ] + }, + "force_cache": true, + "extra_cache_tags": "stable-diffusion,sdxl,model", + "names": [ + "dae" + ] } ], "tags": [ @@ -60,6 +86,14 @@ }, "group": "precision" }, + "fp16": { + "env": { + "CM_ML_MODEL_INPUT_DATA_TYPES": "fp16", + "CM_ML_MODEL_PRECISION": "fp16", + "CM_ML_MODEL_WEIGHT_DATA_TYPES": "fp16" + }, + "group": "precision" + }, "int8": { "env": { "CM_ML_MODEL_INPUT_DATA_TYPES": "int8", @@ -87,6 +121,58 @@ "CM_ML_MODEL_WEIGHT_DATA_TYPES": "uint8" }, "group": "precision" + }, + "huggingface": { + "group": "download-source", + "default_variations": { + "download-tool": "git" + } + }, + "mlcommons": { + "group": "download-source", + "default": true, + "default_variations": { + "download-tool": "rclone" + } + }, + "git": { + "group": "download-tool", + "env": { + "CM_DOWNLOAD_TOOL": "git" + } + }, + "wget": { + "group": "download-tool", + "env": { + "CM_DOWNLOAD_TOOL": "wget" + }, + "adr": { + "dae": { + "tags": "_wget" + } + } + }, + "rclone": { + "group": "download-tool", + "env": { + "CM_RCLONE_CONFIG": "rclone config create mlc-inference s3 provider=LyveCloud access_key_id=0LITLNQMHZALM5AK secret_access_key=YQKYTMBY23TMZHLOYFJKL5CHHS0CWYUC endpoint=s3.us-east-1.lyvecloud.seagate.com", + "CM_DOWNLOAD_TOOL": "rclone" + }, + "adr": { + "dae": { + "tags": "_rclone" + } + } + }, + "rclone,fp32": { + "env": { + "CM_DOWNLOAD_URL": "mlc-inference:mlcommons-inference-wg-s3/stable_diffusion_fp32" + } + }, + "rclone,fp16": { + "env": { + "CM_DOWNLOAD_URL": "mlc-inference:mlcommons-inference-wg-s3/stable_diffusion_fp16" + } } } } diff --git a/cm-mlops/script/get-ml-model-stable-diffusion/customize.py b/cm-mlops/script/get-ml-model-stable-diffusion/customize.py index d351ba35cb..6f001eb274 100644 --- a/cm-mlops/script/get-ml-model-stable-diffusion/customize.py +++ b/cm-mlops/script/get-ml-model-stable-diffusion/customize.py @@ -18,7 +18,6 @@ def postprocess(i): env = i['env'] env['SDXL_CHECKPOINT_PATH'] = env['CM_ML_MODEL_PATH'] - env['CM_ML_MODEL_PATH'] = env['SDXL_CHECKPOINT_PATH'] env['CM_GET_DEPENDENT_CACHED_PATH'] = env['CM_ML_MODEL_PATH'] return {'return':0} diff --git a/cm-mlops/script/get-mlperf-inference-intel-scratch-space/_cm.json b/cm-mlops/script/get-mlperf-inference-intel-scratch-space/_cm.json new file mode 100644 index 0000000000..3b2b650973 --- /dev/null +++ b/cm-mlops/script/get-mlperf-inference-intel-scratch-space/_cm.json @@ -0,0 +1,48 @@ +{ + "alias": "get-mlperf-inference-intel-scratch-space", + "automation_alias": "script", + "automation_uid": "5b4e0237da074764", + "cache": true, + "category": "MLPerf benchmark support", + "deps": [], + "docker": { + "run": false + }, + "input_description": {}, + "input_mapping": { + "scratch_path": "MLPERF_INTEL_SCRATCH_PATH" + }, + "new_env_keys": [ + 
"CM_INTEL_MLPERF_SCRATCH_PATH", + "CM_INTEL_SCRATCH_SPACE_VERSION" + ], + "new_state_keys": [], + "post_deps": [], + "posthook_deps": [], + "prehook_deps": [], + "tags": [ + "get", + "mlperf", + "inference", + "intel", + "scratch", + "space" + ], + "uid": "e83fca30851f45ef", + "variations": { + "version.#": { + "group": "version", + "env": { + "CM_INTEL_SCRATCH_SPACE_VERSION": "#" + } + }, + "version.4_0": { + "group": "version", + "default": true, + "env": { + "CM_INTEL_SCRATCH_SPACE_VERSION": "4_0" + } + } + }, + "versions": {} +} diff --git a/cm-mlops/script/get-mlperf-inference-intel-scratch-space/customize.py b/cm-mlops/script/get-mlperf-inference-intel-scratch-space/customize.py new file mode 100644 index 0000000000..37d9f4a5ed --- /dev/null +++ b/cm-mlops/script/get-mlperf-inference-intel-scratch-space/customize.py @@ -0,0 +1,27 @@ +from cmind import utils +import os + +def preprocess(i): + + os_info = i['os_info'] + + env = i['env'] + + meta = i['meta'] + + automation = i['automation'] + + quiet = (env.get('CM_QUIET', False) == 'yes') + + if env.get('CM_INTEL_MLPERF_SCRATCH_PATH', '') == '': + env['CM_INTEL_MLPERF_SCRATCH_PATH'] = os.getcwd() + + return {'return':0} + +def postprocess(i): + + env = i['env'] + + env['CM_GET_DEPENDENT_CACHED_PATH'] = env['CM_INTEL_MLPERF_SCRATCH_PATH'] + + return {'return':0} diff --git a/cm-mlops/script/get-mlperf-inference-intel-scratch-space/run.bat b/cm-mlops/script/get-mlperf-inference-intel-scratch-space/run.bat new file mode 100644 index 0000000000..648302ca71 --- /dev/null +++ b/cm-mlops/script/get-mlperf-inference-intel-scratch-space/run.bat @@ -0,0 +1 @@ +rem native script diff --git a/cm-mlops/script/get-mlperf-inference-intel-scratch-space/run.sh b/cm-mlops/script/get-mlperf-inference-intel-scratch-space/run.sh new file mode 100644 index 0000000000..eb5ce24565 --- /dev/null +++ b/cm-mlops/script/get-mlperf-inference-intel-scratch-space/run.sh @@ -0,0 +1,32 @@ +#!/bin/bash + +#CM Script location: ${CM_TMP_CURRENT_SCRIPT_PATH} + +#To export any variable +#echo "VARIABLE_NAME=VARIABLE_VALUE" >>tmp-run-env.out + +#${CM_PYTHON_BIN_WITH_PATH} contains the path to python binary if "get,python" is added as a dependency + + + +function exit_if_error() { + test $? -eq 0 || exit $? +} + +function run() { + echo "Running: " + echo "$1" + echo "" + if [[ ${CM_FAKE_RUN} != 'yes' ]]; then + eval "$1" + exit_if_error + fi +} + +#Add your run commands here... 
+# run "$CM_RUN_CMD" + +scratch_path=${CM_NVIDIA_MLPERF_SCRATCH_PATH} +mkdir -p ${scratch_path}/data +mkdir -p ${scratch_path}/preprocessed_data +mkdir -p ${scratch_path}/models diff --git a/cm-mlops/script/get-mlperf-inference-nvidia-scratch-space/_cm.json b/cm-mlops/script/get-mlperf-inference-nvidia-scratch-space/_cm.json index 3e37653bb5..0ff47e4b8b 100644 --- a/cm-mlops/script/get-mlperf-inference-nvidia-scratch-space/_cm.json +++ b/cm-mlops/script/get-mlperf-inference-nvidia-scratch-space/_cm.json @@ -7,11 +7,12 @@ "deps": [], "input_description": {}, "input_mapping": { - "scratch_path": "MLPERF_SCRATCH_PATH" + "scratch_path": "CM_NVIDIA_MLPERF_SCRATCH_PATH" }, "new_env_keys": [ "CM_NVIDIA_MLPERF_SCRATCH_PATH", - "MLPERF_SCRATCH_PATH" + "MLPERF_SCRATCH_PATH", + "CM_NVIDIA_SCRATCH_SPACE_VERSION" ], "new_state_keys": [], "post_deps": [], @@ -26,7 +27,21 @@ "space" ], "uid": "0b2bec8b29fb4ab7", - "variations": {}, + "variations": { + "version.#": { + "group": "version", + "env": { + "CM_NVIDIA_SCRATCH_SPACE_VERSION": "#" + } + }, + "version.4_0": { + "group": "version", + "default": true, + "env": { + "CM_NVIDIA_SCRATCH_SPACE_VERSION": "4_0" + } + } + }, "versions": {}, "docker": { "run": false diff --git a/cm-mlops/script/get-mlperf-inference-nvidia-scratch-space/customize.py b/cm-mlops/script/get-mlperf-inference-nvidia-scratch-space/customize.py index 92a9d839d4..1bfa6c9580 100644 --- a/cm-mlops/script/get-mlperf-inference-nvidia-scratch-space/customize.py +++ b/cm-mlops/script/get-mlperf-inference-nvidia-scratch-space/customize.py @@ -13,10 +13,11 @@ def preprocess(i): quiet = (env.get('CM_QUIET', False) == 'yes') - if env.get('MLPERF_SCRATCH_PATH','') != '': - env['CM_NVIDIA_MLPERF_SCRATCH_PATH'] = env['MLPERF_SCRATCH_PATH'] - else: - env['CM_NVIDIA_MLPERF_SCRATCH_PATH'] = os.getcwd() + if env.get('CM_NVIDIA_MLPERF_SCRATCH_PATH', '') == '': + if env.get('MLPERF_SCRATCH_PATH','') != '': + env['CM_NVIDIA_MLPERF_SCRATCH_PATH'] = env['MLPERF_SCRATCH_PATH'] + else: + env['CM_NVIDIA_MLPERF_SCRATCH_PATH'] = os.getcwd() return {'return':0} @@ -25,5 +26,6 @@ def postprocess(i): env = i['env'] env['MLPERF_SCRATCH_PATH'] = env['CM_NVIDIA_MLPERF_SCRATCH_PATH'] + env['CM_GET_DEPENDENT_CACHED_PATH'] = env['CM_NVIDIA_MLPERF_SCRATCH_PATH'] return {'return':0} diff --git a/cm-mlops/script/get-mlperf-inference-results-dir/_cm.json b/cm-mlops/script/get-mlperf-inference-results-dir/_cm.json new file mode 100644 index 0000000000..3e9eb912b5 --- /dev/null +++ b/cm-mlops/script/get-mlperf-inference-results-dir/_cm.json @@ -0,0 +1,48 @@ +{ + "alias": "get-mlperf-inference-results-dir", + "automation_alias": "script", + "automation_uid": "5b4e0237da074764", + "cache": true, + "category": "MLPerf benchmark support", + "deps": [], + "docker": { + "run": false + }, + "input_description": {}, + "input_mapping": { + "results_dir": "CM_MLPERF_INFERENCE_RESULTS_DIR" + }, + "new_env_keys": [ + "CM_MLPERF_INFERENCE_RESULTS_DIR", + "CM_MLPERF_INFERENCE_RESULTS_VERSION" + ], + "new_state_keys": [], + "post_deps": [], + "posthook_deps": [], + "prehook_deps": [], + "tags": [ + "get", + "mlperf", + "inference", + "results", + "dir", + "directory" + ], + "uid": "84f3c5aad5e1444b", + "variations": { + "version.#": { + "group": "version", + "env": { + "CM_MLPERF_INFERENCE_RESULTS_VERSION": "#" + } + }, + "version.4_0": { + "group": "version", + "default": true, + "env": { + "CM_MLPERF_INFERENCE_RESULTS_VERSION": "4_0" + } + } + }, + "versions": {} +} diff --git 
a/cm-mlops/script/get-mlperf-inference-results-dir/customize.py b/cm-mlops/script/get-mlperf-inference-results-dir/customize.py new file mode 100644 index 0000000000..8f013816a1 --- /dev/null +++ b/cm-mlops/script/get-mlperf-inference-results-dir/customize.py @@ -0,0 +1,27 @@ +from cmind import utils +import os + +def preprocess(i): + + os_info = i['os_info'] + + env = i['env'] + + meta = i['meta'] + + automation = i['automation'] + + quiet = (env.get('CM_QUIET', False) == 'yes') + + if env.get('CM_MLPERF_INFERENCE_RESULTS_DIR','') == '': + env['CM_MLPERF_INFERENCE_RESULTS_DIR'] = os.getcwd() + + return {'return':0} + +def postprocess(i): + + env = i['env'] + + env['CM_GET_DEPENDENT_CACHED_PATH'] = env['CM_MLPERF_INFERENCE_RESULTS_DIR'] + + return {'return':0} diff --git a/cm-mlops/script/get-mlperf-inference-results-dir/run.bat b/cm-mlops/script/get-mlperf-inference-results-dir/run.bat new file mode 100644 index 0000000000..648302ca71 --- /dev/null +++ b/cm-mlops/script/get-mlperf-inference-results-dir/run.bat @@ -0,0 +1 @@ +rem native script diff --git a/cm-mlops/script/get-mlperf-inference-results-dir/run.sh b/cm-mlops/script/get-mlperf-inference-results-dir/run.sh new file mode 100644 index 0000000000..eb5ce24565 --- /dev/null +++ b/cm-mlops/script/get-mlperf-inference-results-dir/run.sh @@ -0,0 +1,32 @@ +#!/bin/bash + +#CM Script location: ${CM_TMP_CURRENT_SCRIPT_PATH} + +#To export any variable +#echo "VARIABLE_NAME=VARIABLE_VALUE" >>tmp-run-env.out + +#${CM_PYTHON_BIN_WITH_PATH} contains the path to python binary if "get,python" is added as a dependency + + + +function exit_if_error() { + test $? -eq 0 || exit $? +} + +function run() { + echo "Running: " + echo "$1" + echo "" + if [[ ${CM_FAKE_RUN} != 'yes' ]]; then + eval "$1" + exit_if_error + fi +} + +#Add your run commands here... 
+# run "$CM_RUN_CMD" + +scratch_path=${CM_NVIDIA_MLPERF_SCRATCH_PATH} +mkdir -p ${scratch_path}/data +mkdir -p ${scratch_path}/preprocessed_data +mkdir -p ${scratch_path}/models diff --git a/cm-mlops/script/get-mlperf-inference-submission-dir/_cm.json b/cm-mlops/script/get-mlperf-inference-submission-dir/_cm.json new file mode 100644 index 0000000000..2670a7882c --- /dev/null +++ b/cm-mlops/script/get-mlperf-inference-submission-dir/_cm.json @@ -0,0 +1,48 @@ +{ + "alias": "get-mlperf-inference-submission-dir", + "automation_alias": "script", + "automation_uid": "5b4e0237da074764", + "cache": true, + "category": "MLPerf benchmark support", + "deps": [], + "docker": { + "run": false + }, + "input_description": {}, + "input_mapping": { + "results_dir": "CM_MLPERF_SUBMISSION_DIR" + }, + "new_env_keys": [ + "CM_MLPERF_INFERENCE_SUBMISSION_DIR", + "CM_MLPERF_INFERENCE_SUBMISSION_VERSION" + ], + "new_state_keys": [], + "post_deps": [], + "posthook_deps": [], + "prehook_deps": [], + "tags": [ + "get", + "mlperf", + "inference", + "submission", + "dir", + "directory" + ], + "uid": "ddf36a41d6934a7e", + "variations": { + "version.#": { + "env": { + "CM_MLPERF_INFERENCE_SUBMISSION_VERSION": "#" + }, + "group": "version" + }, + "version.4_0": { + "default": true, + "env": { + "CM_MLPERF_INFERENCE_SUBMISSION_VERSION": "4_0" + }, + "group": "version" + } + }, + "versions": {} +} diff --git a/cm-mlops/script/get-mlperf-inference-submission-dir/customize.py b/cm-mlops/script/get-mlperf-inference-submission-dir/customize.py new file mode 100644 index 0000000000..a7f885d518 --- /dev/null +++ b/cm-mlops/script/get-mlperf-inference-submission-dir/customize.py @@ -0,0 +1,27 @@ +from cmind import utils +import os + +def preprocess(i): + + os_info = i['os_info'] + + env = i['env'] + + meta = i['meta'] + + automation = i['automation'] + + quiet = (env.get('CM_QUIET', False) == 'yes') + + if env.get('CM_MLPERF_INFERENCE_SUBMISSION_DIR','') == '': + env['CM_MLPERF_INFERENCE_SUBMISSION_DIR'] = os.getcwd() + + return {'return':0} + +def postprocess(i): + + env = i['env'] + + env['CM_GET_DEPENDENT_CACHED_PATH'] = env['CM_MLPERF_INFERENCE_SUBMISSION_DIR'] + + return {'return':0} diff --git a/cm-mlops/script/get-mlperf-inference-submission-dir/run.bat b/cm-mlops/script/get-mlperf-inference-submission-dir/run.bat new file mode 100644 index 0000000000..648302ca71 --- /dev/null +++ b/cm-mlops/script/get-mlperf-inference-submission-dir/run.bat @@ -0,0 +1 @@ +rem native script diff --git a/cm-mlops/script/get-mlperf-inference-submission-dir/run.sh b/cm-mlops/script/get-mlperf-inference-submission-dir/run.sh new file mode 100644 index 0000000000..eb5ce24565 --- /dev/null +++ b/cm-mlops/script/get-mlperf-inference-submission-dir/run.sh @@ -0,0 +1,32 @@ +#!/bin/bash + +#CM Script location: ${CM_TMP_CURRENT_SCRIPT_PATH} + +#To export any variable +#echo "VARIABLE_NAME=VARIABLE_VALUE" >>tmp-run-env.out + +#${CM_PYTHON_BIN_WITH_PATH} contains the path to python binary if "get,python" is added as a dependency + + + +function exit_if_error() { + test $? -eq 0 || exit $? +} + +function run() { + echo "Running: " + echo "$1" + echo "" + if [[ ${CM_FAKE_RUN} != 'yes' ]]; then + eval "$1" + exit_if_error + fi +} + +#Add your run commands here... 
+# run "$CM_RUN_CMD" + +scratch_path=${CM_NVIDIA_MLPERF_SCRATCH_PATH} +mkdir -p ${scratch_path}/data +mkdir -p ${scratch_path}/preprocessed_data +mkdir -p ${scratch_path}/models diff --git a/cm-mlops/script/install-nccl-libs/_cm.yaml b/cm-mlops/script/install-nccl-libs/_cm.yaml new file mode 100644 index 0000000000..8011ab3ad7 --- /dev/null +++ b/cm-mlops/script/install-nccl-libs/_cm.yaml @@ -0,0 +1,13 @@ +alias: install-nccl-libs +automation_alias: script +automation_uid: 5b4e0237da074764 +cache: false +tags: +- install +- nccl +- libs +uid: d1c76da2adb44201 +variations: + cuda: + deps: + - tags: get,cuda diff --git a/cm-mlops/script/install-nccl-libs/customize.py b/cm-mlops/script/install-nccl-libs/customize.py new file mode 100644 index 0000000000..d12f9b3e1d --- /dev/null +++ b/cm-mlops/script/install-nccl-libs/customize.py @@ -0,0 +1,22 @@ +from cmind import utils +import os + +def preprocess(i): + + os_info = i['os_info'] + + env = i['env'] + + meta = i['meta'] + + automation = i['automation'] + + quiet = (env.get('CM_QUIET', False) == 'yes') + + return {'return':0} + +def postprocess(i): + + env = i['env'] + + return {'return':0} diff --git a/cm-mlops/script/install-nccl-libs/run-ubuntu.sh b/cm-mlops/script/install-nccl-libs/run-ubuntu.sh new file mode 100644 index 0000000000..2b380e0223 --- /dev/null +++ b/cm-mlops/script/install-nccl-libs/run-ubuntu.sh @@ -0,0 +1,2 @@ +CM_SUDO=${CM_SUDO:-sudo} +${CM_SUDO} apt install -y libnccl2=2.18.3-1+cuda${CM_CUDA_VERSION} libnccl-dev=2.18.3-1+cuda${CM_CUDA_VERSION} diff --git a/cm-mlops/script/install-nccl-libs/run.sh b/cm-mlops/script/install-nccl-libs/run.sh new file mode 100644 index 0000000000..3a584c10cf --- /dev/null +++ b/cm-mlops/script/install-nccl-libs/run.sh @@ -0,0 +1,27 @@ +#!/bin/bash + +#CM Script location: ${CM_TMP_CURRENT_SCRIPT_PATH} + +#To export any variable +#echo "VARIABLE_NAME=VARIABLE_VALUE" >>tmp-run-env.out + +#${CM_PYTHON_BIN_WITH_PATH} contains the path to python binary if "get,python" is added as a dependency + + + +function exit_if_error() { + test $? -eq 0 || exit $? +} + +function run() { + echo "Running: " + echo "$1" + echo "" + if [[ ${CM_FAKE_RUN} != 'yes' ]]; then + eval "$1" + exit_if_error + fi +} + +#Add your run commands here... 
+# run "$CM_RUN_CMD" diff --git a/cm-mlops/script/install-pytorch-from-src/_cm.json b/cm-mlops/script/install-pytorch-from-src/_cm.json index dc4c50caaf..7225c86ed8 100644 --- a/cm-mlops/script/install-pytorch-from-src/_cm.json +++ b/cm-mlops/script/install-pytorch-from-src/_cm.json @@ -221,7 +221,7 @@ }, "for-nvidia-mlperf-inference-v3.1-gptj": { "base": [ - "checkout.b5021ba9", + "sha.b5021ba9", "cuda" ], "env": { @@ -230,14 +230,20 @@ "deps": [ { "tags": "get,conda,_name.nvidia" - } + }, + { + "tags": "get,cmake", + "version_min": "3.18" + } ] }, "cuda": { - "deps": { - "tags": "get,cuda,cudnn", - "names": [ "cuda" ] - }, + "deps": [ + { + "tags": "get,cuda,_cudnn", + "names": [ "cuda" ] + } + ], "env": { "CUDA_HOME": "<<>>", "CUDNN_LIBRARY_PATH": "<<>>", diff --git a/cm-mlops/script/install-pytorch-from-src/run.sh b/cm-mlops/script/install-pytorch-from-src/run.sh index bdbe5638bd..cbfbb42673 100644 --- a/cm-mlops/script/install-pytorch-from-src/run.sh +++ b/cm-mlops/script/install-pytorch-from-src/run.sh @@ -1,6 +1,7 @@ #!/bin/bash export PATH=${CM_CONDA_BIN_PATH}:$PATH +export LD_LIBRARY_PATH="" #Don't use conda libs CUR_DIR=$PWD rm -rf pytorch @@ -8,13 +9,10 @@ cp -r ${CM_PYTORCH_SRC_REPO_PATH} pytorch cd pytorch rm -rf build -git submodule sync -git submodule update --init --recursive +python -m pip install -r requirements.txt if [ "${?}" != "0" ]; then exit $?; fi - -python3 -m pip install -r requirements.txt python setup.py bdist_wheel -if [ "${?}" != "0" ]; then exit $?; fi +test $? -eq 0 || exit $? cd dist -python3 -m pip install torch-2.*linux_x86_64.whl -if [ "${?}" != "0" ]; then exit $?; fi +python -m pip install torch-2.*linux_x86_64.whl +test $? -eq 0 || exit $? diff --git a/cm-mlops/script/reproduce-mlperf-inference-nvidia/_cm.yaml b/cm-mlops/script/reproduce-mlperf-inference-nvidia/_cm.yaml index 99c94dd78d..4090c76a18 100644 --- a/cm-mlops/script/reproduce-mlperf-inference-nvidia/_cm.yaml +++ b/cm-mlops/script/reproduce-mlperf-inference-nvidia/_cm.yaml @@ -101,21 +101,6 @@ deps: # Install system dependencies on a given host - tags: get,sys-utils-cm - # Detect CUDA - - names: - - cuda - tags: get,cuda,_cudnn - - # Detect Tensorrt - - names: - - tensorrt - tags: get,tensorrt - - # Build nvidia inference server - - names: - - nvidia-inference-server - tags: build,nvidia,inference,server - # Get Nvidia scratch space where data and models get downloaded - tags: get,mlperf,inference,nvidia,scratch,space names: @@ -233,6 +218,30 @@ deps: tags: get,dataset,original,openimages,_calibration + ######################################################################## + # Install openorca dataset + + - enable_if_env: + CM_MODEL: + - gptj-99 + - gptj-99.9 + CM_MLPERF_NVIDIA_HARNESS_RUN_MODE: + - preprocess_dataset + names: + - openorca-original + tags: get,dataset,original,openorca + + ######################################################################## + # Install GPTJ-6B model + - enable_if_env: + CM_MODEL: + - gptj-99 + - gptj-99.9 + CM_MLPERF_NVIDIA_HARNESS_RUN_MODE: + - download_model + names: + - gptj-model + tags: get,ml-model,gptj,_pytorch ######################################################################## # Install MLPerf inference dependencies @@ -406,9 +415,24 @@ variations: env: CM_MODEL: dlrm-v2-99.9 - gptj_: + gptj_: {} + gptj_,_build: deps: - - tags: get,generic-python-lib,_torch + - tags: install,torch,from.src,_for-nvidia-mlperf-inference-v3.1-gptj + - tags: get,cmake + version_min: "3.25.0" + + gptj_,_build_engine: + deps: + - tags: 
install,torch,from.src,_for-nvidia-mlperf-inference-v3.1-gptj
+      - tags: get,cmake
+        version_min: "3.25.0"
+
+  gptj_,_run_harness:
+    deps:
+      - tags: install,torch,from.src,_for-nvidia-mlperf-inference-v3.1-gptj
+      - tags: get,cmake
+        version_min: "3.25.0"

   gptj-99:
     group: model
@@ -477,6 +501,22 @@ variations:
     # Detect rapidjson-dev
     - tags: get,generic,sys-util,_rapidjson-dev
+
+    # Detect CUDA
+    - names:
+      - cuda
+      tags: get,cuda,_cudnn
+
+    # Detect Tensorrt
+    - names:
+      - tensorrt
+      tags: get,tensorrt
+
+    # Build nvidia inference server
+    - names:
+      - nvidia-inference-server
+      tags: build,nvidia,inference,server
+
   maxq:
     group: power-mode
@@ -488,12 +528,18 @@ variations:
     env:
       CM_MLPERF_NVIDIA_HARNESS_MAXN: yes

+  preprocess-data:
+    alias: preprocess_data
+
   preprocess_data:
     group: run-mode
     env:
       MLPERF_NVIDIA_RUN_COMMAND: preprocess_data
       CM_MLPERF_NVIDIA_HARNESS_RUN_MODE: preprocess_data

+  download-model:
+    alias: download_model
+
   download_model:
     group: run-mode
     env:
@@ -531,6 +577,9 @@ variations:
       - dlrm-v2-99
       - dlrm-v2-99.9

+  build-engine:
+    alias: build_engine
+
   build_engine:
     group: run-mode
     default_variations:
@@ -539,6 +588,21 @@ variations:
       MLPERF_NVIDIA_RUN_COMMAND: generate_engines
       CM_MLPERF_NVIDIA_HARNESS_RUN_MODE: generate_engines
     deps:
+      # Detect CUDA
+      - names:
+        - cuda
+        tags: get,cuda,_cudnn
+
+      # Detect Tensorrt
+      - names:
+        - tensorrt
+        tags: get,tensorrt
+
+      # Build nvidia inference server
+      - names:
+        - nvidia-inference-server
+        tags: build,nvidia,inference,server
+
       - tags: reproduce,mlperf,inference,nvidia,harness,_preprocess_data
         inherit_variation_tags: true
         force_cache: true
@@ -608,12 +672,29 @@ variations:
     env:
       CM_MLPERF_LOADGEN_SCENARIO: Server

+  run-harness:
+    alias: run_harness
+
   run_harness:
     group: run-mode
     default: true
     default_variations:
       loadgen-scenario: offline
     deps:
+      # Detect CUDA
+      - names:
+        - cuda
+        tags: get,cuda,_cudnn
+
+      # Detect Tensorrt
+      - names:
+        - tensorrt
+        tags: get,tensorrt
+
+      # Build nvidia inference server
+      - names:
+        - nvidia-inference-server
+        tags: build,nvidia,inference,server
       - tags: reproduce,mlperf,inference,nvidia,harness,_build_engine
         inherit_variation_tags: true
         names:
diff --git a/cm-mlops/script/reproduce-mlperf-inference-nvidia/customize.py b/cm-mlops/script/reproduce-mlperf-inference-nvidia/customize.py
index 964b571504..96a6b20e26 100644
--- a/cm-mlops/script/reproduce-mlperf-inference-nvidia/customize.py
+++ b/cm-mlops/script/reproduce-mlperf-inference-nvidia/customize.py
@@ -158,10 +158,8 @@ def preprocess(i):
             cmds.append(f"mkdir -p {os.path.dirname(fp32_model_path)}")

             if not os.path.exists(fp32_model_path):
-                cmds.append(f"ln -sf {env['GPTJ_CHECKPOINT_DIR']} {fp32_model_path}")
+                cmds.append(f"ln -sf {env['GPTJ_CHECKPOINT_PATH']} {fp32_model_path}")

-            if not os.path.exists(fp8_model_path):
-                cmds.append(f"ln -sf {env['CM_ML_MODEL_BERT_LARGE_INT8_PATH']} {fp8_model_path}")
             model_name = "gptj"
             model_path = fp8_model_path
             #cmds.append(f"make prebuild")
diff --git a/cm-mlops/script/run-all-mlperf-models/run-mobilenet-models.sh b/cm-mlops/script/run-all-mlperf-models/run-mobilenet-models.sh
index 8dcc9db1a5..41497d56d2 100644
--- a/cm-mlops/script/run-all-mlperf-models/run-mobilenet-models.sh
+++ b/cm-mlops/script/run-all-mlperf-models/run-mobilenet-models.sh
@@ -24,10 +24,10 @@ function run() {
 }
 POWER=" --power=yes --adr.mlperf-power-client.power_server=192.168.0.15 --adr.mlperf-power-client.port=4940 "
 POWER=""
-extra_option=" --adr.mlperf-inference-implementation.compressed_dataset=on"
 extra_option=""
 extra_tags=""
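The run-mode variations above split the Nvidia harness into explicit stages (`download_model`, `preprocess_data`, `build_engine`, `run_harness`), each pulling in only the dependencies it needs (for GPT-J: the PyTorch from-source build and CMake >= 3.25). A hedged sketch of driving the stages directly, using the tag spelling that the internal deps above use; flags for scratch space, checkpoints etc. are omitted and may be required on a real system:

```bash
# Stage 1: fetch the GPT-J checkpoint into the scratch space
cm run script "reproduce mlperf inference nvidia harness _gptj-99 _download_model"

# Stage 2: preprocess the OpenOrca dataset
cm run script "reproduce mlperf inference nvidia harness _gptj-99 _preprocess_data"

# Stages 3 and 4: build TensorRT engines and run the harness (Offline scenario by default)
cm run script "reproduce mlperf inference nvidia harness _gptj-99 _run_harness"
```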
-extra_tags=",_only-fp32" +#extra_option=" --adr.mlperf-inference-implementation.compressed_dataset=on" +#extra_tags=",_only-fp32" #Add your run commands here... diff --git a/cm-mlops/script/run-docker-container/customize.py b/cm-mlops/script/run-docker-container/customize.py index f01f9d1eb9..79d1e52887 100644 --- a/cm-mlops/script/run-docker-container/customize.py +++ b/cm-mlops/script/run-docker-container/customize.py @@ -10,7 +10,7 @@ def preprocess(i): env = i['env'] - interactive = env.get('CM_DOCKER_INTERACTIVE','') + interactive = env.get('CM_DOCKER_INTERACTIVE_MODE','') if interactive: env['CM_DOCKER_DETACHED_MODE']='no' diff --git a/cm-mlops/script/run-mlperf-inference-mobilenet-models/customize.py b/cm-mlops/script/run-mlperf-inference-mobilenet-models/customize.py index 6453229781..10f2eb7a54 100644 --- a/cm-mlops/script/run-mlperf-inference-mobilenet-models/customize.py +++ b/cm-mlops/script/run-mlperf-inference-mobilenet-models/customize.py @@ -171,6 +171,17 @@ def preprocess(i): if env.get('CM_TEST_ONE_RUN', '') == "yes": return {'return':0} + clean_input = { + 'action': 'rm', + 'automation': 'cache', + 'tags': 'get,preprocessed,dataset,_for.mobilenet', + 'quiet': True, + 'v': verbose, + 'f': 'True' + } + r = cmind.access(clean_input) + #if r['return'] > 0: + # return r return {'return':0} def postprocess(i): diff --git a/cm/CHANGES.md b/cm/CHANGES.md index e7c54fb830..4d2f831d8e 100644 --- a/cm/CHANGES.md +++ b/cm/CHANGES.md @@ -1,4 +1,4 @@ -## V1.6.0.1 +## V1.6.1 - improving --help for common automations and CM scripts (automation recipes) - fixing a few minor bugs diff --git a/cm/README.md b/cm/README.md index 85897743ce..49d042d324 100644 --- a/cm/README.md +++ b/cm/README.md @@ -51,7 +51,6 @@ to help this community effort!* #### CM human-friendly command line - ```bash pip install cmind @@ -119,12 +118,9 @@ cmr "reproduce paper micro-2023 victima _run" ``` - - #### CM unified Python API - ```python import cmind @@ -133,9 +129,6 @@ output=cmind.access({'action':'run', 'automation':'script', 'input':'computer_mouse.jpg'}) if output['return']==0: print (output) ``` - - - #### Examples of modular containers and GitHub actions with CM commands diff --git a/cm/cmind/__init__.py b/cm/cmind/__init__.py index 8fdeda093a..9256608d6c 100644 --- a/cm/cmind/__init__.py +++ b/cm/cmind/__init__.py @@ -1,4 +1,4 @@ -__version__ = "1.6.1" +__version__ = "1.6.1.1" from cmind.core import access from cmind.core import error