Merge from CTuning #1094

Merged: 32 commits (Feb 6, 2024)

Commits
5ed35a3  Testing docker deps in action (arjunsuresh, Feb 3, 2024)
fa47190  Testing docker deps in action (arjunsuresh, Feb 3, 2024)
e6fabd4  Fix version for cmake install (arjunsuresh, Feb 3, 2024)
e96f5ce  Fix version for cmake install (arjunsuresh, Feb 3, 2024)
52e63b2  Add submodule sync for install-pytorch-from-src (arjunsuresh, Feb 3, 2024)
de20946  Use cmake 3.25 for pytorch cuda build (arjunsuresh, Feb 3, 2024)
4abfb24  run cudnn and tensorrt inside docker (arjunsuresh, Feb 4, 2024)
02f5de4  run cudnn and tensorrt inside docker (arjunsuresh, Feb 4, 2024)
dcba0e8  Fixes for pytorch build inside docker (arjunsuresh, Feb 4, 2024)
6fff4b5  Merge branch 'mlcommons:master' into master (arjunsuresh, Feb 4, 2024)
8e8588c  Dont use conda for nvidia-pytorch build (arjunsuresh, Feb 4, 2024)
5bdda55  fixed "cm pull repo" if repo already exists (use git pull instead of …) (gfursin, Feb 4, 2024)
3e132ba  release v1.6.2 (gfursin, Feb 4, 2024)
85bbc59  Add parameter support for nvidia gptj (arjunsuresh, Feb 4, 2024)
c64822a  Added skip postprocessing for nvidia gptj (arjunsuresh, Feb 4, 2024)
fe385d4  fixing links (gfursin, Feb 4, 2024)
fd018d4  Added libkineto build from src for Nvidia gptj (arjunsuresh, Feb 4, 2024)
75c6571  Cleanup for nvidia-mlperf-inference (arjunsuresh, Feb 4, 2024)
f035ffc  Cleanup for nvidia-mlperf-inference (arjunsuresh, Feb 4, 2024)
045e188  Support shm size for docker (arjunsuresh, Feb 5, 2024)
bf6ad7e  Support extra_docker_run_args (arjunsuresh, Feb 5, 2024)
032f590  Fix typo in gptj model download (arjunsuresh, Feb 5, 2024)
2087553  * added call to customize from GUI to create customized GUI (gfursin, Feb 5, 2024)
07781e6  Merge branch 'master' of https://github.com/ctuning/mlcommons-ck (gfursin, Feb 5, 2024)
557b6ac  Added "cfg" automation (gfursin, Feb 5, 2024)
3b22452  Make gptj download a prehook deps to reuse already downloaded one (arjunsuresh, Feb 5, 2024)
2a79a83  Use version4.0 scratch space for nvidia-mlperf (arjunsuresh, Feb 5, 2024)
4c29bda  Use version4.0 scratch space for nvidia-mlperf (arjunsuresh, Feb 5, 2024)
bc5e634  fixing GUI to support latest StreamLit (gfursin, Feb 5, 2024)
f1b2447  cleaned up challenges with v4.0 inference dummy (gfursin, Feb 5, 2024)
c0f3aad  Use extra args for nvidia docker (arjunsuresh, Feb 5, 2024)
f6b409c  Support dtype for llama2 reference implementation (arjunsuresh, Feb 6, 2024)
2 changes: 1 addition & 1 deletion ck/CONTRIBUTING.md
@@ -69,5 +69,5 @@
* @filven
* @ValouBambou

-See more acknowledgments at the end of this [article](https://arxiv.org/abs/2011.01149)
+See more acknowledgments at the end of this [article](https://doi.org/10.1098/rsta.2020.0211)
describing Collective Knowledge v1 concepts.
4 changes: 2 additions & 2 deletions ck/README.md
@@ -2,7 +2,7 @@
<br>
<br>

-**Note that this directory is in archive mode since the [Collective Knowledge framework (v1 and v2)](https://arxiv.org/abs/2011.01149)
+**Note that this directory is in archive mode since the [Collective Knowledge framework (v1 and v2)](https://doi.org/10.1098/rsta.2020.0211)
is now officially discontinued in favour of the new, light-weight, non-intrusive and technology-agnostic
[Collective Mind workflow automation language](https://doi.org/10.5281/zenodo.8105339) being developed, supported
and maintained by the [MLCommons](https://mlcommons.org), [cTuning.org](https://cTuning.org) and [cKnowledge.org](https://cKnowledge.org).**
@@ -280,5 +280,5 @@ The community provides Docker containers to test CK and components using different ...

We would like to thank all [contributors](https://github.com/mlcommons/ck/blob/master/CONTRIBUTING.md)
and [collaborators](https://cKnowledge.org/partners.html) for their support, fruitful discussions,
-and useful feedback! See more acknowledgments in the [CK journal article](https://arxiv.org/abs/2011.01149)
+and useful feedback! See more acknowledgments in the [CK journal article](https://doi.org/10.1098/rsta.2020.0211)
and [ACM TechTalk'21](https://www.youtube.com/watch?v=7zpeIVwICa4).
@@ -38,7 +38,7 @@ hardware.
* [Apache TVM](https://tvm.apache.org)
* CK "plug&play" automation framework: [GitHub](https://github.com/ctuning/ck),
[Motivation](https://www.youtube.com/watch?v=7zpeIVwICa4),
-  [ArXiv](https://arxiv.org/abs/2011.01149),
+  [journal paper](https://doi.org/10.1098/rsta.2020.0211),
[automation actions](https://github.com/mlcommons/ck/tree/master/ck/repo/module),
[MLOps components](https://github.com/mlcommons/ck-mlops)
* [ACM REQUEST-ASPLOS'18: the 1st Reproducible Tournament on Pareto-efficient Image Classification](https://cknow.io/c/event/repro-request-asplos2018)
2 changes: 1 addition & 1 deletion ck/docs/src/introduction.md
@@ -2,7 +2,7 @@

## Project overview

-* Philosophical Transactions of the Royal Society: [paper](https://arxiv.org/abs/2011.01149), [shorter pre-print](https://arxiv.org/abs/2006.07161)
+* Philosophical Transactions of the Royal Society: [paper](https://doi.org/10.1098/rsta.2020.0211), [shorter pre-print](https://arxiv.org/abs/2006.07161)

[<img src="https://img.youtube.com/vi/7zpeIVwICa4/0.jpg" width="320">](https://youtu.be/7zpeIVwICa4)

9 changes: 9 additions & 0 deletions cm-mlops/automation/cfg/_cm.json
@@ -0,0 +1,9 @@
{
  "alias": "cfg",
  "automation_alias": "automation",
  "automation_uid": "bbeb15d8f0a944a4",
  "tags": [
    "automation"
  ],
  "uid": "88dce9c160324c5d"
}
52 changes: 52 additions & 0 deletions cm-mlops/automation/cfg/module.py
@@ -0,0 +1,52 @@
import os

from cmind.automation import Automation
from cmind import utils

class CAutomation(Automation):
    """
    Automation actions
    """

    ############################################################
    def __init__(self, cmind, automation_file):
        super().__init__(cmind, __file__)

    ############################################################
    def test(self, i):
        """
        Test automation

        Args:
          (CM input dict):

          (out) (str): if 'con', output to console

          automation (str): automation as CM string object

          parsed_automation (list): prepared in CM CLI or CM access function
                                    [ (automation alias, automation UID) ] or
                                    [ (automation alias, automation UID), (automation repo alias, automation repo UID) ]

          (artifact) (str): artifact as CM string object

          (parsed_artifact) (list): prepared in CM CLI or CM access function
                                    [ (artifact alias, artifact UID) ] or
                                    [ (artifact alias, artifact UID), (artifact repo alias, artifact repo UID) ]

          ...

        Returns:
          (CM return dict):

          * return (int): return code == 0 if no error and >0 if error
          * (error) (str): error string if return>0

          * Output from this automation action

        """

        import json
        print (json.dumps(i, indent=2))

        return {'return':0}
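For a quick sanity check, the new action should be reachable via the standard CM CLI pattern (cm <action> <automation>); the exact invocation below is an illustrative assumption, not part of this diff:

cm test cfg --out=con

This simply echoes the incoming CM input dict as indented JSON and returns {'return':0}.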
21 changes: 21 additions & 0 deletions cm-mlops/automation/script/module.py
@@ -1312,6 +1312,27 @@ def run(self, i):
            if "add_deps_recursive" in versions_meta:
                self._merge_dicts_with_tags(add_deps_recursive, versions_meta['add_deps_recursive'])

+       # Run chain of docker dependencies if current run cmd is from inside a docker container
+       docker_deps = []
+       if i.get('docker_run_deps'):
+           docker_meta = meta.get('docker')
+           if docker_meta:
+               docker_deps = docker_meta.get('deps', [])
+               docker_deps = [dep for dep in docker_deps if not dep.get('skip_inside_docker', False)]
+           if len(docker_deps)>0:
+
+               if verbose:
+                   print (recursion_spaces + '  - Checking docker run dependencies on other CM scripts:')
+
+               r = self._call_run_deps(docker_deps, self.local_env_keys, local_env_keys_from_meta, env, state, const, const_state, add_deps_recursive,
+                                       recursion_spaces + extra_recursion_spaces,
+                                       remembered_selections, variation_tags_string, False, debug_script_tags, verbose, show_time, extra_recursion_spaces, run_state)
+               if r['return']>0: return r
+
+               if verbose:
+                   print (recursion_spaces + '  - Processing env after docker run dependencies ...')
+
+               update_env_with_values(env)
+
        # Check chain of dependencies on other CM scripts
        if len(deps)>0:
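For orientation (not part of this diff): the docker-time dependencies consumed above live under the docker key of a script's _cm.json meta. A hypothetical script meta illustrating the expected shape:

{
  "docker": {
    "deps": [
      { "tags": "get,cuda" },
      { "tags": "get,tensorrt", "skip_inside_docker": true }
    ]
  }
}

Deps flagged with skip_inside_docker are filtered out; the rest run via _call_run_deps when the script is re-invoked inside the container with --docker_run_deps.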
38 changes: 35 additions & 3 deletions cm-mlops/automation/script/module_misc.py
@@ -1379,13 +1379,15 @@ def dockerfile(i):

    i_run_cmd = r['run_cmd']

+   docker_run_cmd_prefix = i.get('docker_run_cmd_prefix', docker_settings.get('run_cmd_prefix', ''))
+
    r = regenerate_script_cmd({'script_uid':script_uid,
                               'script_alias':script_alias,
                               'run_cmd':i_run_cmd,
                               'tags':tags,
                               'fake_run':True,
                               'docker_settings':docker_settings,
-                              'docker_run_cmd_prefix':i.get('docker_run_cmd_prefix','')})
+                              'docker_run_cmd_prefix':docker_run_cmd_prefix})
    if r['return']>0: return r

    run_cmd = r['run_cmd_string']
@@ -1469,6 +1471,21 @@

    return {'return':0}

+def get_container_path(value):
+    path_split = value.split(os.sep)
+    if len(path_split) == 1:
+        return value
+
+    new_value = ''
+    if "cache" in path_split and "local" in path_split:
+        new_path_split = ["", "home", "cmuser"]
+        repo_entry_index = path_split.index("local")
+        new_path_split += path_split[repo_entry_index:]
+        return "/".join(new_path_split)
+
+    return value
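A worked example of this remapping (hypothetical host paths, assuming os.sep is '/'): a cache entry under the host's local CM repo is rebased onto the container user's home, while paths lacking both 'local' and 'cache' components pass through unchanged:

get_container_path("/home/ubuntu/CM/repos/local/cache/abc123")   # -> "/home/cmuser/local/cache/abc123"
get_container_path("some-value")                                 # -> "some-value"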


############################################################
def docker(i):
    """
@@ -1629,6 +1646,7 @@ def docker(i):
        if c_input in i:
            env[docker_input_mapping[c_input]] = i[c_input]

+   container_env_string = ''  # env keys corresponding to container mounts are explicitly passed to the container run cmd
    for index in range(len(mounts)):
        mount = mounts[index]

@@ -1663,7 +1681,8 @@
                if tmp_values:
                    for tmp_value in tmp_values:
                        if tmp_value in env:
-                           new_container_mount = env[tmp_value]
+                           new_container_mount = get_container_path(env[tmp_value])
+                           container_env_string += "--env.{}={} ".format(tmp_value, new_container_mount)
                        else:  # we skip those mounts
                            mounts[index] = None
                            skip = True
@@ -1694,6 +1713,8 @@

    docker_pre_run_cmds = i.get('docker_pre_run_cmds', []) + docker_settings.get('pre_run_cmds', [])

+   docker_run_cmd_prefix = i.get('docker_run_cmd_prefix', docker_settings.get('run_cmd_prefix', ''))
+
    all_gpus = i.get('docker_all_gpus', docker_settings.get('all_gpus'))

    device = i.get('docker_device', docker_settings.get('device'))
@@ -1702,6 +1723,10 @@

    port_maps = i.get('docker_port_maps', docker_settings.get('port_maps', []))

+   shm_size = i.get('docker_shm_size', docker_settings.get('shm_size', ''))
+
+   extra_run_args = i.get('docker_extra_run_args', docker_settings.get('extra_run_args', ''))
+
    if detached == '':
        detached = docker_settings.get('detached', '')

@@ -1729,7 +1754,8 @@
                              'docker_run_cmd_prefix':i.get('docker_run_cmd_prefix','')})
    if r['return']>0: return r

-   run_cmd = r['run_cmd_string']
+   run_cmd = r['run_cmd_string'] + ' ' + container_env_string + ' --docker_run_deps '

    env['CM_RUN_STATE_DOCKER'] = True

    if docker_settings.get('mount_current_dir','')=='yes':
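Net effect, with hypothetical values for illustration: the command replayed inside the container now carries the remapped mount envs plus the flag that triggers the docker-time deps above, e.g.:

cm run script --tags=app,mlperf,inference --env.CM_ML_MODEL_PATH=/home/cmuser/local/cache/<uid>/model.onnx --docker_run_deps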
@@ -1781,6 +1807,12 @@
    if port_maps:
        cm_docker_input['port_maps'] = port_maps

+   if shm_size != '':
+       cm_docker_input['shm_size'] = shm_size
+
+   if extra_run_args != '':
+       cm_docker_input['extra_run_args'] = extra_run_args
+
    print ('')

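Assuming the usual CM mapping of CLI flags to input keys (an assumption; only the input-dict side is shown in this diff), the two new knobs should be usable like this:

cm docker script --tags=run-mlperf,inference --docker_shm_size=32gb --docker_extra_run_args=" --cap-add=SYS_ADMIN"

They can presumably also be set as shm_size and extra_run_args under the docker section of a script's meta, with the CLI inputs taking precedence per i.get(..., docker_settings.get(...)).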
27 changes: 27 additions & 0 deletions cm-mlops/automation/utils/module.py
@@ -851,3 +851,30 @@ def prune_input(self, i):

        return {'return':0, 'new_input':i_run_cmd_arc}

+
+   ##############################################################################
+   def uid(self, i):
+       """
+       Generate CM UID.
+
+       Args:
+          (CM input dict): empty dict
+
+       Returns:
+          (CM return dict):
+
+          * return (int): return code == 0 if no error and >0 if error
+          * (error) (str): error string if return>0
+
+          * uid (str): CM UID
+       """
+
+       console = i.get('out') == 'con'
+
+       r = utils.gen_uid()
+
+       if console:
+           print (r['uid'])
+
+       return r
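If exposed through the CM CLI like other utils actions (an assumption based on CM's cm <action> <automation> pattern), this can be invoked as:

cm uid utils

which prints a freshly generated 16-hex-character CM UID (via utils.gen_uid) and returns it under r['uid'].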

@@ -5,9 +5,7 @@ and add derived metrics such as result/No of cores, power efficiency, device cost

Add clock speed as a third dimension to graphs and improve Bar graph visualization.

-Join our public [Discord server](https://discord.gg/JjWNWXKxwT) and/or
-our [weekly conf-calls](https://docs.google.com/document/d/1zMNK1m_LhWm6jimZK6YE05hu4VH9usdbKJ3nBy-ZPAw/edit)
-to discuss this challenge with the organizers.
+Join our public [Discord server](https://discord.gg/JjWNWXKxwT) to discuss this challenge with the organizers.

Read [this documentation](https://github.com/mlcommons/ck/blob/master/docs/mlperf/inference/README.md)
to run reference implementations of MLPerf inference benchmarks
@@ -2,9 +2,7 @@

Connect CM workflows to run MLPerf inference benchmarks with [OpenBenchmarking.org](https://openbenchmarking.org).

-Join our public [Discord server](https://discord.gg/JjWNWXKxwT) and/or
-our [weekly conf-calls](https://docs.google.com/document/d/1zMNK1m_LhWm6jimZK6YE05hu4VH9usdbKJ3nBy-ZPAw/edit)
-to discuss this challenge with the organizers.
+Join our public [Discord server](https://discord.gg/JjWNWXKxwT) to discuss this challenge with the organizers.

Read [this documentation](https://github.com/mlcommons/ck/blob/master/docs/mlperf/inference/README.md)
to run reference implementations of MLPerf inference benchmarks
@@ -2,7 +2,7 @@
  "alias": "connect-mlperf-inference-v3.1-with-openbenchmarking",
  "automation_alias": "challenge",
  "automation_uid": "3d84abd768f34e08",
-  "date_open": "20230704",
+  "date_open": "20240204",
  "date_close_extension": true,
  "points": 2,
  "prize_short": "co-authoring white paper",
@@ -15,11 +15,7 @@
    "automate",
    "openbenchmarking",
    "mlperf-inference",
-   "mlperf-inference-openbenchmarking",
-   "mlperf-inference-openbenchmarking",
-   "mlperf-inference-openbenchmarking-v3.1",
-   "mlperf-inference-openbenchmarking-v3.1-2023",
-   "v3.1"
+   "mlperf-inference-openbenchmarking"
  ],
  "title": "Run MLPerf inference benchmarks via OpenBenchmarking.org",
  "trophies": true,
4 changes: 1 addition & 3 deletions cm-mlops/challenge/connect-mlperf-with-medperf/README.md
@@ -6,9 +6,7 @@ using MLPerf loadgen and MLCommons CM automation language.
See the [Nature 2023 article about MedPerf](https://www.nature.com/articles/s42256-023-00652-2)
and [ACM REP'23 keynote about CM](https://doi.org/10.5281/zenodo.8105339) to learn more about these projects.

-Join our public [Discord server](https://discord.gg/JjWNWXKxwT) and/or
-our [weekly conf-calls](https://docs.google.com/document/d/1zMNK1m_LhWm6jimZK6YE05hu4VH9usdbKJ3nBy-ZPAw/edit)
-to discuss this challenge with the organizers.
+Join our public [Discord server](https://discord.gg/JjWNWXKxwT) to discuss this challenge with the organizers.

Read [this documentation](https://github.com/mlcommons/ck/blob/master/docs/mlperf/inference/README.md)
to run reference implementations of MLPerf inference benchmarks
@@ -2,7 +2,7 @@
  "alias": "optimize-mlperf-inference-scc2023",
  "automation_alias": "challenge",
  "automation_uid": "3d84abd768f34e08",
-  "_date_close": "20231115",
+  "date_close": "20231115",
  "date_open": "20230915",
  "tags": [
    "automate",

This file was deleted.

This file was deleted.
