From b14f5af1e4c33b2643ec483335f519ddbe7ca3a5 Mon Sep 17 00:00:00 2001
From: Ian
Date: Mon, 19 Aug 2024 14:19:10 -0700
Subject: [PATCH 1/3] minimal model ladder documentation

---
 docs/model_ladder.md | 12 ++++++++++++
 1 file changed, 12 insertions(+)
 create mode 100644 docs/model_ladder.md

diff --git a/docs/model_ladder.md b/docs/model_ladder.md
new file mode 100644
index 000000000..614f8af99
--- /dev/null
+++ b/docs/model_ladder.md
@@ -0,0 +1,12 @@
+# Model Ladder
+
+The model ladder is a set of scripts that help you easily run models over a standardized set of parameter sizes and token multipliers.
+
+## setup
+You just probably only need beaker gantry
+
+## example usage
+For example this will train you a 150M model on the dolma17 data mix with a token multiplier of 20 * number of parameters (one chinchilla cuz who doesn't like more obscurity in naming) with a specified run name and getting all the data from s3
+```
+scripts/beaker/ladder-launch.sh 1 --model 150M --data dolma17 --length 1xC --name testing-out-model-ladder --s3
+```
\ No newline at end of file

From 024d3d2af9f44f0c372ef4f027430a9545e21b8e Mon Sep 17 00:00:00 2001
From: Ian
Date: Mon, 19 Aug 2024 14:22:51 -0700
Subject: [PATCH 2/3] data mixes

---
 docs/model_ladder.md | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/docs/model_ladder.md b/docs/model_ladder.md
index 614f8af99..3aadc64b7 100644
--- a/docs/model_ladder.md
+++ b/docs/model_ladder.md
@@ -9,4 +9,7 @@ You just probably only need beaker gantry
 For example this will train you a 150M model on the dolma17 data mix with a token multiplier of 20 * number of parameters (one chinchilla cuz who doesn't like more obscurity in naming) with a specified run name and getting all the data from s3
 ```
 scripts/beaker/ladder-launch.sh 1 --model 150M --data dolma17 --length 1xC --name testing-out-model-ladder --s3
-```
\ No newline at end of file
+```
+
+## data mixes
+Data mixes are defined in [named_data_mixes.py](olmo/data/named_data_mixes.py).
\ No newline at end of file

From 099c311cc5779808860f18c184752294a69c9710 Mon Sep 17 00:00:00 2001
From: Ian
Date: Mon, 19 Aug 2024 14:25:50 -0700
Subject: [PATCH 3/3] get args

---
 docs/model_ladder.md | 28 +++++++++++++++++++++++++++-
 1 file changed, 27 insertions(+), 1 deletion(-)

diff --git a/docs/model_ladder.md b/docs/model_ladder.md
index 3aadc64b7..9719c83e3 100644
--- a/docs/model_ladder.md
+++ b/docs/model_ladder.md
@@ -12,4 +12,30 @@ scripts/beaker/ladder-launch.sh 1 --model 150M --data dolma17 --length 1xC --nam
 ```
 
 ## data mixes
-Data mixes are defined in [named_data_mixes.py](olmo/data/named_data_mixes.py).
\ No newline at end of file
+Data mixes are defined in [named_data_mixes.py](olmo/data/named_data_mixes.py).
+
+## detailed usage
+
+### train command
+```
+usage: ladder.py train [-h] --model MODEL --data DATA [--length LENGTH] --name
+                       NAME [--s3 | --no-s3] [--wandb | --no-wandb]
+                       [--read_location READ_LOCATION]
+                       [--write_location WRITE_LOCATION] [--save_overwrite]
+                       [--load_path LOAD_PATH] [--eval_on_load]
+
+options:
+  -h, --help            show this help message and exit
+  --model MODEL
+  --data DATA
+  --length LENGTH
+  --name NAME
+  --s3, --no-s3         read data from S3, write checkpoints to S3 (default:
+                        False)
+  --wandb, --no-wandb   create a run in wandb (default: True)
+  --read_location READ_LOCATION
+  --write_location WRITE_LOCATION
+  --save_overwrite
+  --load_path LOAD_PATH
+  --eval_on_load
+```
\ No newline at end of file
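
Note on `--length 1xC`: the doc added in PATCH 1/3 describes this as a token multiplier of 20 * number of parameters. The arithmetic can be sketched as follows. This is only an illustration under stated assumptions, not the code in `ladder.py`: the helpers `parse_model_size` and `token_budget` are hypothetical, the nominal label `150M` is assumed to mean exactly 150,000,000 parameters, and `NxC` is assumed to scale the 20x multiplier linearly.

```python
import re

# From the doc text: "one chinchilla" (1xC) is 20 tokens per parameter.
TOKENS_PER_PARAM_AT_1XC = 20

# Assumption: size labels use M (millions) or B (billions) suffixes.
_SIZE_SUFFIXES = {"M": 10**6, "B": 10**9}


def parse_model_size(name: str) -> int:
    """Hypothetical helper: turn a nominal label such as "150M" into a parameter count."""
    match = re.fullmatch(r"(\d+(?:\.\d+)?)([MB])", name)
    if match is None:
        raise ValueError(f"unrecognized model size: {name!r}")
    return round(float(match.group(1)) * _SIZE_SUFFIXES[match.group(2)])


def token_budget(model: str, length: str) -> int:
    """Hypothetical helper: training tokens implied by a length such as "1xC"."""
    match = re.fullmatch(r"(\d+(?:\.\d+)?)xC", length)
    if match is None:
        raise ValueError(f"unrecognized length: {length!r}")
    # Assumption: "NxC" means N times the 20-tokens-per-parameter multiplier.
    multiplier = float(match.group(1)) * TOKENS_PER_PARAM_AT_1XC
    return round(multiplier * parse_model_size(model))


if __name__ == "__main__":
    # 150M parameters at 1xC: 20 * 150,000,000 tokens
    print(token_budget("150M", "1xC"))
```

Under these assumptions the example command in the doc (`--model 150M --length 1xC`) corresponds to roughly 3 billion training tokens. The real script may count parameters differently (for instance, excluding embeddings), so treat the number as an estimate.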