Release v5.0.3: Lemonade installer and examples, repo reorg, and lots more #275

Merged
merged 4 commits into from
Jan 28, 2025
Changes from 3 commits
11 changes: 9 additions & 2 deletions .github/workflows/test_lemonade.yml
Original file line number Diff line number Diff line change
Expand Up @@ -55,7 +55,14 @@ jobs:
- name: Run lemonade tests
shell: bash -el {0}
run: |
lemonade -i facebook/opt-125m huggingface-load llm-prompt -p "hi" --max-new-tokens 10
python test/llm_api.py
# Test CLI
lemonade -m -i facebook/opt-125m huggingface-load llm-prompt -p "hi" --max-new-tokens 10

# Test low-level APIs
python test/lemonade/llm_api.py


# Test high-level LEAP APIs
python examples/lemonade/leap_basic.py
python examples/lemonade/leap_streaming.py

9 changes: 8 additions & 1 deletion .github/workflows/test_lemonade_oga_cpu.yml
Expand Up @@ -55,6 +55,13 @@ jobs:
env:
HF_TOKEN: "${{ secrets.HUGGINGFACE_ACCESS_TOKEN }}" # Required by OGA model_builder in OGA 0.4.0 but not future versions
run: |
# Test CLI
lemonade -i TinyPixel/small-llama2 oga-load --device cpu --dtype int4 llm-prompt -p "tell me a story" --max-new-tokens 5
python test/oga_cpu_api.py

# Test low-level APIs
python test/lemonade/oga_cpu_api.py

# Test high-level LEAP APIs
python examples/lemonade/leap_oga_cpu.py
python examples/lemonade/leap_oga_cpu_streaming.py

63 changes: 16 additions & 47 deletions .github/workflows/test_turnkey.yml
Expand Up @@ -8,6 +8,11 @@ on:
branches: ["main", "canary", "refresh"]
pull_request:
branches: ["main", "canary", "refresh"]
paths:
- src/turnkeyml/**
- test/turnkey/**
- examples/turnkey/**
- .github/workflows/test_turnkey.yml

permissions:
contents: read
Expand Down Expand Up @@ -50,68 +55,32 @@ jobs:
shell: bash -el {0}
run: |
# Unit tests
python test/unit.py
python test/turnkey/unit.py

# turnkey examples
# Note: we clear the default cache location prior to each block of example runs
rm -rf ~/.cache/turnkey
python examples/api/onnx_opset.py --onnx-opset 15
python examples/api/loading_a_build.py
python examples/turnkey/api/onnx_opset.py --onnx-opset 15
python examples/turnkey/api/loading_a_build.py

rm -rf ~/.cache/turnkey
turnkey -i examples/cli/scripts/hello_world.py discover export-pytorch benchmark
turnkey -i examples/turnkey/cli/scripts/hello_world.py discover export-pytorch benchmark
rm -rf ~/.cache/turnkey
turnkey -i examples/cli/scripts/multiple_invocations.py discover export-pytorch benchmark
turnkey -i examples/turnkey/cli/scripts/multiple_invocations.py discover export-pytorch benchmark
rm -rf ~/.cache/turnkey
turnkey -i examples/cli/scripts/max_depth.py discover --max-depth 1 export-pytorch benchmark
turnkey -i examples/turnkey/cli/scripts/max_depth.py discover --max-depth 1 export-pytorch benchmark
rm -rf ~/.cache/turnkey
turnkey -i examples/cli/scripts/two_models.py discover export-pytorch benchmark
turnkey -i examples/turnkey/cli/scripts/two_models.py discover export-pytorch benchmark
rm -rf ~/.cache/turnkey
turnkey -i examples/cli/onnx/hello_world.onnx load-onnx benchmark
turnkey -i examples/turnkey/cli/onnx/hello_world.onnx load-onnx benchmark

# E2E tests
cd test/
cd test/turnkey
python cli.py
python analysis.py
- name: Test example plugins
shell: bash -el {0}
run: |
rm -rf ~/.cache/turnkey
pip install -e examples/cli/plugins/example_tool
turnkey -i examples/cli/scripts/hello_world.py discover export-pytorch example-plugin-tool benchmark
# - name: Install and Start Slurm
# if: runner.os != 'Windows'
# shell: bash -el {0}
# run: |
# sudo apt update -y
# sudo apt install slurm-wlm -y
# cp test/helpers/slurm.conf test/helpers/slurm_modified.conf
# sed -i "s/YOUR_HOSTNAME_HERE/$HOSTNAME/" test/helpers/slurm_modified.conf
# sudo mv test/helpers/slurm_modified.conf /etc/slurm/slurm.conf
# sudo service slurmd start
# sudo service slurmctld start
# sudo service munge start
# - name: Test turnkey on Slurm
# if: runner.os != 'Windows'
# shell: bash -el {0}
# run: |
# # Create conda environment for Slurm using srun (sbatch + wait)
# export SKIP_REQUIREMENTS_INSTALL="True"
# export TORCH_CPU="True"
# srun src/turnkeyml/cli/setup_venv.sh

# # Run tests on Slurm
# export TURNKEY_SLURM_USE_DEFAULT_MEMORY="True"
# turnkey -i models/selftest/linear.py --use-slurm --cache-dir local_cache discover export-pytorch
# bash test/helpers/check_slurm_output.sh slurm-2.out

# Below tests are commented out as the GitHub runner runs out of space installing the requirements
# - name: Check installation of requirements.txt and their compatibility with turnkey
# shell: bash -el {0}
# run: |
# conda create --name test-requirements python=3.8
# conda activate test-requirements
# pip install -r models/requirements.txt
# python -m pip check
# python -c "import torch_geometric"
# conda deactivate
pip install -e examples/turnkey/cli/plugins/example_tool
turnkey -i examples/turnkey/cli/scripts/hello_world.py discover export-pytorch example-plugin-tool benchmark
6 changes: 3 additions & 3 deletions README.md
Expand Up @@ -7,10 +7,10 @@

We are on a mission to make it easy to use the most important tools in the ONNX ecosystem. TurnkeyML accomplishes this by providing no-code CLIs and low-code APIs for both general ONNX workflows with `turnkey` as well as LLMs with `lemonade`.

| [**Lemonade**](https://github.com/onnx/turnkeyml/tree/main/src/turnkeyml/llm) | [**Turnkey**](https://github.com/onnx/turnkeyml/blob/main/docs/classic_getting_started.md) |
| [**Lemonade**](https://github.com/onnx/turnkeyml/tree/main/src/turnkeyml/llm) | [**Turnkey**](https://github.com/onnx/turnkeyml/blob/main/docs/turnkey/getting_started.md) |
|:----------------------------------------------: |:-----------------------------------------------------------------: |
| Serve and benchmark LLMs on CPU, GPU, and NPU. <br/> [Click here to get started with `lemonade`.](https://github.com/onnx/turnkeyml/blob/main/docs/lemonade_getting_started.md) | Export and optimize ONNX models for CNNs and Transformers. <br/> [Click here to get started with `turnkey`.](https://github.com/onnx/turnkeyml/blob/main/docs/classic_getting_started.md) |
| <img src="img/llm_demo.png"/> | <img src="img/classic_demo.png"/> |
| Serve and benchmark LLMs on CPU, GPU, and NPU. <br/> [Click here to get started with `lemonade`.](https://github.com/onnx/turnkeyml/blob/main/docs/lemonade/getting_started.md) | Export and optimize ONNX models for CNNs and Transformers. <br/> [Click here to get started with `turnkey`.](https://github.com/onnx/turnkeyml/blob/main/docs/turnkey/getting_started.md) |
| <img src="https://github.com/onnx/turnkeyml/blob/main/img/llm_demo.png?raw=true"/> | <img src="https://github.com/onnx/turnkeyml/blob/main/img/classic_demo.png?raw=true"/> |


## How It Works
10 changes: 5 additions & 5 deletions docs/code.md
Expand Up @@ -5,9 +5,9 @@
The TurnkeyML source code has a few major top-level directories:
- `docs`: documentation for the entire project.
- `examples`: example scripts for use with the TurnkeyML tools.
- `examples/cli`: tutorial series starting in `examples/cli/readme.md` to help learn the `turnkey` CLI.
- `examples/cli/scripts`: example scripts that can be fed as input into the `turnkey` CLI. These scripts each have a docstring that recommends one or more `turnkey` CLI commands to try out.
- `examples/api`: examples scripts that invoke `Tools` via APIs.
- `examples/turnkey/cli`: tutorial series starting in `examples/turnkey/cli/readme.md` to help learn the `turnkey` CLI.
- `examples/turnkey/cli/scripts`: example scripts that can be fed as input into the `turnkey` CLI. These scripts each have a docstring that recommends one or more `turnkey` CLI commands to try out.
- `examples/turnkey/api`: example scripts that invoke `Tools` via APIs.
- `models`: the corpora of models that make up the TurnkeyML models (see [the models readme](https://github.com/onnx/turnkeyml/blob/main/models/readme.md)).
- Each subdirectory under `models` represents a corpus of models pulled from somewhere on the internet. For example, `models/torch_hub` is a corpus of models from [Torch Hub](https://github.com/pytorch/hub).
- `src/turnkeyml`: source code for the TurnkeyML package.
Expand All @@ -20,8 +20,8 @@ The TurnkeyML source code has a few major top-level directories:
- `src/turnkeyml/state.py`: implements the `State` class.
- `src/turnkeyml/files_api.py`: implements the `evaluate_files()` API, which is the top-level API called by the CLI.
- `test`: tests for the TurnkeyML tools.
- `test/analysis.py`: tests focusing on the `discover` `Tool`.
- `test/cli.py`: tests focusing on top-level CLI features.
- `test/turnkey/analysis.py`: tests focusing on the `discover` `Tool`.
- `test/turnkey/cli.py`: tests focusing on top-level CLI features.
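
The relocations listed above follow a regular pattern, so references to pre-reorg paths in scripts or notes can be translated mechanically. The sketch below is illustrative only: the `RENAMES` table is assembled from the renames visible in this diff, and the helper name `migrate_path` is hypothetical and not part of TurnkeyML. It covers only the paths that appear in this change, so other moved files may need their own entries.

```python
# Old-prefix -> new-prefix pairs, copied from the renames visible in this diff.
# Both RENAMES and migrate_path are hypothetical helpers, not TurnkeyML APIs.
RENAMES = [
    ("docs/classic_getting_started.md", "docs/turnkey/getting_started.md"),
    ("docs/lemonade_getting_started.md", "docs/lemonade/getting_started.md"),
    ("docs/tools_user_guide.md", "docs/turnkey/tools_user_guide.md"),
    ("docs/ort_genai_igpu.md", "docs/lemonade/ort_genai_igpu.md"),
    ("docs/ort_genai_npu.md", "docs/lemonade/ort_genai_npu.md"),
    ("examples/cli/", "examples/turnkey/cli/"),
    ("examples/api/", "examples/turnkey/api/"),
    ("test/unit.py", "test/turnkey/unit.py"),
    ("test/cli.py", "test/turnkey/cli.py"),
    ("test/analysis.py", "test/turnkey/analysis.py"),
    ("test/llm_api.py", "test/lemonade/llm_api.py"),
    ("test/oga_cpu_api.py", "test/lemonade/oga_cpu_api.py"),
]


def migrate_path(path: str) -> str:
    """Return the post-reorg location of a pre-reorg path.

    Paths that match no known rename (including paths that are already in
    the new layout) are returned unchanged.
    """
    for old, new in RENAMES:
        if path.startswith(old):
            # Swap the old prefix for the new one and keep the remainder.
            return new + path[len(old):]
    return path


if __name__ == "__main__":
    for p in ["examples/cli/scripts/hello_world.py", "test/unit.py", "docs/code.md"]:
        print(p, "->", migrate_path(p))
```

Because the old prefixes never match the new ones, applying `migrate_path` to an already-migrated path leaves it untouched.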

## Tool Classes

2 changes: 1 addition & 1 deletion docs/contribute.md
Expand Up @@ -88,7 +88,7 @@ We require the following naming scheme:

### Example

See the [example_tool](https://github.com/onnx/turnkeyml/tree/main/examples/cli/plugins/example_tool) plugin for an example.
See the [example_tool](https://github.com/onnx/turnkeyml/tree/main/examples/turnkey/cli/plugins/example_tool) plugin for an example.

The `__init__.py` file with its `implements` dictionary looks like:

Original file line number Diff line number Diff line change
Expand Up @@ -64,6 +64,16 @@ That command will run a few warmup iterations, then a few generation iterations

The prompt size, number of output tokens, and number of iterations are all parameters. Learn more by running `lemonade huggingface-bench -h`.

## Memory Usage

The peak memory used by the lemonade build is captured in the build output. To capture more granular
memory usage information, use the `--memory` flag. For example:

`lemonade -i facebook/opt-125m --memory huggingface-load huggingface-bench`

In this case a `memory_usage.png` file will be generated and stored in the build folder. This file
contains a figure plotting the memory usage over the build time. Learn more by running `lemonade -h`.
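
To make the idea of "memory usage over the build time" concrete, here is a minimal sketch of periodic memory sampling. It is not lemonade's implementation, which this document does not show. The class name `MemorySampler` is hypothetical, and the sketch uses the standard library's `tracemalloc`, which only sees allocations made through Python's allocator rather than the whole process footprint. Plotting the samples is left out.

```python
import threading
import time
import tracemalloc


class MemorySampler:
    """Record (elapsed_seconds, traced_bytes) pairs while a block of code runs.

    Hypothetical helper for illustration; not part of lemonade.
    """

    def __init__(self, interval: float = 0.01):
        self.interval = interval
        self.samples = []  # list of (elapsed_seconds, traced_bytes)
        self.peak = 0      # peak traced bytes, filled in on exit
        self._stop = threading.Event()
        self._thread = None
        self._start_time = 0.0

    def _sample(self):
        current, _peak = tracemalloc.get_traced_memory()
        self.samples.append((time.perf_counter() - self._start_time, current))

    def _run(self):
        # Event.wait returns False on timeout and True once stop is requested.
        while not self._stop.wait(self.interval):
            self._sample()

    def __enter__(self):
        tracemalloc.start()
        self._start_time = time.perf_counter()
        self._sample()  # always record a starting point
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()
        return self

    def __exit__(self, *exc):
        self._stop.set()
        self._thread.join()
        self._sample()  # always record an end point
        self.peak = tracemalloc.get_traced_memory()[1]
        tracemalloc.stop()
        return False


# Usage: sample memory while a stand-in workload allocates roughly 10 MB.
with MemorySampler() as sampler:
    data = [bytes(1024) for _ in range(10_000)]
    time.sleep(0.05)

print(f"{len(sampler.samples)} samples, peak {sampler.peak} bytes")
```

The sampler records one point on entry and one on exit, so even a very short workload yields at least two samples. A tool that needs whole-process numbers would have to read them from the operating system instead of `tracemalloc`.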

## Serving

You can launch a WebSocket server for your LLM with:
Expand Down Expand Up @@ -111,9 +121,9 @@ You can also try Phi-3-Mini-128k-Instruct with the following commands:

`lemonade -i microsoft/Phi-3-mini-4k-instruct oga-load --device igpu --dtype int4 serve`

You can learn more about the CPU and iGPU support in our [OGA documentation](https://github.com/onnx/turnkeyml/blob/main/docs/ort_genai_igpu.md).
You can learn more about the CPU and iGPU support in our [OGA documentation](https://github.com/onnx/turnkeyml/blob/main/docs/lemonade/ort_genai_igpu.md).

> Note: early access to AMD's RyzenAI NPU is also available. See the [RyzenAI NPU OGA documentation](https://github.com/onnx/turnkeyml/blob/main/docs/ort_genai_npu.md) for more information.
> Note: early access to AMD's RyzenAI NPU is also available. See the [RyzenAI NPU OGA documentation](https://github.com/onnx/turnkeyml/blob/main/docs/lemonade/ort_genai_npu.md) for more information.

## Install RyzenAI NPU for PyTorch

Expand All @@ -131,7 +141,7 @@ If you decide to contribute, please:

- do so via a pull request.
- write your code in keeping with the same style as the rest of this repo's code.
- add a test under `test/llm_api.py` that provides coverage of your new feature.
- add a test under `test/lemonade/llm_api.py` that provides coverage of your new feature.

The best way to contribute is to add new tools to cover more devices and usage scenarios.

File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
39 changes: 33 additions & 6 deletions docs/readme.md
@@ -1,11 +1,38 @@
# TurnkeyML Documentation

This directory contains documentation for the TurnkeyML project:
## LLMs: `lemonade` tooling

The `docs/lemonade` directory has documentation for the LLM-focused `lemonade` tooling:
- [Getting Started](https://github.com/onnx/turnkeyml/blob/main/docs/lemonade/getting_started.md): start here for LLMs.
- Accuracy tests (task performance):
- [HumanEval](https://github.com/onnx/turnkeyml/blob/main/docs/lemonade/humaneval_accuracy.md): details of the HumanEval coding task test.
- [MMLU](https://github.com/onnx/turnkeyml/blob/main/docs/lemonade/mmlu_accuracy.md): details of the MMLU general reasoning test.
- [Perplexity](https://github.com/onnx/turnkeyml/blob/main/docs/lemonade/perplexity.md): details of the Perplexity test for LLMs.
- Tool-specific setup guides:
- [llama.cpp](https://github.com/onnx/turnkeyml/blob/main/docs/lemonade/llamacpp.md)
- OnnxRuntime GenAI:
- [iGPU/NPU hybrid](https://github.com/onnx/turnkeyml/blob/main/docs/lemonade/ort_genai_hybrid.md)
- [iGPU](https://github.com/onnx/turnkeyml/blob/main/docs/lemonade/ort_genai_igpu.md)
- [NPU](https://github.com/onnx/turnkeyml/blob/main/docs/lemonade/ort_genai_npu.md)

## CNNs and Transformers: `turnkey` tooling

The `docs/turnkey` directory contains documentation for the CNN/Transformer-focused `turnkey` tooling:

- [getting_started.md](https://github.com/onnx/turnkeyml/blob/main/docs/turnkey/getting_started.md)
- [tools_user_guide.md](https://github.com/onnx/turnkeyml/blob/main/docs/turnkey/tools_user_guide.md): User guide for the tools: the `turnkey` CLI and the APIs.


There is more useful documentation available in:
- [examples/turnkey/cli/readme.md](https://github.com/onnx/turnkeyml/blob/main/examples/turnkey/cli/readme.md): Tutorial series for learning the `turnkey` CLI.
- [models/readme.md](https://github.com/onnx/turnkeyml/blob/main/models/readme.md): Tutorial for understanding the models and how to use `turnkey` to evaluate the models.

## General Information

This directory also contains documentation for the TurnkeyML project as a whole:

- [code.md](https://github.com/onnx/turnkeyml/blob/main/docs/code.md): Code organization for the tools.
- [install.md](https://github.com/onnx/turnkeyml/blob/main/docs/install.md): Installation instructions for the tools.
- [tools_user_guide.md](https://github.com/onnx/turnkeyml/blob/main/docs/tools_user_guide.md): User guide for the tools: the `turnkey` CLI and the APIs.
- [versioning.md](https://github.com/onnx/turnkeyml/blob/main/docs/versioning.md): Defines the semantic versioning rules for the `turnkey` package.

There is more useful documentation available in:
- [examples/cli/readme.md](https://github.com/onnx/turnkeyml/blob/main/examples/cli/readme.md): Tutorial series for learning the `turnkey` CLI.
- [models/readme.md](https://github.com/onnx/turnkeyml/blob/main/models/readme.md): Tutorial for understanding the models and how to use `turnkey` to evaluate the models.
- [contribute.md](https://github.com/onnx/turnkeyml/blob/main/docs/contribute.md): Contribution guidelines for the project.
- [coverage.md](https://github.com/onnx/turnkeyml/blob/main/docs/coverage.md): How to run code coverage metrics.
Original file line number Diff line number Diff line change
Expand Up @@ -41,8 +41,8 @@ The easiest way to learn more about `turnkey` is to explore the help menu with `
We also provide the following resources:

- [Installation guide](https://github.com/onnx/turnkeyml/blob/main/docs/install.md): how to install from source, set up Slurm, etc.
- [User guide](https://github.com/onnx/turnkeyml/blob/main/docs/tools_user_guide.md): explains the concepts of `turnkey's`, including the syntax for making your own tool sequence.
- [Examples](https://github.com/onnx/turnkeyml/tree/main/examples/cli): PyTorch scripts and ONNX files that can be used to try out `turnkey` concepts.
- [User guide](https://github.com/onnx/turnkeyml/blob/main/docs/turnkey/tools_user_guide.md): explains the concepts of `turnkey`, including the syntax for making your own tool sequence.
- [Examples](https://github.com/onnx/turnkeyml/tree/main/examples/turnkey/cli): PyTorch scripts and ONNX files that can be used to try out `turnkey` concepts.
- [Code organization guide](https://github.com/onnx/turnkeyml/blob/main/docs/code.md): learn how this repository is structured.
- [Models](https://github.com/onnx/turnkeyml/blob/main/models/readme.md): PyTorch model scripts that work with `turnkey`.

Expand Down Expand Up @@ -101,4 +101,4 @@ The build tool has built-in support for a variety of interoperable `Tools`. If y
> turnkey -i my_model.py discover export-pytorch my-custom-tool --my-args
```

All of the built-in `Tools` are implemented against the plugin API. Check out the [example plugins](https://github.com/onnx/turnkeyml/tree/main/examples/cli/plugins) and the [plugin API guide](https://github.com/onnx/turnkeyml/blob/main/docs/contribute.md#contributing-a-plugin) to learn more about creating an installable plugin.
All of the built-in `Tools` are implemented against the plugin API. Check out the [example plugins](https://github.com/onnx/turnkeyml/tree/main/examples/turnkey/cli/plugins) and the [plugin API guide](https://github.com/onnx/turnkeyml/blob/main/docs/contribute.md#contributing-a-plugin) to learn more about creating an installable plugin.
Original file line number Diff line number Diff line change
Expand Up @@ -108,7 +108,7 @@ Name of one or more script (.py), ONNX (.onnx), or cached build (_state.yaml) fi
Examples:
- `turnkey -i models/selftest/linear.py`
- `turnkey -i models/selftest/linear.py models/selftest/twolayer.py`
- `turnkey -i examples/cli/onnx/sample.onnx`
- `turnkey -i examples/turnkey/cli/onnx/sample.onnx`

You may also use [Bash regular expressions](https://tldp.org/LDP/Bash-Beginners-Guide/html/sect_04_01.html) to locate the files you want to benchmark.

18 changes: 18 additions & 0 deletions examples/lemonade/README.md
@@ -0,0 +1,18 @@
# Lemonade Examples

This folder contains examples of how to use `lemonade` via the high-level LEAP APIs. These APIs make it easy to load a model, generate responses, and also show how to stream those responses.

The `demos/` folder also contains some higher-level application demos of the LEAP APIs. Learn more in `demos/README.md`.

## LEAP Examples

This table shows which LEAP examples are available:

| Framework | CPU | GPU | NPU | Hybrid |
|----------------------------|---------------------------|------------------|-----------------|--------------------|
| Huggingface | leap_basic.py | - | - | - |
| OGA | leap_oga_cpu.py | leap_oga_igpu.py | leap_oga_npu.py | leap_oga_hybrid.py |
| Huggingface with streaming | leap_streaming.py | - | - | - |
| OGA with streaming | leap_oga_cpu_streaming.py | leap_oga_igpu_streaming.py | leap_oga_npu_streaming.py | leap_oga_hybrid_streaming.py |

To run a LEAP example, first set up a conda environment with the appropriate framework and backend support. Then run the scripts with a command like `python leap_basic.py`.
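
The script names in the table follow a simple naming pattern, which the sketch below encodes as a lookup. The helper `leap_example` and its argument values (`"huggingface"`, `"oga"`, `"igpu"`, and so on) are hypothetical conveniences for picking a script; they are not part of `lemonade`. The script names themselves are taken from the table.

```python
from typing import Optional

# Devices for which the table lists OGA examples (the "GPU" column holds
# the iGPU scripts).
OGA_DEVICES = ("cpu", "igpu", "npu", "hybrid")


def leap_example(framework: str, device: str, streaming: bool = False) -> Optional[str]:
    """Return the example script for a framework/device pair, or None.

    Hypothetical helper; it mirrors the table of LEAP examples.
    """
    if framework == "huggingface":
        # The table lists Huggingface examples for CPU only.
        if device != "cpu":
            return None
        return "leap_streaming.py" if streaming else "leap_basic.py"
    if framework == "oga" and device in OGA_DEVICES:
        suffix = "_streaming" if streaming else ""
        return f"leap_oga_{device}{suffix}.py"
    return None


if __name__ == "__main__":
    script = leap_example("oga", "cpu", streaming=True)
    print(f"python {script}")
```

A `None` result corresponds to a `-` cell in the table, meaning no example is provided for that combination.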