diff --git a/cm-mlops/script/app-loadgen-generic-python/README-extra.md b/cm-mlops/script/app-loadgen-generic-python/README-extra.md
index 9e7779abfc..414cef42b1 100644
--- a/cm-mlops/script/app-loadgen-generic-python/README-extra.md
+++ b/cm-mlops/script/app-loadgen-generic-python/README-extra.md
@@ -219,11 +219,11 @@ These cases worked on Windows and Linux but may require GPU with > 8GB memory:
 cmr "python app loadgen-generic _onnxruntime _cuda _custom _huggingface _model-stub.steerapi/Llama-2-7b-chat-hf-onnx-awq-w8" --adr.hf-downloader.model_filename=onnx/decoder_model_merged_quantized.onnx,onnx/decoder_model_merged_quantized.onnx_data --samples=2
 cmr "python app loadgen-generic _onnxruntime _cuda _custom _huggingface _model-stub.alpindale/Llama-2-13b-ONNX" --adr.hf-downloader.model_filename=FP32/LlamaV2_13B_float32.onnx --adr.hf-downloader.full_subfolder=FP32 --samples=2
 cmr "python app loadgen-generic _onnxruntime _cuda _custom _huggingface _model-stub.Intel/gpt-j-6B-int8-static" --adr.hf-downloader.model_filename=model.onnx --adr.hf-downloader.full_subfolder=. --samples=2
-cmr "python app loadgen-generic _onnxruntime _cuda _custom _huggingface _model-stub.microsoft/Mistral-7B-v0.1-onnx" --adr.hf-downloader.model_filename=Mistral-7B-v0.1.onnx,Mistral-7B-v0.1.onnx.data --samples=2
 ```
 
 TBD: some cases that are not yet fully supported (data types, etc):
 
 ```bash
+cmr "python app loadgen-generic _onnxruntime _cuda _custom _huggingface _model-stub.microsoft/Mistral-7B-v0.1-onnx" --adr.hf-downloader.model_filename=Mistral-7B-v0.1.onnx,Mistral-7B-v0.1.onnx.data --samples=2
 cmr "python app loadgen-generic _onnxruntime _cuda _custom _huggingface _model-stub.alpindale/Llama-2-7b-ONNX" --adr.hf-downloader.model_filename=FP16/LlamaV2_7B_float16.onnx --adr.hf-downloader.full_subfolder=FP16 --samples=2
 ```