From f2e1e6c7d20068f6b21edbee4bdb4218d715dc75 Mon Sep 17 00:00:00 2001
From: lanluo-nvidia
Date: Wed, 16 Oct 2024 14:23:01 -0700
Subject: [PATCH] cherry pick doc fix #3238 from main to release2.5 (#3240)

Co-authored-by: Dheeraj Peri
---
 docsrc/index.rst                              | 48 ++++++++++++++-----
 docsrc/tutorials/notebooks.rst                |  5 +-
 examples/README.rst                           |  5 +-
 examples/dynamo/README.rst                    | 29 ++++++-----
 .../dynamo/torch_compile_resnet_example.py    |  2 +-
 .../dynamo/torch_compile_stable_diffusion.py  |  2 +-
 .../torch_compile_transformers_example.py     |  4 +-
 examples/dynamo/torch_export_gpt2.py          | 13 ++---
 examples/dynamo/torch_export_llama2.py        | 14 +++---
 9 files changed, 75 insertions(+), 47 deletions(-)

diff --git a/docsrc/index.rst b/docsrc/index.rst
index 757acc2011..7a91e763ad 100644
--- a/docsrc/index.rst
+++ b/docsrc/index.rst
@@ -48,10 +48,35 @@ User Guide
    user_guide/saving_models
    user_guide/runtime
    user_guide/using_dla
+
+
+Tutorials
+------------
+
+* :ref:`torch_compile_advanced_usage`
+* :ref:`vgg16_ptq`
+* :ref:`engine_caching_example`
+* :ref:`engine_caching_bert_example`
+* :ref:`refit_engine_example`
+* :ref:`serving_torch_tensorrt_with_triton`
+* :ref:`torch_export_cudagraphs`
+* :ref:`custom_kernel_plugins`
+* :ref:`mutable_torchtrt_module_example`
+
+.. toctree::
+   :caption: Tutorials
+   :maxdepth: 1
+   :hidden:
+
    tutorials/_rendered_examples/dynamo/torch_compile_advanced_usage
    tutorials/_rendered_examples/dynamo/vgg16_ptq
    tutorials/_rendered_examples/dynamo/engine_caching_example
+   tutorials/_rendered_examples/dynamo/engine_caching_bert_example
    tutorials/_rendered_examples/dynamo/refit_engine_example
+   tutorials/serving_torch_tensorrt_with_triton
+   tutorials/_rendered_examples/dynamo/torch_export_cudagraphs
+   tutorials/_rendered_examples/dynamo/custom_kernel_plugins
+   tutorials/_rendered_examples/dynamo/mutable_torchtrt_module_example
 
 Dynamo Frontend
 ----------------
@@ -97,27 +122,28 @@ FX Frontend
 
    fx/getting_started_with_fx_path
 
-Tutorials
+Model Zoo
 ------------
-* :ref:`torch_tensorrt_tutorials`
-* :ref:`serving_torch_tensorrt_with_triton`
+* :ref:`torch_compile_resnet`
+* :ref:`torch_compile_transformer`
+* :ref:`torch_compile_stable_diffusion`
+* :ref:`torch_export_gpt2`
+* :ref:`torch_export_llama2`
 * :ref:`notebooks`
 
 .. toctree::
-   :caption: Tutorials
+   :caption: Model Zoo
    :maxdepth: 3
    :hidden:
-
-   tutorials/serving_torch_tensorrt_with_triton
-   tutorials/notebooks
+
    tutorials/_rendered_examples/dynamo/torch_compile_resnet_example
    tutorials/_rendered_examples/dynamo/torch_compile_transformers_example
    tutorials/_rendered_examples/dynamo/torch_compile_stable_diffusion
-   tutorials/_rendered_examples/dynamo/torch_export_cudagraphs
-   tutorials/_rendered_examples/dynamo/custom_kernel_plugins
    tutorials/_rendered_examples/distributed_inference/data_parallel_gpt2
    tutorials/_rendered_examples/distributed_inference/data_parallel_stable_diffusion
-   tutorials/_rendered_examples/dynamo/mutable_torchtrt_module_example
+   tutorials/_rendered_examples/dynamo/torch_export_gpt2
+   tutorials/_rendered_examples/dynamo/torch_export_llama2
+   tutorials/notebooks
 
 Python API Documentation
 ------------------------
@@ -214,4 +240,4 @@ Legacy Further Information (TorchScript)
 * `GTC 2021 Fall Talk `_
 * `PyTorch Ecosystem Day 2021 `_
 * `PyTorch Developer Conference 2021 `_
-* `PyTorch Developer Conference 2022 `_
+* `PyTorch Developer Conference 2022 `_
\ No newline at end of file
diff --git a/docsrc/tutorials/notebooks.rst b/docsrc/tutorials/notebooks.rst
index 14737a8f63..509676d83a 100644
--- a/docsrc/tutorials/notebooks.rst
+++ b/docsrc/tutorials/notebooks.rst
@@ -1,10 +1,9 @@
 .. _notebooks:
 
-Example notebooks
+Legacy notebooks
 ===================
 
-There exists a number of notebooks which cover specific using specific features and models
-with Torch-TensorRT
+There exists a number of notebooks which demonstrate different model conversions / features / frontends available within Torch-TensorRT
 
 Notebooks
 ------------
diff --git a/examples/README.rst b/examples/README.rst
index 7c21aad732..be67c27e61 100644
--- a/examples/README.rst
+++ b/examples/README.rst
@@ -1,7 +1,4 @@
 .. _torch_tensorrt_tutorials:
 
 Torch-TensorRT Tutorials
-===========================
-
-The user guide covers the basic concepts and usage of Torch-TensorRT.
-We also provide a number of tutorials to explore specific usecases and advanced concepts
+===========================
\ No newline at end of file
diff --git a/examples/dynamo/README.rst b/examples/dynamo/README.rst
index ff3563cffe..82375c8995 100644
--- a/examples/dynamo/README.rst
+++ b/examples/dynamo/README.rst
@@ -1,19 +1,22 @@
-.. _torch_compile:
+.. _torch_tensorrt_examples:
 
-Dynamo / ``torch.compile``
-----------------------------
+Here we provide examples of Torch-TensorRT compilation of popular computer vision and language models.
 
-Torch-TensorRT provides a backend for the new ``torch.compile`` API released in PyTorch 2.0. In the following examples we describe
-a number of ways you can leverage this backend to accelerate inference.
+Dependencies
+------------------------------------
+Please install the following external dependencies (assuming you already have correct `torch`, `torch_tensorrt` and `tensorrt` libraries installed (`dependencies `_))
+
+.. code-block:: python
+
+   pip install -r requirements.txt
+
+
+Model Zoo
+------------------------------------
 
 * :ref:`torch_compile_resnet`: Compiling a ResNet model using the Torch Compile Frontend for ``torch_tensorrt.compile``
 * :ref:`torch_compile_transformer`: Compiling a Transformer model using ``torch.compile``
-* :ref:`torch_compile_advanced_usage`: Advanced usage including making a custom backend to use directly with the ``torch.compile`` API
 * :ref:`torch_compile_stable_diffusion`: Compiling a Stable Diffusion model using ``torch.compile``
-* :ref:`torch_export_cudagraphs`: Using the Cudagraphs integration with `ir="dynamo"`
-* :ref:`custom_kernel_plugins`: Creating a plugin to use a custom kernel inside TensorRT engines
-* :ref:`refit_engine_example`: Refitting a compiled TensorRT Graph Module with updated weights
-* :ref:`mutable_torchtrt_module_example`: Compile, use, and modify TensorRT Graph Module with MutableTorchTensorRTModule
-* :ref:`vgg16_fp8_ptq`: Compiling a VGG16 model with FP8 and PTQ using ``torch.compile``
-* :ref:`engine_caching_example`: Utilizing engine caching to speed up compilation times
-* :ref:`engine_caching_bert_example`: Demonstrating engine caching on BERT
+* :ref:`_torch_export_gpt2`: Compiling a GPT2 model using AOT workflow (`ir=dynamo`)
+* :ref:`_torch_export_llama2`: Compiling a Llama2 model using AOT workflow (`ir=dynamo`)
+
diff --git a/examples/dynamo/torch_compile_resnet_example.py b/examples/dynamo/torch_compile_resnet_example.py
index 420c5390d3..f852d60158 100644
--- a/examples/dynamo/torch_compile_resnet_example.py
+++ b/examples/dynamo/torch_compile_resnet_example.py
@@ -1,7 +1,7 @@
 """
 .. _torch_compile_resnet:
 
-Compiling ResNet using the Torch-TensorRT `torch.compile` Backend
+Compiling ResNet with dynamic shapes using the `torch.compile` backend
 ==========================================================
 
 This interactive script is intended as a sample of the Torch-TensorRT workflow with `torch.compile` on a ResNet model."""
diff --git a/examples/dynamo/torch_compile_stable_diffusion.py b/examples/dynamo/torch_compile_stable_diffusion.py
index a0b725572b..fe49da74d1 100644
--- a/examples/dynamo/torch_compile_stable_diffusion.py
+++ b/examples/dynamo/torch_compile_stable_diffusion.py
@@ -1,7 +1,7 @@
 """
 .. _torch_compile_stable_diffusion:
 
-Torch Compile Stable Diffusion
+Compiling Stable Diffusion model using the `torch.compile` backend
 ======================================================
 
 This interactive script is intended as a sample of the Torch-TensorRT workflow with `torch.compile` on a Stable Diffusion model. A sample output is featured below:
diff --git a/examples/dynamo/torch_compile_transformers_example.py b/examples/dynamo/torch_compile_transformers_example.py
index 01d46e96f6..221ecd4fd1 100644
--- a/examples/dynamo/torch_compile_transformers_example.py
+++ b/examples/dynamo/torch_compile_transformers_example.py
@@ -1,10 +1,10 @@
 """
 .. _torch_compile_transformer:
 
-Compiling a Transformer using torch.compile and TensorRT
+Compiling BERT using the `torch.compile` backend
 ==============================================================
 
-This interactive script is intended as a sample of the Torch-TensorRT workflow with `torch.compile` on a transformer-based model."""
+This interactive script is intended as a sample of the Torch-TensorRT workflow with `torch.compile` on a BERT model."""
 
 # %%
 # Imports and Model Definition
diff --git a/examples/dynamo/torch_export_gpt2.py b/examples/dynamo/torch_export_gpt2.py
index a26305e4a3..9be69d3830 100644
--- a/examples/dynamo/torch_export_gpt2.py
+++ b/examples/dynamo/torch_export_gpt2.py
@@ -1,10 +1,10 @@
 """
 .. _torch_export_gpt2:
 
-Compiling GPT2 using the Torch-TensorRT with dynamo backend
+Compiling GPT2 using the dynamo backend
 ==========================================================
 
-This interactive script is intended as a sample of the Torch-TensorRT workflow with dynamo backend on a GPT2 model."""
+This script illustrates Torch-TensorRT workflow with dynamo backend on popular GPT2 model."""
 
 # %%
 # Imports and Model Definition
@@ -78,9 +78,10 @@
     tokenizer.decode(trt_gen_tokens[0], skip_special_tokens=True),
 )
 
-# %%
-# The output sentences should look like
+# Prompt : What is parallel programming ?
+
 # =============================
-# Pytorch model generated text: I enjoy walking with my cute dog, but I'm not sure if I'll ever be able to walk with my dog. I'm not sure if I'll ever be able to walk with my
+# Pytorch model generated text: The parallel programming paradigm is a set of programming languages that are designed to be used in parallel. The main difference between parallel programming and parallel programming is that
+
 # =============================
-# TensorRT model generated text: I enjoy walking with my cute dog, but I'm not sure if I'll ever be able to walk with my dog. I'm not sure if I'll ever be able to walk with my
+# TensorRT model generated text: The parallel programming paradigm is a set of programming languages that are designed to be used in parallel. The main difference between parallel programming and parallel programming is that
diff --git a/examples/dynamo/torch_export_llama2.py b/examples/dynamo/torch_export_llama2.py
index 195944688b..a24940a353 100644
--- a/examples/dynamo/torch_export_llama2.py
+++ b/examples/dynamo/torch_export_llama2.py
@@ -1,10 +1,10 @@
 """
 .. _torch_export_llama2:
 
-Compiling Llama2 using the Torch-TensorRT with dynamo backend
+Compiling Llama2 using the dynamo backend
 ==========================================================
 
-This interactive script is intended as a sample of the Torch-TensorRT workflow with dynamo backend on a Llama2 model."""
+This script illustrates Torch-TensorRT workflow with dynamo backend on popular Llama2 model."""
 
 # %%
 # Imports and Model Definition
@@ -82,9 +82,11 @@
     )[0],
 )
 
-# %%
-# The output sentences should look like
+
+# Prompt : What is dynamic programming?
+
 # =============================
-# Pytorch model generated text: I enjoy walking with my cute dog, but I'm not sure if I'll ever be able to walk with my dog. I'm not sure if I'll ever be able to walk with my
+# Pytorch model generated text: Dynamic programming is an algorithmic technique used to solve complex problems by breaking them down into smaller subproblems, solving each subproblem only once, and
+
 # =============================
-# TensorRT model generated text: I enjoy walking with my cute dog, but I'm not sure if I'll ever be able to walk with my dog. I'm not sure if I'll ever be able to walk with my
+# TensorRT model generated text: Dynamic programming is an algorithmic technique used to solve complex problems by breaking them down into smaller subproblems, solving each subproblem only once, and
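
Note on the `:ref:` entries in this patch: the example scripts define their labels as `.. _torch_export_gpt2:` and `.. _torch_export_llama2:`, and Sphinx resolves `:ref:` targets against the label name without the leading underscore. The two new entries in `examples/dynamo/README.rst` are written `:ref:`_torch_export_gpt2`` and `:ref:`_torch_export_llama2``, so they would likely not resolve, whereas `docsrc/index.rst` uses the form without the underscore. The following is a minimal sketch of a pre-build check, assuming labels always take the form `.. _name:` and references take the form `:ref:`name`` or `:ref:`title <name>``. The helper `unresolved_refs` and the sample text are hypothetical and not part of this patch; Sphinx itself remains the authority on label resolution.

```python
import re

# Explicit target definitions such as ".. _torch_export_gpt2:"
LABEL_RE = re.compile(r"^\.\. _([^:`]+):[ \t]*$", re.MULTILINE)
# References written as :ref:`name` or :ref:`title <name>`
REF_RE = re.compile(r":ref:`(?:[^`<]*<)?([^`>]+)>?`")


def unresolved_refs(rst_text, extra_labels=()):
    """Return the sorted :ref: targets in rst_text that match no known label.

    extra_labels stands in for labels defined in other files (hypothetical
    parameter; a real check would collect labels across the whole doc tree).
    """
    labels = {name.lower() for name in LABEL_RE.findall(rst_text)}
    labels.update(name.lower() for name in extra_labels)
    refs = [name.strip().lower() for name in REF_RE.findall(rst_text)]
    return sorted({ref for ref in refs if ref not in labels})


# Illustrative sample only: one label defined here, one supplied externally.
sample = """\
.. _torch_export_gpt2:

* :ref:`torch_export_gpt2`: resolves against the label above
* :ref:`_torch_export_llama2`: leading underscore, no matching label
"""

print(unresolved_refs(sample, extra_labels=["torch_export_llama2"]))
# prints ['_torch_export_llama2']
```

The check is purely textual, so it reports the underscore-prefixed target even though a label with the unprefixed name exists, which is the mismatch described above.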