feat: Runtime output buffer optimization #3276

keehyuna · 2024-11-04T13:14:19Z

Description

Hides latency by creating the output tensor for the next output buffer ahead of time.

Fixes #3275

Type of change

New feature (non-breaking change which adds functionality)

Checklist:

My code follows the style guidelines of this project (You can use the linters)
I have performed a self-review of my own code
I have commented my code, particularly in hard-to-understand areas and hacks
I have made corresponding changes to the documentation
I have added tests to verify my fix or my feature
New and existing unit tests pass locally with my changes
I have added the relevant labels to my PR so that relevant reviewers are notified

narendasan · 2024-11-15T18:53:19Z

@keehyuna I believe that @peri044 has a shape inference utility for TRT engines in one of his PRs. Can we make sure we use that, or are at least compatible with it?

keehyuna · 2024-11-18T12:51:46Z

> @keehyuna I believe that @peri044 has some shape inference for TRT engines utility in one of his PRs can we make sure we use that or are compatible with it?

I think this PR is unrelated to fake tensors: the output shape is inferred from the TRT functions (`getTensorShape` / `get_tensor_shape`) in both the C++ and Python runtimes.
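For readers following the thread, here is a minimal C++ sketch of the idea in this reply: output buffers are sized from shapes that the runtime reports, not from fake tensors. Everything in it is illustrative and not taken from the PR. `OutputBuffer`, `ShapeQuery`, and `create_output_buffers` are hypothetical names, the callback stands in for the TensorRT shape query, and plain `std::vector<float>` storage stands in for `at::Tensor`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <string>
#include <vector>

// Hypothetical stand-in for an output tensor: a name, a shape, and storage.
struct OutputBuffer {
  std::string name;
  std::vector<int64_t> shape;
  std::vector<float> data;
};

// Hypothetical stand-in for the runtime's shape query (assumption: it returns
// fully resolved dimensions for the named output).
using ShapeQuery = std::function<std::vector<int64_t>(const std::string&)>;

// Allocate one buffer per output, sized from the shape the query reports.
std::vector<OutputBuffer> create_output_buffers(
    const std::vector<std::string>& output_names,
    const ShapeQuery& get_shape) {
  std::vector<OutputBuffer> outputs;
  outputs.reserve(output_names.size());
  for (const auto& name : output_names) {
    std::vector<int64_t> shape = get_shape(name);
    int64_t numel = 1;
    for (int64_t d : shape) {
      assert(d >= 0 && "dynamic dimensions must be resolved before allocation");
      numel *= d;
    }
    outputs.push_back(
        {name, shape, std::vector<float>(static_cast<std::size_t>(numel))});
  }
  return outputs;
}
```

Calling `create_output_buffers({"out0"}, query)` with a query that reports `{2, 3}` yields a single buffer holding 6 elements.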

peri044 · 2024-11-18T23:17:14Z

core/runtime/execute_engine.cpp

@@ -263,19 +284,15 @@ std::vector<at::Tensor> execute_engine(std::vector<at::Tensor> inputs, c10::intr
      output_profiler_guard =
          std::make_unique<torch::autograd::profiler::RecordProfile>(compiled_engine->output_profile_path);
    }
+    if ((false == compiled_engine->use_pre_allocated_outputs) || shape_changed) {


Could this be written as `!compiled_engine->use_pre_allocated_outputs`?
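A small sketch of the condition under discussion, written with the suggested `!` form. It assumes that `shape_changed` comes from comparing a key built from the current input shapes against the previous one, and that the key starts as a non-empty sentinel so that a call with no input tensors (empty key) still registers as a change the first time; that reading is based on the commit title in this PR, not on the actual implementation. `needs_new_outputs`, `make_shape_key`, and `ShapeTracker` are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Fresh outputs are needed when pre-allocation is off or the shapes changed.
bool needs_new_outputs(bool use_pre_allocated_outputs, bool shape_changed) {
  return !use_pre_allocated_outputs || shape_changed;
}

// Build a string key from the input shapes, e.g. {{2, 3}} -> "(2,3,)".
std::string make_shape_key(const std::vector<std::vector<int64_t>>& shapes) {
  std::string key;
  for (const auto& shape : shapes) {
    key += "(";
    for (int64_t d : shape) {
      key += std::to_string(d) + ",";
    }
    key += ")";
  }
  return key;
}

// Remembers the last key. The initial value is a non-empty sentinel
// (hypothetical choice) so an empty input list differs from it.
struct ShapeTracker {
  std::string last_key = "<unset>";

  // Returns true when the shapes differ from the previous call.
  bool update(const std::vector<std::vector<int64_t>>& shapes) {
    std::string key = make_shape_key(shapes);
    bool changed = (key != last_key);
    last_key = key;
    return changed;
  }
};
```

With this layout, the only case that reuses existing buffers is `use_pre_allocated_outputs == true` together with unchanged shapes.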

peri044 · 2024-11-18T23:20:19Z

core/runtime/execute_engine.cpp

+  return false;
+}
+
+std::vector<at::Tensor> create_output_tensors(c10::intrusive_ptr<TRTEngine> compiled_engine) {


Can we factor the input allocation/creation in execute_engine out into a function, similar to this? (I posted a similar comment in your wrapper module PR.)

narendasan

Create a context manager to enable this across subgraphs
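The comment most likely refers to a Python context manager. To stay in the language of the diff, the same scoped enable-then-restore idea can be sketched in C++ as an RAII guard. This is only an illustration under stated assumptions: `MockEngine` and `PreAllocatedOutputsGuard` are hypothetical, and the only thing borrowed from the PR is the `use_pre_allocated_outputs` flag name.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for one compiled subgraph engine.
struct MockEngine {
  bool use_pre_allocated_outputs = false;
};

// Sets the flag on every engine for the lifetime of the guard and restores
// each engine's previous value on destruction.
class PreAllocatedOutputsGuard {
 public:
  PreAllocatedOutputsGuard(std::vector<MockEngine*> engines, bool enable)
      : engines_(std::move(engines)) {
    previous_.reserve(engines_.size());
    for (MockEngine* engine : engines_) {
      assert(engine != nullptr);
      previous_.push_back(engine->use_pre_allocated_outputs);
      engine->use_pre_allocated_outputs = enable;
    }
  }

  ~PreAllocatedOutputsGuard() {
    for (std::size_t i = 0; i < engines_.size(); ++i) {
      engines_[i]->use_pre_allocated_outputs = previous_[i];
    }
  }

  // Non-copyable so the flags are restored exactly once.
  PreAllocatedOutputsGuard(const PreAllocatedOutputsGuard&) = delete;
  PreAllocatedOutputsGuard& operator=(const PreAllocatedOutputsGuard&) = delete;

 private:
  std::vector<MockEngine*> engines_;
  std::vector<bool> previous_;
};
```

Constructing the guard with pointers to all subgraph engines enables the flag for the enclosing scope; when the scope ends, each engine returns to whatever setting it had before.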

facebook-github-bot added the cla signed label Nov 4, 2024

github-actions bot added the component: core, component: api [Python], component: runtime, and component: dynamo labels Nov 4, 2024

github-actions bot requested a review from peri044 November 4, 2024 13:14

keehyuna mentioned this pull request Nov 4, 2024

📖 [Story] Optimize the launch overhead of TRT engine and pytorch kernels #3274

Open

keehyuna self-assigned this Nov 4, 2024

keehyuna force-pushed the opt_out_buffer branch from d0ef3cd to 377248e Compare November 14, 2024 06:44

keehyuna marked this pull request as ready for review November 14, 2024 14:27

keehyuna requested a review from narendasan November 14, 2024 14:28

keehyuna added 4 commits November 18, 2024 21:15

feat: Runtime output buffer optimization

b20830b

chore: setting for test

998c0c6

chore: Initialize shape key as non-empty string to validate no input …

210ae8b

…tensor

chore: rebase and rename variable

4a5f0d1

keehyuna force-pushed the opt_out_buffer branch from 0a98180 to 4a5f0d1 Compare November 18, 2024 12:42

peri044 reviewed Nov 18, 2024

narendasan reviewed Nov 22, 2024
