[DOCS] Torch.compile() documentation for 23.1 (#19540)
Co-authored-by: Karol Blaszczak <karol.blaszczak@intel.com>
@@ -17,6 +17,7 @@
Running Inference <openvino_docs_OV_UG_OV_Runtime_User_Guide>
Deployment on a Local System <openvino_deployment_guide>
Deployment on a Model Server <ovms_what_is_openvino_model_server>
pytorch_2_0_torch_compile

| :doc:`Model Preparation <openvino_docs_model_processing_introduction>`
157
docs/Documentation/torch_compile.md
Normal file
@@ -0,0 +1,157 @@
# PyTorch Deployment via "torch.compile" {#pytorch_2_0_torch_compile}
@sphinxdirective

The ``torch.compile`` feature enables you to use OpenVINO for PyTorch-native applications.
It speeds up PyTorch code by JIT-compiling it into optimized kernels.
By default, Torch code runs in eager mode but, with the use of ``torch.compile``, it goes through the following steps:

1. **Graph acquisition** - the model is rewritten as blocks of subgraphs that are either:

   * compiled by TorchDynamo and "flattened",
   * falling back to eager mode, due to unsupported Python constructs (like control-flow code).

2. **Graph lowering** - all PyTorch operations are decomposed into their constituent kernels, specific to the chosen backend.
3. **Graph compilation** - the kernels call their corresponding low-level, device-specific operations.

How to Use
#################

To use ``torch.compile``, you need to add an import statement and define one of the two available backends:

| ``openvino``
| With this backend, Torch FX subgraphs are directly converted to OpenVINO representation without any additional PyTorch-based tracing/scripting.

| ``openvino_ts``
| With this backend, Torch FX subgraphs are first traced/scripted with PyTorch TorchScript, and then converted to OpenVINO representation.

.. tab-set::

   .. tab-item:: openvino
      :sync: backend-openvino

      .. code-block:: python

         import openvino.torch
         ...
         model = torch.compile(model, backend='openvino')

      Execution diagram:

      .. image:: _static/images/torch_compile_backend_openvino.svg
         :width: 992px
         :height: 720px
         :scale: 60%
         :align: center

   .. tab-item:: openvino_ts
      :sync: backend-openvino-ts

      .. code-block:: python

         import openvino.torch
         ...
         model = torch.compile(model, backend='openvino_ts')

      Execution diagram:

      .. image:: _static/images/torch_compile_backend_openvino_ts.svg
         :width: 1088px
         :height: 720px
         :scale: 60%
         :align: center

Environment Variables
+++++++++++++++++++++++++++

* **OPENVINO_TORCH_BACKEND_DEVICE**: enables selecting a specific hardware device to run the application.
  By default, the OpenVINO backend for ``torch.compile`` runs PyTorch applications on the CPU. Setting
  this variable to ``GPU.0``, for example, will make the application use the integrated graphics processor instead.
* **OPENVINO_TORCH_MODEL_CACHING**: enables saving the optimized model files to a hard drive after the first application run.
  This makes them available for subsequent application executions, reducing the first-inference latency.
  By default, this variable is set to ``False``. Setting it to ``True`` enables caching.
* **OPENVINO_TORCH_CACHE_DIR**: enables defining a custom directory for the model files (if model caching is set to ``True``).
  By default, the OpenVINO IR is saved in the ``cache`` sub-directory, created in the application's root directory.

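As an illustration, these variables can also be set from within the script itself, as long as this happens before the first compiled call triggers compilation. A minimal sketch (the device and directory values shown are examples):

```python
import os

# Set before the first call to a compiled model, since the backend reads
# these variables when compilation is triggered.
os.environ["OPENVINO_TORCH_BACKEND_DEVICE"] = "CPU"    # e.g. "GPU.0" for iGPU
os.environ["OPENVINO_TORCH_MODEL_CACHING"] = "True"    # enable model caching
os.environ["OPENVINO_TORCH_CACHE_DIR"] = "./ov_cache"  # custom cache directory

print(os.environ["OPENVINO_TORCH_BACKEND_DEVICE"])  # CPU
```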

Windows support
++++++++++++++++++++++++++

Currently, PyTorch does not officially support the ``torch.compile`` feature on Windows. However, it can be enabled
by following the instructions below:

1. Install the PyTorch nightly wheel file - `2.1.0.dev20230713 <https://download.pytorch.org/whl/nightly/cpu/torch-2.1.0.dev20230713%2Bcpu-cp38-cp38-win_amd64.whl>`__,
2. Update the file at ``<python_env_root>/Lib/site-packages/torch/_dynamo/eval_frames.py``,
3. Find the function called ``check_if_dynamo_supported()``:

   .. code-block:: python

      def check_if_dynamo_supported():
          if sys.platform == "win32":
              raise RuntimeError("Windows not yet supported for torch.compile")
          if sys.version_info >= (3, 11):
              raise RuntimeError("Python 3.11+ not yet supported for torch.compile")

4. Comment out the first two lines of this function, so it looks like this:

   .. code-block:: python

      def check_if_dynamo_supported():
          #if sys.platform == "win32":
          #    raise RuntimeError("Windows not yet supported for torch.compile")
          if sys.version_info >= (3, 11):
              raise RuntimeError("Python 3.11+ not yet supported for torch.compile")

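To confirm whether the patch is needed on a given setup, the same two conditions can be checked at runtime. This is a hedged sketch mirroring ``check_if_dynamo_supported()``, not part of the official instructions:

```python
import sys

def dynamo_supported() -> bool:
    # Mirrors the two conditions tested by check_if_dynamo_supported():
    # Windows and Python 3.11+ were unsupported at the time of writing.
    if sys.platform == "win32":
        return False
    if sys.version_info >= (3, 11):
        return False
    return True

print(dynamo_supported())
```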

Support for Automatic1111 Stable Diffusion WebUI
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

Automatic1111 Stable Diffusion WebUI is an open-source repository that hosts a browser-based interface for Stable
Diffusion-based image generation. It allows users to create realistic and creative images from text prompts.
Stable Diffusion WebUI is supported on Intel CPUs, Intel integrated GPUs, and Intel discrete GPUs by leveraging the OpenVINO
``torch.compile`` capability. Detailed instructions are available in the
`Stable Diffusion WebUI repository <https://github.com/openvinotoolkit/stable-diffusion-webui/wiki/Installation-on-Intel-Silicon>`__.

Architecture
#################

The ``torch.compile`` feature is part of PyTorch 2.0 and is based on:

* **TorchDynamo** - a Python-level JIT that hooks into the frame evaluation API in CPython
  (PEP 523) to dynamically modify Python bytecode right before it is executed (PyTorch operators
  that cannot be extracted into an FX graph are executed in the native Python environment).
  It maintains the eager-mode capabilities using
  `Guards <https://pytorch.org/docs/stable/dynamo/guards-overview.html>`__ to ensure the
  generated graphs are valid.
* **AOTAutograd** - generates the backward graph corresponding to the forward graph captured by TorchDynamo.
* **PrimTorch** - decomposes complicated PyTorch operations into simpler, more elementary ops.
* **TorchInductor** - a deep learning compiler that generates fast code for multiple accelerators and backends.

When a PyTorch module is wrapped with ``torch.compile``, TorchDynamo traces the module and
rewrites its Python bytecode to extract sequences of PyTorch operations into an FX graph,
which can then be optimized by the OpenVINO backend. The Torch FX graphs are first converted to
inlined FX graphs, and the graph partitioning module traverses each inlined FX graph to identify
operators supported by OpenVINO.

All the supported operators are clustered into OpenVINO submodules, converted to the OpenVINO
graph using OpenVINO's PyTorch decoder, and executed in an optimized manner using the OpenVINO runtime.
All unsupported operators fall back to the native PyTorch runtime on CPU. If a subgraph
fails during OpenVINO conversion, it falls back to PyTorch's default Inductor backend.

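The FX-graph extraction described above can be illustrated with plain ``torch.fx`` symbolic tracing. This is a simplified sketch; TorchDynamo's bytecode-level capture is more general, but it produces a comparable graph for a small module like this:

```python
import torch
import torch.fx

class Small(torch.nn.Module):
    def forward(self, x):
        # Two traceable ops: relu and add
        return torch.relu(x) + 1.0

graph_module = torch.fx.symbolic_trace(Small())
ops = [node.op for node in graph_module.graph.nodes]
print(ops)  # ['placeholder', 'call_function', 'call_function', 'output']
```

Each ``call_function`` node in such a graph is what the partitioning module inspects when deciding whether an operator can be handed to OpenVINO or must fall back to PyTorch.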
Additional Resources
############################

* `PyTorch 2.0 documentation <https://pytorch.org/docs/stable/index.html>`_

@endsphinxdirective
4
docs/_static/css/homepage_style.css
vendored
@@ -167,3 +167,7 @@ h1 {
max-width: 100%;
}
}

.sd-row {
--sd-gutter-x: 0rem!important;
}
3
docs/_static/images/torch_compile_backend_openvino.svg
vendored
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:0e26fe889ada0e02a3bbc03e451a7e1d4b06037723349971efff1d721b5e13f6
size 117253
3
docs/_static/images/torch_compile_backend_openvino_ts.svg
vendored
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:7c5ba73be918d2105b54e2afe36aa99e8fc5313489afc08f1a92ff75580667ba
size 173373
@@ -26,6 +26,7 @@ OpenVINO 2023.0
<li class="splide__slide">Even more integrations in 2023.0!<br>Load TensorFlow, TensorFlow Lite, and PyTorch models directly, without manual conversion.<br><a href="https://docs.openvino.ai/2023.0/Supported_Model_Formats.html">See the supported model formats...</a></li>
<li class="splide__slide">CPU inference has become even better. ARM processors are supported and thread scheduling is available on 12th gen Intel® Core and up.<br><a href="https://docs.openvino.ai/2023.0/openvino_docs_OV_UG_OV_Runtime_User_Guide.html">See how to run OpenVINO on various devices...</a></li>
<li class="splide__slide">Post-training optimization and quantization-aware training now in one tool!<br><a href="https://docs.openvino.ai/2023.0/openvino_docs_model_optimization_guide.html">See the new NNCF capabilities...</a></li>
<li class="splide__slide">OpenVINO is enabled in the PyTorch 2.0 torch.compile() backend.<br><a href="https://docs.openvino.ai/2023.0/pytorch_2_0_torch_compile.html">See how it works...</a></li>
</ul>
</div>
</section>
@@ -83,6 +84,13 @@ OpenVINO 2023.0

Reach for performance with post-training and training-time compression with NNCF

.. grid-item-card:: PyTorch 2.0 - torch.compile() backend
   :link: pytorch_2_0_torch_compile
   :link-alt: torch.compile
   :link-type: doc

   Optimize generation of the graph model with PyTorch 2.0 torch.compile() backend


Feature Overview
##############################