[DOCS] Torch.compile() documentation for 23.1 (#19540)

Co-authored-by: Karol Blaszczak <karol.blaszczak@intel.com>
2023-09-04 08:38:29 +02:00
parent e701484571
commit 13e3f9921f
6 changed files with 176 additions and 0 deletions
--- a/docs/Documentation/openvino_workflow.md
+++ b/docs/Documentation/openvino_workflow.md
@@ -17,6 +17,7 @@
   Running Inference <openvino_docs_OV_UG_OV_Runtime_User_Guide>
   Deployment on a Local System  <openvino_deployment_guide>
   Deployment on a Model Server <ovms_what_is_openvino_model_server>
+   pytorch_2_0_torch_compile
   

 | :doc:`Model Preparation <openvino_docs_model_processing_introduction>`
--- a/docs/Documentation/torch_compile.md
+++ b/docs/Documentation/torch_compile.md
@@ -0,0 +1,157 @@
+# PyTorch Deployment via "torch.compile" {#pytorch_2_0_torch_compile}
+
+@sphinxdirective
+
+
+The ``torch.compile`` feature enables you to use OpenVINO for PyTorch-native applications. 
+It speeds up PyTorch code by JIT-compiling it into optimized kernels.
+By default, PyTorch code runs in eager mode, but with ``torch.compile`` it goes through the following steps:
+
+1. **Graph acquisition** - the model is rewritten as blocks of subgraphs that are either:
+
+   * compiled by TorchDynamo and "flattened", or
+   * left to run in eager mode because they contain unsupported Python constructs (such as control-flow code).
+
+2. **Graph lowering** - all PyTorch operations are decomposed into their constituent kernels specific to the chosen backend.
+3. **Graph compilation** - the kernels call their corresponding low-level device-specific operations.
+
+
+
+How to Use
+#################
+
+To use ``torch.compile``, import ``openvino.torch`` and select one of the two available backends:
+
+| ``openvino``
+|   With this backend, Torch FX subgraphs are converted directly to the OpenVINO representation, without any additional PyTorch-based tracing or scripting.
+
+| ``openvino_ts``
+|   With this backend, Torch FX subgraphs are first traced or scripted with TorchScript, and then converted to the OpenVINO representation.
+
+
+.. tab-set::
+
+   .. tab-item:: openvino
+      :sync: backend-openvino
+
+      .. code-block:: python
+
+         import openvino.torch 
+         ...
+         model = torch.compile(model, backend='openvino')
+
+      Execution diagram:
+
+      .. image:: _static/images/torch_compile_backend_openvino.svg
+         :width: 992px
+         :height: 720px
+         :scale: 60%
+         :align: center
+
+   .. tab-item:: openvino_ts
+      :sync: backend-openvino-ts
+
+      .. code-block:: python
+
+         import openvino.torch
+         ...
+         model = torch.compile(model, backend='openvino_ts')
+
+      Execution diagram:
+
+      .. image:: _static/images/torch_compile_backend_openvino_ts.svg
+         :width: 1088px
+         :height: 720px
+         :scale: 60%
+         :align: center
+
+
+Environment Variables
+++++++++++++++++++++++++++
+
+* **OPENVINO_TORCH_BACKEND_DEVICE**: selects the hardware device on which the application runs.
+  By default, the OpenVINO backend for ``torch.compile`` runs PyTorch applications on the CPU. Setting
+  this variable to ``GPU.0``, for example, makes the application use the integrated graphics processor instead.
+* **OPENVINO_TORCH_MODEL_CACHING**: enables saving the optimized model files to a hard drive after the first application run.
+  This makes them available to subsequent application runs, reducing the first-inference latency.
+  By default, this variable is set to ``False``. Setting it to ``True`` enables caching.
+* **OPENVINO_TORCH_CACHE_DIR**: defines a custom directory for the model files (if model caching is set to ``True``).
+  By default, the OpenVINO IR is saved in the ``cache`` sub-directory, created in the application's root directory.
+
+Windows support
++++++++++++++++++++++++++
+
+Currently, PyTorch does not officially support the ``torch.compile`` feature on Windows. However, it can be enabled
+by following the instructions below:
+
+1. Install the PyTorch nightly wheel file - `2.1.0.dev20230713 <https://download.pytorch.org/whl/nightly/cpu/torch-2.1.0.dev20230713%2Bcpu-cp38-cp38-win_amd64.whl>`__.
+2. Open the file at ``<python_env_root>/Lib/site-packages/torch/_dynamo/eval_frame.py``.
+3. Find the function called ``check_if_dynamo_supported()``:
+
+   .. code-block:: python
+
+      def check_if_dynamo_supported():
+          if sys.platform == "win32":
+              raise RuntimeError("Windows not yet supported for torch.compile")
+          if sys.version_info >= (3, 11):
+              raise RuntimeError("Python 3.11+ not yet supported for torch.compile")
+
+4. Comment out the first two lines in this function, so it looks like this:
+
+   .. code-block:: python
+
+      def check_if_dynamo_supported():
+          # if sys.platform == "win32":
+          #     raise RuntimeError("Windows not yet supported for torch.compile")
+          if sys.version_info >= (3, 11):
+              raise RuntimeError("Python 3.11+ not yet supported for torch.compile")
+
+
+Support for Automatic1111 Stable Diffusion WebUI
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
+
+Automatic1111 Stable Diffusion WebUI is an open-source repository that hosts a browser-based interface for
+Stable Diffusion-based image generation. It allows users to create realistic and creative images from text prompts.
+Stable Diffusion WebUI is supported on Intel CPUs, Intel integrated GPUs, and Intel discrete GPUs by leveraging the OpenVINO
+``torch.compile`` capability. Detailed instructions are available in the
+`Stable Diffusion WebUI repository <https://github.com/openvinotoolkit/stable-diffusion-webui/wiki/Installation-on-Intel-Silicon>`__.
+
+
+Architecture
+#################
+
+The ``torch.compile`` feature is part of PyTorch 2.0, and is based on:
+
+* **TorchDynamo** - a Python-level JIT that hooks into the frame evaluation API in CPython
+  (PEP 523) to dynamically modify Python bytecode right before it is executed (PyTorch operators
+  that cannot be extracted to an FX graph are executed in the native Python environment).
+  It maintains the eager-mode capabilities using 
+  `Guards <https://pytorch.org/docs/stable/dynamo/guards-overview.html>`__ to ensure the 
+  generated graphs are valid.
+
+* **AOTAutograd** - generates the backward graph corresponding to the forward graph captured by TorchDynamo.
+* **PrimTorch** - decomposes complicated PyTorch operations into simpler and more elementary ops.
+* **TorchInductor** - a deep learning compiler that generates fast code for multiple accelerators and backends.
+
+
+
+
+When a PyTorch module is wrapped with ``torch.compile``, TorchDynamo traces the module and
+rewrites Python bytecode to extract sequences of PyTorch operations into an FX graph,
+which can be optimized by the OpenVINO backend. The Torch FX graphs are first converted to
+inlined FX graphs, and the graph partitioning module traverses each inlined FX graph to identify
+the operators supported by OpenVINO.
+
+All the supported operators are clustered into OpenVINO submodules, converted to the OpenVINO
+graph using OpenVINO's PyTorch decoder, and executed in an optimized manner using OpenVINO Runtime.
+All unsupported operators fall back to the native PyTorch runtime on CPU. If a subgraph
+fails during OpenVINO conversion, it falls back to PyTorch's default Inductor backend.
+
+
+
+Additional Resources
+############################
+
+* `PyTorch 2.0 documentation <https://pytorch.org/docs/stable/index.html>`_
+
+@endsphinxdirective
--- a/docs/_static/css/homepage_style.css
+++ b/docs/_static/css/homepage_style.css
@@ -167,3 +167,7 @@ h1 {
        max-width: 100%;
    }
 }
+
+.sd-row {
+    --sd-gutter-x: 0rem!important;
+}
--- a/docs/_static/images/torch_compile_backend_openvino.svg
+++ b/docs/_static/images/torch_compile_backend_openvino.svg
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:0e26fe889ada0e02a3bbc03e451a7e1d4b06037723349971efff1d721b5e13f6
+size 117253
--- a/docs/_static/images/torch_compile_backend_openvino_ts.svg
+++ b/docs/_static/images/torch_compile_backend_openvino_ts.svg
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:7c5ba73be918d2105b54e2afe36aa99e8fc5313489afc08f1a92ff75580667ba
+size 173373
--- a/docs/home.rst
+++ b/docs/home.rst
@@ -26,6 +26,7 @@ OpenVINO 2023.0
 			         <li class="splide__slide">Even more integrations in 2023.0!<br>Load TensorFlow, TensorFlow Lite, and PyTorch models directly, without manual conversion.<br><a href="https://docs.openvino.ai/2023.0/Supported_Model_Formats.html">See the supported model formats...</a></li>
         			<li class="splide__slide">CPU inference has become even better. ARM processors are supported and thread scheduling is available on 12th gen Intel® Core and up.<br><a href="https://docs.openvino.ai/2023.0/openvino_docs_OV_UG_OV_Runtime_User_Guide.html">See how to run OpenVINO on various devices...</a></li>
         			<li class="splide__slide">Post-training optimization and quantization-aware training now in one tool!<br><a href="https://docs.openvino.ai/2023.0/openvino_docs_model_optimization_guide.html">See the new NNCF capabilities...</a></li>
+                  <li class="splide__slide">OpenVINO is enabled in the PyTorch 2.0 torch.compile() backend.<br><a href="https://docs.openvino.ai/2023.0/pytorch_2_0_torch_compile.html">See how it works...</a></li>
         		</ul>
           </div>
         </section>
@@ -83,6 +84,13 @@ OpenVINO 2023.0

      Reach for performance with post-training and training-time compression with NNCF

+   .. grid-item-card:: PyTorch 2.0 - torch.compile() backend
+      :link: pytorch_2_0_torch_compile
+      :link-alt: torch.compile 
+      :link-type: doc
+
+      Optimize generation of the graph model with the PyTorch 2.0 torch.compile() backend
+

 Feature Overview
 ##############################