# Post-training Optimization Tool Frequently Asked Questions
If your question is not covered below, use the OpenVINO™ Community Forum page, where you can participate freely.
- Is the Post-training Optimization Tool open-sourced?
- Can I quantize my model without a dataset?
- Can a model in any framework be quantized by the POT?
- What is the tradeoff when you go to low precision?
- I'd like to quantize a model and I've converted it to IR but I don't have the Accuracy Checker config. What can I do?
- I tried all recommendations from "Post-Training Optimization Best Practices" but either have a high accuracy drop or bad performance after quantization. What else can I do?
- I get “RuntimeError: Cannot get memory” and “RuntimeError: Output data was not allocated” when I quantize my model by the POT.
- I have successfully quantized my model with a low accuracy drop and improved performance but the output video generated from the low precision model is much worse than from the full precision model. What could be the root cause?
- The quantization process of my model takes a lot of time. Can it be decreased somehow?
- I get "Import Error:... No such file or directory". How can I avoid it?
- When I execute POT CLI, I get "File "/workspace/venv/lib/python3.6/site-packages/nevergrad/optimization/base.py", line 35... SyntaxError: invalid syntax". What is wrong?
- What does a message "ModuleNotFoundError: No module named 'some_module_name'" mean?
- Is there a way to collect an intermediate IR when the AccuracyAware mechanism fails?
## Is the Post-training Optimization Tool (POT) open-sourced?

Yes, POT is developed on GitHub as a part of https://github.com/openvinotoolkit/openvino under the Apache-2.0 License.
## Can I quantize my model without a dataset?
In general, you should have a dataset. The dataset should be annotated if you want to validate the accuracy. If your dataset is not annotated, you can still quantize the model in the Simplified mode but you will not be able to measure the accuracy. See Post-Training Optimization Best Practices for more details. You can also use POT API to integrate the post-training quantization into the custom inference pipeline.
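For reference, a POT configuration for the Simplified mode might look like the sketch below. The model name, file paths, and `data_source` directory are placeholders, and the exact set of supported engine options may differ between POT versions, so check the configuration specification for your release:

```json
{
    "model": {
        "model_name": "my_model",
        "model": "my_model.xml",
        "weights": "my_model.bin"
    },
    "engine": {
        "type": "simplified",
        "data_source": "path/to/unannotated/images"
    },
    "compression": {
        "algorithms": [
            {
                "name": "DefaultQuantization",
                "params": {
                    "preset": "performance",
                    "stat_subset_size": 300
                }
            }
        ]
    }
}
```

In this mode, the POT collects activation statistics directly from the raw images, which is why no annotations are needed and no accuracy can be measured.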
## Can a model in any framework be quantized by the POT?
The POT accepts only models in the OpenVINO™ Intermediate Representation (IR) format. To obtain one, convert your model to the IR format using the [Model Optimizer](@ref openvino_docs_MO_DG_Deep_Learning_Model_Optimizer_DevGuide).
## I'd like to quantize a model and I've converted it to IR but I don't have the Accuracy Checker config. What can I do?
To create the Accuracy Checker configuration file, refer to [Accuracy Checker documentation](@ref omz_tools_accuracy_checker) and try to find the configuration file for your model among the ones available in the Accuracy Checker examples. An alternative way is to quantize the model in the Simplified mode but you will not be able to measure the accuracy. See Post-Training Optimization Best Practices for more details. Also, you can use POT API to integrate the post-training quantization into your pipeline without the Accuracy Checker.
## What is the tradeoff when you go to low precision?
The tradeoff is between accuracy and performance. A model in low precision usually runs faster than the same model in full precision, but its accuracy might be worse. You can find benchmarking results in [INT8 vs FP32 Comparison on Select Networks and Platforms](@ref openvino_docs_performance_int8_vs_fp32). The other benefit of having a model in low precision is its smaller size.
## I tried all recommendations from "Post-Training Optimization Best Practices" but either have a high accuracy drop or bad performance after quantization. What else can I do?
First of all, you should validate the POT compression pipeline you are running, which can be done with the following steps:
- Make sure the accuracy of the original uncompressed model has the value you expect. Run your POT pipeline with an empty compression config and evaluate the resulting model metric. Compare this uncompressed model accuracy metric value with your reference.
- Run your compression pipeline with a single compression algorithm (DefaultQuantization or AccuracyAwareQuantization) without any parameter values specified in the config (except for `preset` and `stat_subset_size`). Check whether you still observe the undesirable accuracy drop or insufficient performance gain in this case.
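Under these assumptions, the single-algorithm configuration from the second step might look like the following sketch, where the `preset` and `stat_subset_size` values are illustrative placeholders:

```json
{
    "compression": {
        "algorithms": [
            {
                "name": "DefaultQuantization",
                "params": {
                    "preset": "performance",
                    "stat_subset_size": 300
                }
            }
        ]
    }
}
```

Leaving all other parameters at their defaults isolates the effect of the algorithm itself from any custom tuning.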
Finally, if you have done the steps above and the problem persists, you could try to compress your model using the Neural Network Compression Framework (NNCF). Note that NNCF usage requires you to have a PyTorch-based training pipeline of your model in order to perform compression-aware fine-tuning. See Low Precision Optimization Guide for more details.
## I get “RuntimeError: Cannot get memory” and “RuntimeError: Output data was not allocated” when I quantize my model by the POT.
These issues happen due to an insufficient amount of available memory for statistics collection during the quantization of a large model, or due to a very high resolution of the input images in the quantization dataset. If you cannot increase the available RAM, one of the following options can help:
- Set the `inplace_statistics` parameter to "True". In that case, the POT will change the statistics collection method and use less memory. Note that such a change might increase the time required for quantization.
- Set the `eval_requests_number` and `stat_requests_number` parameters to 1. In that case, the POT will limit the number of inference requests to 1 and use less memory. Note that such a change might increase the time required for quantization.
- Set the `use_fast_bias` parameter to `false`. In that case, the POT will switch from the FastBiasCorrection algorithm to the full BiasCorrection algorithm, which is usually more accurate and takes more time but requires less memory. See Post-Training Optimization Best Practices for more details.
- Reshape your model to a lower resolution and resize the images in the dataset accordingly. Note that such a change might impact the accuracy.
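As a hedged sketch, the memory-saving options above could be combined in a configuration like the one below. The section each parameter belongs to may vary between POT versions (placing `stat_requests_number` and `eval_requests_number` in the `engine` section and `inplace_statistics` at the top level of the `compression` section is an assumption), so verify against the configuration specification for your release:

```json
{
    "engine": {
        "stat_requests_number": 1,
        "eval_requests_number": 1
    },
    "compression": {
        "inplace_statistics": true,
        "algorithms": [
            {
                "name": "DefaultQuantization",
                "params": {
                    "use_fast_bias": false
                }
            }
        ]
    }
}
```

Apply the options one at a time to see which is sufficient for your model, since each of them trades quantization time for memory.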
## I have successfully quantized my model with a low accuracy drop and improved performance but the output video generated from the low precision model is much worse than from the full precision model. What could be the root cause?
It can happen due to the following reasons:
- A wrong or unrepresentative dataset was used during quantization and accuracy validation. Make sure that your data and labels are correct and that they sufficiently reflect the use case.
- A wrong Accuracy Checker configuration file was used during the quantization. Refer to [Accuracy Checker documentation](@ref omz_tools_accuracy_checker) for more information.
## The quantization process of my model takes a lot of time. Can it be decreased somehow?
Quantization time depends on multiple factors such as the size of the model and the dataset. It also depends on the algorithm: the DefaultQuantization algorithm takes less time than the AccuracyAwareQuantization algorithm. The following configuration parameters also impact the quantization time (see details in Post-Training Optimization Best Practices):

- `use_fast_bias`: when set to `false`, it increases the quantization time
- `stat_subset_size`: the higher the value of this parameter, the more time is required for quantization
- `tune_hyperparams`: when set to `true` while the AccuracyAwareQuantization algorithm is used, it increases the quantization time
- `stat_requests_number`: the lower the number, the more time might be required for quantization
- `eval_requests_number`: the lower the number, the more time might be required for quantization

Note that higher values of `stat_requests_number` and `eval_requests_number` increase memory consumption by the POT.
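To illustrate where these knobs appear, below is a sketch of a configuration tuned for shorter quantization time. The values are placeholders and the exact placement of each parameter should be verified against the POT configuration specification for your version:

```json
{
    "compression": {
        "algorithms": [
            {
                "name": "AccuracyAwareQuantization",
                "params": {
                    "stat_subset_size": 300,
                    "tune_hyperparams": false,
                    "use_fast_bias": true
                }
            }
        ]
    }
}
```

Lowering `stat_subset_size` reduces statistics-collection time at the possible cost of less accurate quantization ranges.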
## I get "Import Error:... No such file or directory". How can I avoid it?
This happens when a required library is not available in your environment. To avoid it, execute the following command:
`source <INSTALL_DIR>/bin/setupvars.sh`

where `<INSTALL_DIR>` is the directory where the OpenVINO™ toolkit is installed.
## When I execute POT CLI, I get "File "/workspace/venv/lib/python3.6/site-packages/nevergrad/optimization/base.py", line 35... SyntaxError: invalid syntax". What is wrong?
This error is reported when the Python version in your environment is older than 3.6. Upgrade your Python version. Refer to the prerequisites on the Post-Training Optimization Tool page for more details.
## What does a message "ModuleNotFoundError: No module named 'some_module_name'" mean?
It means that a required Python module is not installed in your environment. To install it, run `pip install some_module_name`.
## Is there a way to collect an intermediate IR when the AccuracyAware mechanism fails?

You can add `"dump_intermediate_model": true` to the POT configuration file and it will dump an intermediate IR to the `accuracy_aware_intermediate` folder.
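For illustration, such a configuration fragment might look like the sketch below; placing `dump_intermediate_model` at the top level of the `compression` section is an assumption to verify against your POT version:

```json
{
    "compression": {
        "dump_intermediate_model": true,
        "algorithms": [
            {
                "name": "AccuracyAwareQuantization",
                "params": {
                    "stat_subset_size": 300
                }
            }
        ]
    }
}
```

With this flag set, you can inspect or benchmark the intermediate IR even when the accuracy-aware search does not converge to a final model.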