[DOC]: Added INT4 weight compression description (#20812)
* Added INT4 information into weight compression doc
* Added GPTQ info. Fixed comments
* Fixed list
* Fixed issues. Updated Gen.AI doc
* Applied comments
* Added additional information about GPTQ support
* Fixed typos
* Update docs/articles_en/openvino_workflow/gen_ai.md
  Co-authored-by: Nico Galoppo <nico.galoppo@intel.com>
* Update docs/articles_en/openvino_workflow/gen_ai.md
  Co-authored-by: Nico Galoppo <nico.galoppo@intel.com>
* Update docs/optimization_guide/nncf/code/weight_compression_openvino.py
  Co-authored-by: Nico Galoppo <nico.galoppo@intel.com>
* Applied changes
* Update docs/articles_en/openvino_workflow/gen_ai.md
  Co-authored-by: Tatiana Savina <tatiana.savina@intel.com>
* Update docs/articles_en/openvino_workflow/gen_ai.md
  Co-authored-by: Tatiana Savina <tatiana.savina@intel.com>
* Update docs/articles_en/openvino_workflow/gen_ai.md
  Co-authored-by: Tatiana Savina <tatiana.savina@intel.com>
* Update docs/articles_en/openvino_workflow/model_optimization_guide/weight_compression.md
  Co-authored-by: Tatiana Savina <tatiana.savina@intel.com>
* Update docs/articles_en/openvino_workflow/model_optimization_guide/weight_compression.md
  Co-authored-by: Tatiana Savina <tatiana.savina@intel.com>
* Update docs/articles_en/openvino_workflow/model_optimization_guide/weight_compression.md
  Co-authored-by: Tatiana Savina <tatiana.savina@intel.com>
* Update docs/articles_en/openvino_workflow/model_optimization_guide/weight_compression.md
  Co-authored-by: Tatiana Savina <tatiana.savina@intel.com>
* Update docs/articles_en/openvino_workflow/model_optimization_guide/weight_compression.md
  Co-authored-by: Tatiana Savina <tatiana.savina@intel.com>
* Added table with results
* One more comment

---------

Co-authored-by: Nico Galoppo <nico.galoppo@intel.com>
Co-authored-by: Tatiana Savina <tatiana.savina@intel.com>
@@ -3,4 +3,11 @@ from nncf import compress_weights
 
 ...
 model = compress_weights(model) # model is openvino.Model object
-#! [compression_8bit]
+#! [compression_8bit]
+
+#! [compression_4bit]
+from nncf import compress_weights, CompressWeightsMode
+
+...
+model = compress_weights(model, mode=CompressWeightsMode.INT4_SYM, group_size=128, ratio=0.8) # model is openvino.Model object
+#! [compression_4bit]
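
The `compression_4bit` snippet passes `mode=CompressWeightsMode.INT4_SYM` and `group_size=128`. To make those two parameters concrete, here is a minimal pure-Python sketch of group-wise symmetric 4-bit quantization, in which every group of `group_size` weights shares one scale. It is an illustration only, not NNCF's implementation: the helper names `quantize_int4_sym` and `dequantize_int4_sym` are hypothetical, and the signed range [-8, 7] with `scale = max_abs / 7` is one common convention assumed here, which may differ from what NNCF does internally. The `ratio` argument is not modelled; per the NNCF documentation it sets the share of layers compressed to INT4, with the remainder kept in INT8.

```python
def quantize_int4_sym(weights, group_size=128):
    """Quantize a flat list of floats to signed 4-bit integers, one scale per group.

    Hypothetical helper for illustration; not part of NNCF.
    """
    if len(weights) % group_size != 0:
        raise ValueError("len(weights) must be a multiple of group_size")
    quantized, scales = [], []
    for start in range(0, len(weights), group_size):
        group = weights[start:start + group_size]
        max_abs = max(abs(w) for w in group)
        # Symmetric scheme: zero maps to zero, and the largest magnitude maps to 7.
        scale = max_abs / 7 if max_abs > 0 else 1.0
        scales.append(scale)
        # Round to the nearest integer and clamp to the signed 4-bit range.
        quantized.extend(max(-8, min(7, round(w / scale))) for w in group)
    return quantized, scales


def dequantize_int4_sym(quantized, scales, group_size=128):
    """Reconstruct approximate float weights from integers and per-group scales."""
    return [q * scales[i // group_size] for i, q in enumerate(quantized)]


# Toy weights: 256 values, so group_size=128 yields two groups and two scales.
weights = [i / 100 - 1.28 for i in range(256)]
quantized, scales = quantize_int4_sym(weights, group_size=128)
restored = dequantize_int4_sym(quantized, scales, group_size=128)
max_error = max(abs(w - r) for w, r in zip(weights, restored))
```

In this scheme a smaller `group_size` gives each scale fewer weights to cover, which tends to lower the reconstruction error at the cost of storing more scales.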