# Model Optimization

Original deep learning model architectures tend to be large and complex. In many cases, a smaller, simplified version does the job equally well while performing much better: an 8-bit model uses a fraction of the memory required by an FP64 one. That is why you should always use a model optimized for your use case. To do so, use the [Neural Network Compression Framework (NNCF)](https://docs.openvino.ai/nncf), a collection of optimization algorithms that make your models smaller and faster. To learn more, check out the NNCF documentation and the articles on:

:::{line-block}
[Quantization (no retraining)](https://docs.openvino.ai/2025/openvino-workflow/model-optimization-guide/quantizing-models-post-training.html)
The easiest way to optimize a model: it does not require retraining or fine-tuning, it just reduces the model size. Going from an FP64-based model to a quantized INT8 one greatly improves file size, memory footprint, throughput, and latency. It may result in a drop in accuracy, though, so check whether the accuracy-performance tradeoff is acceptable.

[Weight Compression](https://docs.openvino.ai/2025/openvino-workflow/model-optimization-guide/weight-compression.html)
An easy-to-use method targeting Large Language Models. It is a type of quantization that compresses only part of the model: its weights, not its activations. It provides increased performance with relatively little impact on accuracy.

[Training-time Optimization](https://docs.openvino.ai/2025/openvino-workflow/model-optimization-guide/compressing-models-during-training.html)
A more complex and time-consuming method involving multiple algorithms executed while the model is retrained. It also requires the model's original framework; for NNCF, that is either PyTorch or TensorFlow.
With features such as Structured or Unstructured Pruning and Quantization-aware Training, it gives you a model that fits your needs, optimally balancing performance and accuracy.
:::
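To see what post-training quantization does at the level of individual values, here is a minimal sketch of uniform symmetric INT8 quantization in plain Python. It is an illustrative toy, not NNCF's implementation (NNCF operates on whole models, e.g. via `nncf.quantize` with a calibration dataset); the weight values are made up:

```python
def quantize_int8(values):
    """Map floats to int8 using one per-tensor scale (symmetric scheme)."""
    scale = max(abs(v) for v in values) / 127 or 1.0
    return [max(-128, min(127, round(v / scale))) for v in values], scale

def dequantize(quantized, scale):
    """Recover approximate float values from the int8 codes."""
    return [q * scale for q in quantized]

weights = [0.8, -1.27, 0.05, 0.0]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)

# Each value now fits in one byte instead of eight (FP64), at the cost
# of a small rounding error per weight -- the accuracy/size tradeoff
# mentioned above.
error = max(abs(a - b) for a, b in zip(weights, restored))
```

The "drop in accuracy" comes entirely from that rounding step: every weight is snapped to one of 256 grid points, so the finer structure of the original values is lost.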
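The weights-only idea behind Weight Compression can be sketched the same way: the stored weights are 8-bit integers plus a scale, and they are dequantized on the fly while the activations stay in floating point. This is a hand-rolled illustration, not NNCF's `nncf.compress_weights` (which uses per-group scales and more elaborate schemes):

```python
def compress_weights_int8(weights):
    """Store weights as int8 codes plus one float scale."""
    scale = max(abs(w) for w in weights) / 127 or 1.0
    return [round(w / scale) for w in weights], scale

def dot_compressed(q_weights, scale, activations):
    """Dot product: weights are dequantized on the fly,
    activations remain full-precision floats."""
    return sum((q * scale) * a for q, a in zip(q_weights, activations))

w = [0.4, -0.2, 1.27]
q, s = compress_weights_int8(w)
y = dot_compressed(q, s, [1.0, 2.0, 3.0])
```

Because activations are never quantized, only the stored weights lose precision, which is why this method tends to cost less accuracy than full quantization while still shrinking the model (weights dominate an LLM's size).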
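Quantization-aware Training, one of the training-time methods above, relies on "fake quantization": during the forward pass, weights are rounded to the INT8 grid and immediately dequantized, so the training loss already reflects the rounding error the deployed model will see, and retraining can compensate for it. A minimal sketch, assuming a fixed per-tensor scale (real QAT learns or calibrates the scale and backpropagates through the rounding with a straight-through estimator):

```python
def fake_quantize(values, scale):
    """Round values to the INT8 grid, then return them as floats again,
    so downstream computation feels the quantization error."""
    return [max(-128, min(127, round(v / scale))) * scale for v in values]

w = [0.814, -0.03]
fq = fake_quantize(w, 0.01)  # values snapped to the 0.01 grid
```

During training, the float weights `w` keep being updated, while every forward pass uses their snapped copies `fq`; at export time only the integer codes are kept.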