Migrate from Model Registry to Model Download

Model Download replaces Model Registry, which will be deprecated soon. Intel suggests the following migration approach, depending on your needs:

| Category | Model Registry | Model Download | Migration Approach |
| --- | --- | --- | --- |
| Core Role | Model management system | Runtime model acquisition and preparation | Core usage shifts from model management to runtime fetching and model preparation before application startup. |
| Primary Purpose | Storage, version control, and model management | Fetches models, converts to OpenVINO™ Intermediate Representation (IR) format, optimizes via precision reduction and hardware tuning, and stores the models. | Replace registry storage with direct model pulls from external sources; no extra conversion or optimization steps needed. |
| Onboarding Process | Downloads models, compresses the packages, and uploads to the registry | No onboarding required; directly pulls models from external sources via API | Remove the manual onboarding flow; configure the model source during setup and use the pull API. |
| Model Sources | Only models that were uploaded to the registry | All supported models from multiple model hubs: Hugging Face / Ollama / Geti™ software / Ultralytics | Update model references to point to the source instead of the registry by enabling the required source plugins during setup and passing the appropriate model hub to the download API. |
| Storage Type | Centralized metadata database and object storage | Local filesystem storage or PersistentVolumeClaim (PVC) | Update applications to read models from the local filesystem path managed by Model Download. In Docker deployments, this path is typically mounted as a volume to persist downloaded models across container restarts. In Helm / Kubernetes deployments, this is configured using PersistentVolumeClaims (PVCs) to retain models across pod restarts and avoid redundant downloads. Shared PVCs are used between Model Download and dependent applications to enable direct access to downloaded models. |
| Metadata Storage | Stored in separate databases | Encoded in model path (name / device / precision) | No metadata management overhead, as most of the metadata details are encoded in the model path. If needed, manage metadata externally (e.g., with MLOps tools or config files). |
| Persistence | Strong centralized persistence | Persistent shared storage (host volume / PVC) | No change needed. In Docker deployments, models remain in local storage on the host machine; in Kubernetes, they are stored in a PVC until manual deletion. The app is lightweight and sufficient for runtime use. |
| Infrastructure Overhead | High: registry service, database, storage | Low: single service, local storage | Replace registry components with a single Model Download service, simplifying architecture and reducing maintenance. |
| Metadata Updates | Supported (score, format, etc.) | Not supported | Avoid continuous metadata maintenance. Use external systems or tools if needed. |
| Versioning | Mandatory and enforced | Not enforced | Reduce complexity for dynamic workloads with models pulled directly from hubs. If needed, fetch specific versions via version tags or identifiers. Use external tools if version management is required. |
| Conversion Support | Not supported | Automatic conversion to OpenVINO™ format | Enable the OpenVINO™ plugin during setup and configure the required fields based on the parameters provided via the download API. |
| Precision Support | Not applicable | All OpenVINO™ toolkit-supported precisions: INT4 / INT8 / FP16 / FP32 | Specify precision in the download API configuration, if needed. |
| Device Targeting | Not supported | All OpenVINO™ toolkit-supported devices: CPU / GPU / NPU | Configure the target device in the download API configuration, if needed. |
| Parallel Downloads | Not supported | Parallel downloads of multiple models, leading to faster startup when multiple models are required | Enable the parallel download flag in the Model Download API configuration. |
| Caching | No runtime caching | Configurable local caching: reuses existing models and skips re-download if models already exist | Specify the model download path during setup; no further configuration needed. |
| API Style | CRUD-heavy: upload, list, and delete models, update metadata | Minimal pull-based API with Optimum CLI compliance | Replace registry APIs with pull APIs to download models from the source at runtime. Optimum CLI compliance support enables the use of OpenVINO backend-compatible parameters for model export, compilation, and quantization. |
| Model Listing | From the registry database | From the local filesystem | Replace registry dependencies with Model Download GET APIs. |
| Geti Integration | Import, store, and download | Direct pull from Geti™ software | Configure Geti™ details during setup and use the pull API; the Geti™ plugin handles integration. |
| Upload Models | Supported | Not supported | Not required; Model Download removes registry upload workflows and ensures model accessibility via the source. |
| Delete Models | Supported | Not supported | Delete downloaded models locally (manually or via cleanup scripts). Deletion of models at the hub source is not supported. |
| Runtime Dependency | Not required | Mandatory before application startup | Ensure Model Download is deployed and ready before dependent services start. |
| Startup Dependency | None | Model Download must be available before dependent services start | Use the API to check download job status and ensure completion before application startup. |
| Model Location | Stored in registry | Stored in local download path, ensuring fast local access | Update model paths in application configuration. |
| Operational Overhead | High: manage registry service, metadata database storage, model lifecycle; deployment, monitoring, debugging, and scaling | Low: single service, local storage only; fewer components to manage, reduced operational effort | No additional action required. |
| Scalability | Limited: central registry bottleneck, storage pressure with an increased number of models | Flexible: decentralized, independent downloads, local caching | No additional changes required. Model Download uses a decentralized approach in which each service manages models independently, scaling naturally. |

Conclusion:

Model Registry provides centralized storage, metadata management, and versioning, while Model Download focuses on runtime model handling through direct fetching, conversion, optimization, and local caching.
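
To make the pull-based flow more concrete, the sketch below assembles a download request that names the model hub, an optional precision and device, and a parallel download flag. The field names (`models`, `name`, `hub`, `precision`, `device`, `parallel_downloads`) and the model identifiers are assumptions chosen for illustration, not the documented Model Download schema; consult the service's API reference for the real request format.

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional


# NOTE: every field name here is a hypothetical placeholder, not the
# documented Model Download request schema.
@dataclass
class ModelRequest:
    name: str                        # model identifier at the hub
    hub: str                         # e.g. "huggingface" or "ollama"
    precision: Optional[str] = None  # e.g. "int8"; left out when None
    device: Optional[str] = None     # e.g. "CPU"; left out when None


def build_download_payload(models, parallel=False):
    """Return a JSON-serializable dict describing one download job."""
    entries = []
    for model in models:
        # Drop unset optional fields so the service can apply its defaults.
        entries.append({k: v for k, v in asdict(model).items() if v is not None})
    return {"models": entries, "parallel_downloads": parallel}


payload = build_download_payload(
    [
        ModelRequest("org/model-a", "huggingface", precision="int8", device="CPU"),
        ModelRequest("model-b", "ollama"),
    ],
    parallel=True,
)
print(json.dumps(payload, indent=2))
```

Keeping precision and device optional mirrors the "if needed" wording in the table: a request that omits them simply carries fewer keys.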

As part of this transition, registry-based workflows (upload, metadata management, and versioning) are no longer required. Basic metadata is encoded in the model download path. If you still need registry-style workflows, handle them with external tools.
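
Because metadata is carried by the path rather than a database, an application can recover it with plain path handling. The following is a minimal sketch that assumes a layout of `<root>/<name>/<device>/<precision>`, where the name may itself contain slashes; the exact directory layout used by Model Download may differ, and `parse_model_path` is a hypothetical helper.

```python
from pathlib import PurePosixPath
from typing import NamedTuple


class ModelPathInfo(NamedTuple):
    name: str
    device: str
    precision: str


def parse_model_path(path: str, root: str) -> ModelPathInfo:
    """Split <root>/<name...>/<device>/<precision> into its parts.

    The layout is an assumption for illustration only.
    """
    # relative_to raises ValueError if path is not located under root.
    rel = PurePosixPath(path).relative_to(PurePosixPath(root))
    if len(rel.parts) < 3:
        raise ValueError(f"expected <name>/<device>/<precision>, got {rel}")
    *name_parts, device, precision = rel.parts
    return ModelPathInfo("/".join(name_parts), device, precision)


info = parse_model_path("/models/org/model-a/CPU/int8", "/models")
print(info.name, info.device, info.precision)
```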

Model access shifts from centralized storage to source-based retrieval and local filesystem storage. Update applications to read models from the local filesystem path managed by Model Download. In Docker deployments, this path is mounted as a volume so that models persist across restarts. In Kubernetes deployments, PersistentVolumeClaims (PVCs) are used, often shared between Model Download and dependent applications for direct access and reuse.
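
On the application side, reading from the shared volume can be as simple as scanning the mounted directory for converted models. This sketch looks for OpenVINO IR pairs (an `.xml` file with a matching `.bin` file) under a root directory. The `MODELS_DIR` environment variable and the `/models` default are assumptions for illustration; use whatever mount path your deployment configures.

```python
import os
from pathlib import Path


def find_ir_models(root):
    """Return sorted directories under root that hold an IR .xml/.bin pair."""
    root = Path(root)
    if not root.is_dir():
        # Volume not mounted yet or nothing downloaded: report no models.
        return []
    return sorted(
        {
            xml_file.parent
            for xml_file in root.rglob("*.xml")
            if xml_file.with_suffix(".bin").is_file()
        }
    )


# MODELS_DIR is a hypothetical variable name; "/models" is a placeholder path.
models_root = os.environ.get("MODELS_DIR", "/models")
for model_dir in find_ir_models(models_root):
    print(model_dir)
```

Requiring both files avoids picking up unrelated XML files or a conversion that was interrupted before the weights were written.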

Model Download becomes a mandatory runtime dependency to ensure models are available and ready before application startup.
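
One way to honor this dependency is to block application startup until the download job reports completion. The sketch below polls a caller-supplied status function with a timeout. The status strings (`"completed"`, `"failed"`) and the `wait_for_models` helper are assumptions; in a real deployment `get_status` would wrap a call to the Model Download job status API, whose endpoint and response format are not shown here.

```python
import time
from typing import Callable


def wait_for_models(
    get_status: Callable[[], str],
    timeout_s: float = 600.0,
    poll_s: float = 5.0,
) -> None:
    """Poll get_status until it reports completion, failure, or timeout.

    The "completed" and "failed" values are hypothetical status strings.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        status = get_status()
        if status == "completed":
            return
        if status == "failed":
            raise RuntimeError("model download failed")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"models not ready after {timeout_s} s (last status: {status})")
        time.sleep(poll_s)


# Stand-in for a real status call: a canned sequence of job states.
fake_statuses = iter(["queued", "downloading", "completed"])
wait_for_models(lambda: next(fake_statuses), timeout_s=10, poll_s=0)
print("models ready")
```

Passing the status check in as a callable keeps the wait logic independent of any particular HTTP client and makes it easy to exercise without a running service.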

Note: Currently, Model Download provides Helm charts for Kubernetes deployments; however, a separate deployment package is not yet available for Edge Manageability Framework. As a result, Model Download is integrated into the application-level deployment package. Intel will create a dedicated deployment package for Model Download.