# Migrate From Triton

[NVIDIA Triton Inference Server (Triton)](https://docs.nvidia.com/deeplearning/triton-inference-server/user-guide/docs/index.html) is an inference serving system from NVIDIA. Open Edge Platform's alternative for deploying a production-grade AI inference service is [OpenVINO Model Server (OVMS)](https://docs.openvino.ai/2025/model-server/ovms_what_is_openvino_model_server.html).

## Comparison Summary

| Feature | **NVIDIA Triton** | **Intel OVMS** |
| ------------------------------- | ----------------------------------------- | -------------------------------------------- |
| **Best hardware** | NVIDIA GPUs | Intel CPUs, iGPUs, GPUs, NPUs, VPUs |
| **Core engine** | TensorRT, CUDA | OpenVINO toolkit |
| **Frameworks supported** | TensorFlow, PyTorch, ONNX, TensorRT, etc. | TensorFlow, ONNX (via OpenVINO) |
| **Interfaces** | HTTP/gRPC | HTTP/gRPC (TensorFlow Serving API) |
| **Batching & optimization** | Yes (dynamic, GPU-accelerated) | Yes (CPU/GPU optimized) |
| **Metrics & observability** | Prometheus, logs | Prometheus, logs |
| **Containerized deployments** | Yes (NGC) | Yes (DockerHub) |

If you want your serving to benefit from Intel architecture, you have several options:

:::{line-block}
**Deploy models on Triton Inference Server using the OpenVINO backend.**
Check out the NVIDIA guide on [how to use Triton with the OpenVINO backend](https://docs.nvidia.com/deeplearning/triton-inference-server/user-guide/docs/tutorials/Quick_Deploy/OpenVINO/README.html).

**Migrate from Triton to OVMS.**
To start, make sure your models use [model formats supported by OVMS](https://docs.openvino.ai/2025/model-server/ovms_docs_models_repository_classic.html). If not, you will need to [convert them](./convert-models.md) first. Once the models are ready, [organize your model repository](https://docs.openvino.ai/2025/model-server/ovms_docs_models_repository.html) to mirror Triton's repository layout.
Configure each model with input/output definitions and versioning as required, then [deploy OVMS](https://docs.openvino.ai/2025/model-server/ovms_docs_deploying_server.html). Once the server loads the repository and exposes the REST and gRPC endpoints, the models become available to your existing pipeline. Because OVMS supports Triton's client APIs, existing clients should work without modification; however, features such as shared memory interfaces or custom backends may need adjustments. To validate the migration, run a comparative test of the new solution's accuracy and performance metrics.
:::
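As a sketch of the repository step above: both servers use a versioned `<repository>/<model>/<version>/` layout, so reorganizing mostly means replacing Triton's per-model `config.pbtxt` and backend-specific files with the model files OVMS can load. The model name `resnet50` and the file names below are hypothetical placeholders for your own models.

```python
import os

# Sketch: build an OVMS-style model repository (model name "resnet50" is hypothetical).
# OVMS expects the same versioned layout Triton uses:
#   <repository>/<model_name>/<version>/<model files>
# but the version directory holds OpenVINO IR (.xml/.bin) or ONNX files
# rather than a Triton config.pbtxt.
repo = "models"
os.makedirs(os.path.join(repo, "resnet50", "1"), exist_ok=True)

# Place your converted model files in the version directory, e.g.:
#   models/resnet50/1/model.xml
#   models/resnet50/1/model.bin
for root, _dirs, _files in os.walk(repo):
    print(root)

# A typical single-model OVMS launch (not executed here) would then look like:
#   docker run -d -p 9000:9000 -v $(pwd)/models:/models openvino/model_server:latest \
#       --model_name resnet50 --model_path /models/resnet50 --rest_port 9000
```

For a multi-model deployment, OVMS reads a `config.json` at the repository root instead of per-model launch flags; see the model repository documentation linked above.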
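To illustrate the client-compatibility point above: a request an existing Triton client sends over the KServe-style REST endpoint can target OVMS unchanged. The snippet below only builds the request body and endpoint URL with the standard library; the model name, input name, tensor shape, and REST port (9000) are assumptions to adapt to your deployment.

```python
import json

# Sketch: a KServe-style v2 REST inference request, as an existing Triton
# client would send it. All names and shapes below are hypothetical.
payload = {
    "inputs": [
        {
            "name": "input_1",
            "shape": [1, 3, 224, 224],
            "datatype": "FP32",
            "data": [0.0] * (3 * 224 * 224),  # dummy zero-filled image tensor
        }
    ]
}
body = json.dumps(payload)

# The endpoint path is the same one Triton clients already use, so only the
# host/port needs to point at the OVMS instance (REST port 9000 assumed here):
url = "http://localhost:9000/v2/models/resnet50/infer"
print(url)
print(json.loads(body)["inputs"][0]["shape"])
```

In practice you would POST `body` to `url` (or keep using your existing `tritonclient`-based code) and compare the returned outputs against the Triton deployment as part of the accuracy and performance comparison described above.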