Video Search

The Video Search application is built on a modular microservices architecture using the LangChain framework.

Figure 1: Video Search mode system architecture

Pipeline Components

The following are the Video Search pipeline’s components:

  • Video Search UI: You can use the reference UI to interact with the Video Search sample application and submit queries. You can mark a query to run in the background against the current video corpus or all incoming videos.

  • Visual Data Prep. microservice: The sample Visual Data Prep. microservice allows ingestion of video from the object store. The ingestion process creates embeddings of the videos and stores them in the preferred vector database. The modular architecture allows you to customize the vector database. The sample application uses the Visual Data Management System (VDMS) database. The raw videos are stored in the MinIO object store, which is also customizable.

  • Video Search backend microservice: The Video Search backend microservice embeds user queries and generates responses based on the search results. The vector database is queried for the best matches in the embedding space.

  • Embedding inference microservice: The OpenVINO™ toolkit-based microservice runs embedding models on the target Intel® hardware.

  • Reranking inference microservice: Although available as an option, the reranker is not currently used in the pipeline. The OpenVINO™ model server runs the reranker models.

Note: Although the reranker is shown in the figure, support for the reranker depends on the vector database used. The default Video Search pipeline uses the VDMS vector database, where there is no support for the reranker. See details on the system architecture below.

Detailed Architecture

The Video Search pipeline combines core LangChain application logic and a set of microservices. The following figures show the architecture.

Video Ingestion Architecture

Figure: Video ingestion technical architecture

Video Query Architecture

Figure: Video query technical architecture

The Video Search UI communicates with the Video Search backend microservice. The Embedding microservice is provided as part of Intel’s Edge AI inference microservices catalog, supporting open-source models that can be downloaded from model hubs, for example Hugging Face Hub models that integrate with OpenVINO™ toolkit.
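
The following is a minimal, hedged sketch of how such a Hugging Face Hub embedding model could be exported for OpenVINO™ inference using the optimum-intel extension. The model ID is an arbitrary example, not the application's configured default.

```python
# Sketch only: assumes the optimum-intel and transformers packages are
# installed. The model ID is an arbitrary example, not this application's
# default configuration.
from optimum.intel import OVModelForFeatureExtraction
from transformers import AutoTokenizer

model_id = "BAAI/bge-small-en-v1.5"
tokenizer = AutoTokenizer.from_pretrained(model_id)
# export=True converts the PyTorch checkpoint to OpenVINO IR on the fly.
model = OVModelForFeatureExtraction.from_pretrained(model_id, export=True)

inputs = tokenizer("a forklift crosses the warehouse floor", return_tensors="pt")
outputs = model(**inputs)
embedding = outputs.last_hidden_state[:, 0]  # CLS-token sentence embedding
```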

The Visual Data Prep. microservice ingests common video formats, converts them into the embedding space, and stores the embeddings in the vector database. You can also save a copy of each video to the object store.
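
To illustrate how this ingestion could be driven over REST, here is a hedged sketch. The host, port, and route are hypothetical placeholders; consult the API Reference for the actual endpoints.

```python
# Hypothetical ingestion call: the host, port, and /videos route below are
# illustrative placeholders, not the microservice's documented API.
import requests

with open("warehouse_clip.mp4", "rb") as f:
    resp = requests.post(
        "http://localhost:8080/videos",  # hypothetical endpoint
        files={"video": ("warehouse_clip.mp4", f, "video/mp4")},
        timeout=60,
    )
resp.raise_for_status()
print(resp.json())  # e.g., an identifier for the ingested video
```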

Application Flow

  1. Input Sources:

    • Videos: The Visual Data Prep. microservice ingests common video formats. Currently, the ingestion only supports video files; it does not support live-streaming inputs.

  2. Create Context:

    • Upload input videos: The UI microservice allows you to interact with the application through the defined application API and provides an interface for uploading videos. The application stores the videos in the MinIO object store. For surveillance scenarios, videos can also be ingested continuously from pre-configured folder locations.

    • Convert to embedding space: The Video Ingestion microservice creates the embeddings from the uploaded videos using the embedding microservice. The application stores the embeddings in the Visual Data Management System (VDMS).

  3. Query Flow:

    • Input a query: The UI microservice provides a prompt window for user queries that can be saved. You can enable up to eight queries to run in the background continuously on any new video being ingested. This is a critical capability for agentic reasoning.

    • Execute the Video Search pipeline: The Video Search backend microservice does the following to generate the output response:

      • Converts the query into an embedding space using the Embeddings microservice.

      • Performs semantic retrieval to fetch the relevant videos from the vector database. Currently, the top-k videos (with k configurable) are returned; a reranker microservice is not used. A minimal sketch of this retrieval step appears after the figure below.

  4. Generate the Output:

    • Response: The application sends the search results, including the retrieved videos from the object store, to the UI.

    • Observability dashboard: If set up, the dashboard displays real-time logs, metrics, and traces, which show the application’s performance, accuracy, and resource consumption.

The following figure shows the application flow, including the APIs and data-sharing protocols.

Figure: Data flow for Video Search mode
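
To make the query flow concrete, the sketch below shows top-k retrieval against VDMS through the LangChain community integrations. The hostnames, ports, collection name, metadata keys, and embedding model are assumptions for illustration, not the application's actual configuration.

```python
# Sketch under stated assumptions: a VDMS server and an OpenAI-compatible
# embedding endpoint are reachable at the (hypothetical) addresses below.
from langchain_community.vectorstores import VDMS
from langchain_community.vectorstores.vdms import VDMS_Client
from langchain_openai import OpenAIEmbeddings

embeddings = OpenAIEmbeddings(
    base_url="http://localhost:8000/v3",  # OpenAI-compatible embedding server
    api_key="unused",                     # local servers often ignore the key
    model="BAAI/bge-small-en-v1.5",       # arbitrary example model
    check_embedding_ctx_length=False,     # skip OpenAI-specific token checks
)

client = VDMS_Client(host="localhost", port=55555)
store = VDMS(client=client, embedding=embeddings, collection_name="videos")

# Top-k semantic retrieval; the default pipeline applies no reranker.
results = store.similarity_search("person loading boxes onto a truck", k=4)
for doc in results:
    # Metadata keys depend on how ingestion stored the documents.
    print(doc.metadata.get("video"), doc.page_content)
```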

Key Components and Their Roles

The key components of the Video Search mode are as follows:

  1. Intel’s Edge AI Inference microservices:

    • What it is: The inference microservices are the embeddings and reranker microservices that run the chosen models optimally on the target hardware.

    • How it is used: Each microservice exposes OpenAI APIs to support its functionality and comes configured with the required models, ready to serve requests. The Video Search backend accesses these microservices in the LangChain application, which chains them together (see the sketch after this list).

    • Benefits: Intel guarantees that the sample application’s default microservices configuration is optimal for the chosen models and the target deployment hardware. Standard OpenAI APIs make it easy to swap in different inference microservices.

  2. Visual Data Prep. microservice:

    • What it is: This microservice ingests videos, creates the necessary context, and retrieves the right context based on the user query.

    • How it is used: The Visual Data Prep. microservice provides a REST API endpoint for managing content. The Video Search backend uses this API to access its capabilities.

    • Benefits: The core part of the video ingestion functionality is the vector handling capability that is optimized for the target deployment hardware. You can select the vector database based on performance considerations. You can treat this microservice as a reference implementation.

  3. Video Search backend microservice:

    • What it is: The Video Search backend microservice is a LangChain framework-based implementation of Video Search’s Retrieval-Augmented Generation (RAG) pipeline, which handles user queries.

    • How it is used: The UI frontend uses a REST API endpoint to send user queries and trigger the Video Search pipeline.

    • Benefits: This microservice provides a reference for using the LangChain framework with Intel’s Edge AI inference microservices.

  4. Video Search UI:

    • What it is: A reference frontend interface for you to interact with the Video Search pipeline.

    • How it is used: The UI microservice runs on the deployed platform on a configured port. You can access the UI at the corresponding URL.

    • Benefits: You can treat this microservice as a reference implementation.
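
As noted in item 1 above, every inference microservice exposes an OpenAI-compatible API. The hedged sketch below shows what a call could look like using the standard openai Python client; the base URL, route version, and model name are placeholders, not the deployment's actual values.

```python
# Hedged sketch: any OpenAI-compatible embedding endpoint can be called this
# way. The base URL and model name below are illustrative placeholders.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8000/v3",  # hypothetical model-server address
    api_key="unused",                     # local deployments often ignore this
)

resp = client.embeddings.create(
    model="BAAI/bge-small-en-v1.5",       # arbitrary example model
    input="truck arriving at gate 3",
)
query_vector = resp.data[0].embedding
print(len(query_vector))  # dimensionality of the embedding space
```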

Extensibility

The Video Search mode is modular and allows you to:

  1. Change inference microservices:

    • The default option is the OpenVINO™ model server. You can use other model servers, for example the Virtual Large Language Model (vLLM) server with OpenVINO as the backend, or the Text Generation Inference (TGI) toolkit, to host embedding and Vision-Language Models (VLMs), but Intel has not validated these options.

    • The only compulsory requirement is OpenAI API compliance (see the sketch after this list). Intel does not guarantee that other model servers provide the same performance as the default options.

  2. Load different embedding and reranker models:

    • Use models from Hugging Face Hub that integrate with OpenVINO toolkit, or from vLLM model hub. The models are passed as parameters to the corresponding model servers.

  3. Use other generative AI frameworks, such as the Haystack framework and the LlamaIndex tool:

    • Integrate the inference microservices into an application backend developed on other frameworks similar to the LangChain framework integration provided in this sample application.

  4. Deploy on diverse target Intel® hardware and deployment scenarios:

    • Follow the system requirements guidelines for the available options.
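
As noted in item 1 of this list, OpenAI API compliance is what makes the model server swappable. The sketch below illustrates the idea under assumed addresses: the application code stays the same and only the base URL changes.

```python
# The same client code targets any OpenAI-compatible server; swapping the
# inference backend is purely a configuration change. Both addresses below
# are hypothetical.
from openai import OpenAI

OVMS_URL = "http://localhost:8000/v3"  # default OpenVINO model server
VLLM_URL = "http://localhost:8001/v1"  # alternative vLLM deployment

def make_client(base_url: str) -> OpenAI:
    return OpenAI(base_url=base_url, api_key="unused")

client = make_client(OVMS_URL)  # switch to VLLM_URL without code changes
```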

Next Steps

  • System requirements

  • Get Started
