Configure Embedding Creation with RAG#
This guide explains how to enable caption embedding creation in Live Video Captioning and connect it with the Live-Video-Captioning-RAG service for Retrieval-Augmented Generation (RAG) chat.
When enabled:
Caption pipeline metadata includes frame blobs.
Live Video Captioning sends caption_text + image_data + metadata to the RAG embedding API.
Embeddings are created and stored in VDMS.
You can open the Live Caption RAG dashboard and query generated context.
How Data Flows#
Live Video Captioning receives metadata over MQTT, which is published by DL Streamer Pipeline Server.
With ENABLE_EMBEDDING=true, frame blobs are forwarded to live-video-captioning-rag at /api/embeddings.
The RAG service generates embeddings through multimodal-embedding-serving.
Embeddings and metadata are stored in vdms-vector-db.
RAG chat (/api/chat) retrieves relevant context and generates answers with the configured LLM.
Prerequisites#
Docker Engine software and Docker Compose tool are installed.
Complete the base setup in Get Started.
VLM models are prepared for the captioning pipeline (ov_models/) while LLM models are prepared for the RAG pipeline (llm_models/). See the Model Preparation section to download and convert the models.
Ensure that this is a fresh installation. If you have previously deployed only live-video-captioning or only live-video-captioning-rag, stop those deployments and follow the instructions in this section to deploy both together.
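The prerequisites above can be checked before deployment with a short sketch like the following. The helper names are hypothetical, and the sketch assumes that ov_models/ and llm_models/ sit under the directory you pass in; adjust the path to wherever Model Preparation placed them.

```shell
# Hypothetical helper: returns 0 only if both model directories exist
# under the given base directory (defaults to the current directory).
check_model_dirs() {
  base="${1:-.}"
  rc=0
  for dir in ov_models llm_models; do
    if [ -d "$base/$dir" ]; then
      echo "found: $dir/"
    else
      echo "missing: $dir/"
      rc=1
    fi
  done
  return "$rc"
}

# Hypothetical helper: returns 0 if the docker CLI and the Compose plugin
# are both available on this machine.
check_docker() {
  command -v docker >/dev/null 2>&1 && docker compose version >/dev/null 2>&1
}

# Example (not run here):
# check_docker && check_model_dirs .
```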
Enabling Embedding Creation with RAG#
From the live-video-captioning directory, use the provided helper script to set up the environment variables:
cd edge-ai-suites/metro-ai-suite/live-video-analysis/live-video-captioning
source scripts/setup_embeddings.sh
What this does:
Sets ENABLE_EMBEDDING=true.
Enables the Compose profile with COMPOSE_PROFILES=EMBEDDING.
Configures embedding service, VDMS, and RAG service environment variables.
Brings up these additional services:
multimodal-embedding-serving
vdms-vector-db
live-video-captioning-rag
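Because the script is sourced, its variables live in the current shell, so they can be inspected right away. The sketch below is one way to confirm the two switches listed above; the function name is hypothetical, and it only reads the variable values (it does not prove they are exported to child processes).

```shell
# Hypothetical helper: returns 0 if both switches are set in this shell.
check_embedding_env() {
  rc=0
  if [ "${ENABLE_EMBEDDING:-}" != "true" ]; then
    echo "ENABLE_EMBEDDING is '${ENABLE_EMBEDDING:-}', expected 'true'"
    rc=1
  fi
  # COMPOSE_PROFILES may hold a comma-separated list of profile names.
  case ",${COMPOSE_PROFILES:-}," in
    *,EMBEDDING,*) ;;
    *) echo "COMPOSE_PROFILES does not include EMBEDDING"; rc=1 ;;
  esac
  return "$rc"
}

# Example (not run here):
# check_embedding_env && echo "embedding settings look correct"
```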
Notes:
Update the helper script values to use your preferred embedding and LLM models.
For gated models, export your HF_TOKEN before running the setup_embeddings.sh script above:
export HF_TOKEN=<your-huggingface-token>
You are now ready to deploy Live Video Captioning with embedding creation and RAG:
docker compose up -d
Verify Services are Running#
Use the docker ps command to confirm that all service containers are up and running, and that each reports a healthy state before proceeding.
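As a complement to reading the docker ps output by eye, the sketch below reports which of the three embedding-related services are absent from a list of running services. The function name is hypothetical, and the sketch assumes the Compose service names match the names listed in this guide.

```shell
# Hypothetical helper: reads running service names on stdin (one per line)
# and prints each expected service that is absent. Returns 1 if any is missing.
missing_services() {
  running="$(cat)"
  rc=0
  for svc in multimodal-embedding-serving vdms-vector-db live-video-captioning-rag; do
    if ! printf '%s\n' "$running" | grep -qxF "$svc"; then
      echo "not running: $svc"
      rc=1
    fi
  done
  return "$rc"
}

# Example (not run here), from the directory that holds the Compose file:
# docker compose ps --services --status running | missing_services
```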
Optionally, you may verify health endpoints:
curl -f http://<HOST_IP>:4173/api/health
curl -f http://<HOST_IP>:4172/api/health
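Services can take a while to become healthy after startup, so a single curl call may fail even when nothing is wrong. A minimal polling sketch follows; the function name, the attempt and delay defaults, and the HEALTH_PROBE override are all assumptions made for illustration.

```shell
# Hypothetical helper: polls a URL until the probe succeeds or attempts run out.
# HEALTH_PROBE is an optional override; it defaults to a quiet curl -f call.
wait_for_health() {
  url="$1"
  attempts="${2:-30}"   # number of tries
  delay="${3:-2}"       # seconds to wait after a failed try
  i=1
  while [ "$i" -le "$attempts" ]; do
    if ${HEALTH_PROBE:-curl -fsS -o /dev/null} "$url"; then
      echo "healthy: $url"
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  echo "unhealthy after $attempts attempts: $url"
  return 1
}

# Example (not run here); replace <HOST_IP> with your host address:
# wait_for_health "http://<HOST_IP>:4173/api/health"
# wait_for_health "http://<HOST_IP>:4172/api/health"
```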
Run End-to-End with Embedding and RAG#
Open the Live Video Captioning UI at http://<HOST_IP>:4173.
Start a captioning run with a valid RTSP stream.
Confirm that captions are being generated.
Click the chat icon in the top bar (visible only when embedding is enabled).
This opens the Live Caption RAG dashboard at http://<HOST_IP>:4172.
Ask questions related to the current or recent scene.
Stop the Services#
docker compose down
Troubleshooting#
Chat Icon is not Visible in Live Captioning UI#
Ensure ENABLE_EMBEDDING=true and COMPOSE_PROFILES=EMBEDDING are exported before startup.
RAG Page Does not Open or is Unreachable#
Confirm that the live-video-captioning-rag container is running.
Confirm that port mapping ${LIVE_VIDEO_RAG_HOST_PORT:-4172}:4172 is available.
Check http://localhost:4172/api/health.
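The port mapping quoted above uses default expansion: the host port is LIVE_VIDEO_RAG_HOST_PORT if that variable is set, otherwise 4172. The sketch below applies the same expansion so the health check targets whichever port is actually in effect; the function names are hypothetical, and the commented docker ps filter assumes the container name contains live-video-captioning-rag.

```shell
# Hypothetical helper: prints the host port for the RAG dashboard, using the
# same ${VAR:-default} expansion as the Compose port mapping.
rag_host_port() {
  echo "${LIVE_VIDEO_RAG_HOST_PORT:-4172}"
}

# Hypothetical helper: prints the health URL to probe on the local machine.
rag_health_url() {
  echo "http://localhost:$(rag_host_port)/api/health"
}

# Example (not run here):
# docker ps --filter "name=live-video-captioning-rag" --format '{{.Names}}: {{.Status}}'
# curl -f "$(rag_health_url)"
```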
Embeddings are Not Being Stored#
Ensure that the caption pipeline is actively running; nothing is ingested while it is stopped.
Verify the embedding service health on http://localhost:9777/health.
Verify that the VDMS container is running.
If containers are running but no embeddings are stored, remove the volume and restart the services:
docker volume rm live-video-caption_vdms-db
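Docker refuses to remove a volume that a container still uses, so the services need to be stopped before the volume is removed. One possible ordering is sketched below; the function names and the DRY_RUN switch are hypothetical, and removing the volume discards any embeddings already stored in it.

```shell
# Hypothetical helper: prints the command when DRY_RUN=1, otherwise runs it.
run_or_print() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Hypothetical helper: stops the stack, removes the VDMS volume, restarts.
# The default volume name is the one given in this guide.
reset_vdms_volume() {
  volume="${1:-live-video-caption_vdms-db}"
  run_or_print docker compose down &&
    run_or_print docker volume rm "$volume" &&
    run_or_print docker compose up -d
}

# Example (not run here): preview the commands first, then run them.
# DRY_RUN=1 reset_vdms_volume
# reset_vdms_volume
```

Run it from the same shell in which setup_embeddings.sh was sourced, so that the embedding settings are still in effect when the services start again.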
Supporting Resources#
Get Started - Base setup and deployment flow
API Reference - Live Video Captioning API endpoints
System Requirements - Hardware and software requirements