# RAG Model Download

This guide covers the optional Live Video Captioning RAG setup. These steps are not required for the base Live Video Captioning application.

## What RAG needs

RAG uses:

- the base VLM model for Live Video Captioning in `ov_models/`,
- an LLM model cache in `llm_models/`,
- embedding service settings configured by `scripts/setup_embeddings.sh`.

## Download the LLM model

From the `live-video-captioning` directory:

```bash
./model_download_scripts/download_models.sh \
  --model Qwen/Qwen2.5-3B-Instruct \
  --type llm \
  --device CPU \
  --weight-format int8
```

The model is prepared under `llm_models/`.

For gated Hugging Face models, set a token first:

```bash
export HUGGINGFACEHUB_API_TOKEN=
```

## Review embedding defaults

The default embeddings and LLM settings are in:

```text
scripts/setup_embeddings.sh
```

Update these values only if you want different models or devices:

```bash
EMBEDDING_MODEL_NAME=QwenText/qwen3-embedding-0.6b
EMBEDDING_DEVICE=CPU
LLM_DEVICE=CPU
LLM_MODEL_ID=Qwen/Qwen2.5-3B-Instruct
```

## Enable RAG services

After downloading the LLM model, follow [Configure Embedding Creation with RAG](./configure-embedding-creation-with-rag.md) to enable the compose profile and start the RAG services.
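
## Optional: check the prerequisites

Before starting the RAG services, it can help to confirm that the three items this guide lists as RAG prerequisites are present. The function below is a minimal sketch, not part of the repository: the name `check_rag_prereqs` is hypothetical, and it only checks that `ov_models/`, `llm_models/`, and `scripts/setup_embeddings.sh` exist under a given root. It does not validate the model contents.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not shipped with the project): verifies that the
# paths this guide names as RAG prerequisites exist under a root directory.
check_rag_prereqs() {
  local root="${1:-.}"   # directory to check; defaults to the current one
  local missing=0
  local dir

  # Model directories: base VLM model and LLM model cache.
  for dir in ov_models llm_models; do
    if [ ! -d "$root/$dir" ]; then
      echo "missing directory: $dir" >&2
      missing=1
    fi
  done

  # Script that holds the embedding service settings.
  if [ ! -f "$root/scripts/setup_embeddings.sh" ]; then
    echo "missing file: scripts/setup_embeddings.sh" >&2
    missing=1
  fi

  # 0 when everything was found, 1 when at least one path is missing.
  return "$missing"
}
```

After sourcing the function, running `check_rag_prereqs .` from the `live-video-captioning` directory returns a non-zero status and prints each missing path to standard error if anything is absent.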