# How to Deploy with Helm Chart This guide shows how to deploy the Live Video Captioning application on Kubernetes with the Helm chart included in this repository. ## Prerequisites Before you begin, ensure that you have the following: - A Kubernetes cluster with `kubectl` configured for access. - Helm installed on your system. See the [Installation Guide](https://helm.sh/docs/intro/install/). - Dynamic Persistent Volume provisioning available in the cluster, or a `StorageClass` you can set in the chart values. - A worker node reachable by your browser client. Prefer a GPU-capable worker node when available, because the chart pins the media and inference workloads to the selected node and DL Streamer benefits most from GPU access. - Sufficient storage for model PVCs. The default chart configuration requests `50Gi` for VLM models and `5Gi` for detection models. - An RTSP source reachable from the Kubernetes node that runs `dlstreamer-pipeline-server`. ## Prepare the Cluster ### 1. Select the target node The chart pins the workloads that need to stay together to the target node selected in the chart values: - `dlstreamer-pipeline-server` - `video-caption-service` - `mediamtx` - `coturn` - `collector` These workloads are kept on the same worker because they rely on node-local access patterns: - `dlstreamer-pipeline-server` and `video-caption-service` share the model PVCs. - `dlstreamer-pipeline-server` and `collector` need direct access to node hardware and host resources. - `mediamtx` and `coturn` expose browser-facing WebRTC and TURN endpoints that must match the selected node's reachable IP. For best performance, choose a worker node with a GPU. The chart can run with CPU-only inference, but a GPU-capable node is the preferred deployment target for DL Streamer and real-time media processing. Set `global.nodeName` to the Kubernetes node name. Example: ```yaml global: nodeName: worker4 ``` ### 2. Get the IP of the selected node Use the same node that you selected for the pinned media workloads. First list the nodes and labels: ```bash kubectl get nodes --show-labels ``` Then inspect the selected node: ```bash kubectl get node -o wide ``` Set `global.hostIP` to the node address that is reachable by the browser: - In clusters without worker-node external IPs, use `INTERNAL-IP`. - Use `EXTERNAL-IP` only if the node actually has one and your browser reaches the application through it. - Use `INTERNAL-IP` when your browser is on the same LAN or VPN and can reach the node directly. To print the value directly: ```bash kubectl get node -o jsonpath='{.status.addresses[?(@.type=="ExternalIP")].address}' ``` If no external address is present, use: ```bash kubectl get node -o jsonpath='{.status.addresses[?(@.type=="InternalIP")].address}' ``` Set that value in `global.hostIP`. If the worker node does not have any browser-reachable IP, direct NodePort access will not work. This capability will be added to the chart in a future update. ## Known Limitations ### Single-node deployment with host port binding This chart is designed to run on a **single worker node**. Several workloads bind directly to host ports on that node so that the browser and RTSP clients can reach them without a LoadBalancer or Ingress. Because of these host port bindings: - **`replicaCount` must remain `1`** for all workloads that use host ports. Increasing it will fail at scheduling time because two pods cannot bind the same host port on the same node. - **Multi-node or high-availability deployments are not supported.** The chart intentionally pins all workloads to a single node via `global.nodeName`. - **Port conflicts with other applications on the same node** are possible. Ensure the ports listed above are not already in use on the target worker node before deploying. ## Configure Required Values The chart includes a sample override file at `charts/values-override.yaml`. Update it before deploying. The most important values are: | Key | Description | Example | | --- | --- | --- | | `global.hostIP` | Browser-reachable IP of the selected node that runs the pinned media workloads. In many on-prem clusters this is the node `INTERNAL-IP`. Retrieve it with `kubectl get node -o wide` | `192.168.1.20` | | `global.nodeName` | Kubernetes node name used to pin the media, TURN, and host-coupled workloads to one worker node. Prefer a GPU-capable node when available | `worker4` | | `global.storageClassName` | StorageClass for the chart PVCs. Leave empty to use the cluster default. If the default class uses node-local storage, see [Known Issues](./known-issues.md#pvcs-bound-to-local-storage-prevent-reinstall-on-a-different-worker-node) | `` | | `global.models` | **Required.** List of VLM models to export to OpenVINO format. Must contain at least one entry — the chart will fail if this list is empty. The download job always runs and uses this list as its source of truth | `OpenGVLab/InternVL2-1B` | | `modelsPvc.size` | PVC size for VLM models | `50Gi` | | `detectionModelsPvc.size` | PVC size for object detection models | `5Gi` | | `modelsDownload.hfToken` | HuggingFace access token for gated-model downloads, passed as a plain value. Leave empty for public models. | `hf_abc123…` | | `video-caption-service.env.enableDetectionPipeline` | Enables the object-detection pipeline. When set to `"true"`, the chart automatically downloads the models listed in `global.detectionModels` into the detection models PVC | `"true"` or `"false"` | | `global.detectionModels` | List of detection model names to download. Each entry is passed to the DL Streamer `download_public_models.sh` helper. Only downloaded when `enableDetectionPipeline` is `"true"` | `["yolov8s"]` | | `video-caption-service.env.defaultRtspUrl` | Default RTSP URL shown in the dashboard | `rtsp://camera.example/live` | | `video-caption-service.env.alertMode` | Switches captioning to binary alert-style responses | `"true"` or `"false"` | ### HuggingFace Token for Gated Models Some models (for example `google/gemma-3-4b-it`) require a HuggingFace access token. Set `modelsDownload.hfToken` directly in your override file: ```yaml modelsDownload: hfToken: "hf_" ``` The chart injects `HF_TOKEN` and `HUGGINGFACEHUB_API_TOKEN` as environment variables in the model download job. Leave the field empty for public models that do not require authentication. ### Proxy Configuration If your cluster runs behind a proxy, set the proxy fields under `global`: ```yaml global: httpProxy: "http://:" httpsProxy: "http://:" noProxy: "" ``` Important: the host portion of every RTSP URL must be included in `noProxy` when the deployment runs behind a proxy. For example: - If your stream URL is `rtsp://camera.example.com:8554/live`, add `camera.example.com` to `noProxy`. - If your stream URL is `rtsp://192.168.1.50:554/stream1`, add `192.168.1.50` to `noProxy`. If the RTSP host is not listed in `noProxy`, the application may try to reach the stream through the proxy and fail to connect. ## Build Chart Dependencies Run the following command from the chart directory: ```bash helm dependency update ``` This refreshes the chart dependencies from `subcharts/` and updates `Chart.lock`. ## Install the Chart From `charts/`, install the application with the override file: ```bash helm install lvc . \ -f values-override.yaml \ -n "$my_namespace" ``` ## Verify the Deployment Check the hook job, pods, services, and PVCs: ```bash kubectl get jobs,pods,svc,pvc -n "$my_namespace" ``` The model downloader runs before the main workloads start. If the initial deployment takes time, inspect the job logs: ```bash kubectl logs -n "$my_namespace" -l app.kubernetes.io/component=model-downloader ``` Before accessing the application, confirm the following: - The model download job has completed successfully. - All pods are in the `Running` state. - All containers report `Ready`. - The PVCs are bound. The first deployment can take several minutes because the chart may download and export VLM models before starting the application pods. ## Access the Application By default the chart exposes these NodePort services: - Dashboard UI: `http://:4173` If you changed the service ports in your override values, use those instead. To start captioning after deployment: 1. Open the dashboard URL in your browser. 2. Enter an RTSP stream URL, unless you preconfigured `defaultRtspUrl`. 3. Select the model you downloaded into the models PVC. 4. Adjust the prompt and generation parameters if needed. 5. Start the stream. ## Upgrade the Release If you modify the chart or subcharts, refresh dependencies first: ```bash helm dependency update ``` Then upgrade the release: ```bash helm upgrade lvc . \ -f values-override.yaml \ -n "$my_namespace" ``` ## Uninstall the Release ```bash helm uninstall lvc -n "$my_namespace" ``` ## Troubleshooting - If pods remain `Pending`, check that `global.nodeName` matches the correct node name, that the selected node has the required hardware access, and that the requested `StorageClass` can provision the PVCs. - If the install fails before pods appear, inspect the model download logs and confirm that the selected model ID and Hugging Face credentials are valid. Note that `global.models` must contain at least one entry — the chart will reject an empty list at render time. - If the dashboard opens but video does not start, confirm that `global.hostIP` is reachable from the browser. If your worker nodes do not have external IPs, this usually means using the node `INTERNAL-IP` over a reachable LAN or VPN. Also confirm that the RTSP source is reachable from the Kubernetes node. - If WebRTC negotiation fails, verify that `global.hostIP` points to the same node that runs `mediamtx` and `coturn`, and that the required ports are allowed by your network policy or firewall. - If detection is enabled but the pipeline cannot start, ensure the detection models PVC contains the required OpenVINO detection model artifacts. - If the collector does not report metrics, confirm that the host path in `collector.collectorSignalsHostPath` exists on the selected node and that the pod is scheduled there. ## Related Links - [Get Started](./get-started.md) - [System Requirements](./get-started/system-requirements.md) - [How it Works](./how-it-works.md) - [Object Detection Pipeline](./object-detection-pipeline.md) - [Build from Source](./get-started/build-from-source.md)