How to Deploy with Helm Chart#

This guide shows how to deploy the Live Video Captioning application on Kubernetes with the Helm chart included in this repository.

Prerequisites#

Before you begin, ensure that you have the following:

  • A Kubernetes cluster with kubectl configured for access.

  • Helm installed on your system. See the Installation Guide.

  • Dynamic Persistent Volume provisioning available in the cluster, or a StorageClass you can set in the chart values.

  • A worker node reachable by your browser client. Prefer a GPU-capable worker node when available, because the chart pins the media and inference workloads to the selected node and DL Streamer benefits most from GPU access.

  • Sufficient storage for model PVCs. The default chart configuration requests 50Gi for VLM models and 5Gi for detection models.

  • An RTSP source reachable from the Kubernetes node that runs dlstreamer-pipeline-server.

Prepare the Cluster#

1. Select the target node#

The chart pins the workloads that need to stay together to the target node selected in the chart values:

  • dlstreamer-pipeline-server

  • video-caption-service

  • mediamtx

  • coturn

  • collector

These workloads are kept on the same worker because they rely on node-local access patterns:

  • dlstreamer-pipeline-server and video-caption-service share the model PVCs.

  • dlstreamer-pipeline-server and collector need direct access to node hardware and host resources.

  • mediamtx and coturn expose browser-facing WebRTC and TURN endpoints that must match the selected node’s reachable IP.

For best performance, choose a worker node with a GPU. The chart can run with CPU-only inference, but a GPU-capable node is the preferred deployment target for DL Streamer and real-time media processing.

Set global.nodeName to the Kubernetes node name.

Example:

global:
  nodeName: worker4
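
Before committing the name to the override file, it can help to confirm that the node exists and reports Ready. The following is a minimal sketch, assuming kubectl already points at the target cluster; the function name node_is_ready is illustrative and not part of the chart.

```shell
# Hypothetical helper: succeeds only if the named node reports Ready=True.
node_is_ready() {
  # $1 = Kubernetes node name, for example worker4
  status=$(kubectl get node "$1" \
    -o jsonpath='{.status.conditions[?(@.type=="Ready")].status}') || return 1
  [ "$status" = "True" ]
}
```

For example, `node_is_ready worker4 && echo "worker4 is Ready"` prints the message only when the node exists and is Ready.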

2. Get the IP of the selected node#

Use the same node that you selected for the pinned media workloads. First list the nodes and labels:

kubectl get nodes --show-labels

Then inspect the selected node:

kubectl get node <node-name> -o wide

Set global.hostIP to the node address that is reachable by the browser:

  • Use EXTERNAL-IP only if the node actually has one and your browser reaches the application through it.

  • Otherwise, use INTERNAL-IP. This covers clusters without worker-node external IPs, and browsers on the same LAN or VPN that can reach the node directly.

To print the value directly:

kubectl get node <node-name> -o jsonpath='{.status.addresses[?(@.type=="ExternalIP")].address}'

If no external address is present, use:

kubectl get node <node-name> -o jsonpath='{.status.addresses[?(@.type=="InternalIP")].address}'

Set that value in global.hostIP.
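
The two lookups can be combined into one step. This is a sketch only: the function name node_host_ip is hypothetical, and it assumes the node reports at most one address of each type (a dual-stack node may print several, separated by spaces).

```shell
# Hypothetical helper: prints the node's ExternalIP, or its InternalIP when
# no external address is set. Uses the same jsonpath queries as above.
node_host_ip() {
  # $1 = Kubernetes node name
  for addr_type in ExternalIP InternalIP; do
    ip=$(kubectl get node "$1" \
      -o jsonpath="{.status.addresses[?(@.type==\"$addr_type\")].address}")
    if [ -n "$ip" ]; then
      printf '%s\n' "$ip"
      return 0
    fi
  done
  return 1
}
```

The helper only reports what the node advertises. Whether the browser can actually reach that address still has to be checked against the rules above.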

If the worker node has no browser-reachable IP, direct NodePort access will not work. Support for that scenario is planned for a future chart update.

Known Limitations#

Single-node deployment with host port binding#

This chart is designed to run on a single worker node. Several workloads bind directly to host ports on that node so that the browser and RTSP clients can reach them without a LoadBalancer or Ingress.

Because of these host port bindings:

  • replicaCount must remain 1 for all workloads that use host ports. If you increase it, the additional pods fail to schedule, because two pods cannot bind the same host port on the same node.

  • Multi-node or high-availability deployments are not supported. The chart intentionally pins all workloads to a single node via global.nodeName.

  • Port conflicts with other applications on the same node are possible. Before deploying, ensure that the host ports the chart binds are not already in use on the target worker node.
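
One way to check for port conflicts is to compare the node's listening sockets against the ports you plan to use. The sketch below assumes you run it on the target worker node and that ss (from iproute2) is available; the function name ports_in_use is hypothetical, and the port numbers you pass must come from your own chart values because this section does not list them.

```shell
# Hypothetical helper: reads `ss -H -ltun` output on stdin and prints each
# port from the argument list that already has a listening socket.
ports_in_use() {
  awk -v want="$*" '
    BEGIN { n = split(want, w, " "); for (i = 1; i <= n; i++) wanted[w[i]] = 1 }
    {
      # The local address is the first field that ends in ":<digits>".
      for (i = 1; i <= NF; i++) {
        if ($i ~ /:[0-9]+$/) {
          port = $i
          sub(/.*:/, "", port)
          if (port in wanted) busy[port] = 1
          break
        }
      }
    }
    END { for (i = 1; i <= n; i++) if (w[i] in busy) print w[i] }
  '
}
```

Usage on the node: `ss -H -ltun | ports_in_use <port> <port>`, replacing each `<port>` with a host port from your values. No output means none of the listed ports is taken.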

Configure Required Values#

The chart includes a sample override file at charts/values-override.yaml. Update it before deploying.
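
A minimal sketch of such an override file, generated from shell variables, is shown here. The YAML nesting is an assumption inferred from the dotted key names used in this section (for example global.nodeName); the helper name write_override and the output file my-values.yaml are illustrative. Compare the result with the chart's own charts/values-override.yaml before using it.

```shell
# Hypothetical helper: writes a minimal override file.
# $1 = node name, $2 = browser-reachable node IP, $3 = VLM model ID,
# $4 = output file (defaults to my-values.yaml)
write_override() {
  cat > "${4:-my-values.yaml}" <<EOF
global:
  nodeName: "$1"
  hostIP: "$2"
  storageClassName: ""
  models:
    - "$3"
modelsPvc:
  size: 50Gi
detectionModelsPvc:
  size: 5Gi
video-caption-service:
  env:
    enableDetectionPipeline: "false"
EOF
}
```

For example, `write_override worker4 192.168.1.20 OpenGVLab/InternVL2-1B` writes my-values.yaml in the current directory, which can then be passed to Helm with -f.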

The most important values are:

| Key | Description | Example |
| --- | --- | --- |
| global.hostIP | Browser-reachable IP of the selected node that runs the pinned media workloads. In many on-prem clusters this is the node INTERNAL-IP. Retrieve it with kubectl get node <node-name> -o wide. | 192.168.1.20 |
| global.nodeName | Kubernetes node name used to pin the media, TURN, and host-coupled workloads to one worker node. Prefer a GPU-capable node when available. | worker4 |
| global.storageClassName | StorageClass for the chart PVCs. Leave empty to use the cluster default. If the default class uses node-local storage, see Known Issues. | "" |
| global.models | Required. List of VLM models to export to OpenVINO format. Must contain at least one entry; the chart fails if this list is empty. The download job always runs and uses this list as its source of truth. | OpenGVLab/InternVL2-1B |
| modelsPvc.size | PVC size for VLM models. | 50Gi |
| detectionModelsPvc.size | PVC size for object detection models. | 5Gi |
| modelsDownload.hfToken | Hugging Face access token for gated-model downloads, passed as a plain value. Leave empty for public models. | hf_abc123… |
| video-caption-service.env.enableDetectionPipeline | Enables the object-detection pipeline. When set to "true", the chart automatically downloads the models listed in global.detectionModels into the detection models PVC. | "true" or "false" |
| global.detectionModels | List of detection model names to download. Each entry is passed to the DL Streamer download_public_models.sh helper. Only downloaded when enableDetectionPipeline is "true". | ["yolov8s"] |
| video-caption-service.env.defaultRtspUrl | Default RTSP URL shown in the dashboard. | rtsp://camera.example/live |
| video-caption-service.env.alertMode | Switches captioning to binary alert-style responses. | "true" or "false" |

Hugging Face Token for Gated Models#

Some models (for example, google/gemma-3-4b-it) require a Hugging Face access token. Set modelsDownload.hfToken directly in your override file:

modelsDownload:
  hfToken: "hf_<your-token>"

The chart injects HF_TOKEN and HUGGINGFACEHUB_API_TOKEN as environment variables in the model download job. Leave the field empty for public models that do not require authentication.
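
If you prefer not to write the token into the override file, Helm's --set-string flag can supply the same value at install time. The wrapper below is a sketch under that assumption; the function name install_with_hf_token and the use of an HF_TOKEN variable in your local shell are illustrative choices, not chart requirements.

```shell
# Hypothetical wrapper: reads the token from the HF_TOKEN environment variable
# and passes it with --set-string so it is not stored in values-override.yaml.
install_with_hf_token() {
  # $1 = namespace; aborts with a message if HF_TOKEN is unset or empty
  : "${HF_TOKEN:?set HF_TOKEN to your Hugging Face access token}"
  helm install lvc . \
    -f values-override.yaml \
    -n "$1" \
    --set-string "modelsDownload.hfToken=$HF_TOKEN"
}
```

Note that Helm still records user-supplied values with the release, so this keeps the token out of the file on disk but not out of the cluster.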

Proxy Configuration#

If your cluster runs behind a proxy, set the proxy fields under global:

global:
  httpProxy: "http://<your-proxy-host>:<port>"
  httpsProxy: "http://<your-proxy-host>:<port>"
  noProxy: "<your-rtsp-camera-host-or-ip>"

Important: the host portion of every RTSP URL must be included in noProxy when the deployment runs behind a proxy.

For example:

  • If your stream URL is rtsp://camera.example.com:8554/live, add camera.example.com to noProxy.

  • If your stream URL is rtsp://192.168.1.50:554/stream1, add 192.168.1.50 to noProxy.

If the RTSP host is not listed in noProxy, the application may try to reach the stream through the proxy and fail to connect.
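
Extracting the host from a stream URL can be scripted. The helper below is a small sketch using POSIX parameter expansion; the name rtsp_host is hypothetical, and bracketed IPv6 literals are not handled.

```shell
# Hypothetical helper: prints the host portion of an RTSP URL.
# Does not handle bracketed IPv6 literals such as rtsp://[::1]:554/.
rtsp_host() {
  rest=${1#*://}                # drop the scheme
  rest=${rest%%/*}              # drop the path
  rest=${rest##*@}              # drop an optional user:password@ prefix
  printf '%s\n' "${rest%%:*}"   # drop the port
}
```

For example, `rtsp_host rtsp://camera.example.com:8554/live` prints camera.example.com, which is the value to add to noProxy.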

Build Chart Dependencies#

Run the following command from the chart directory:

helm dependency update

This refreshes the chart dependencies from subcharts/ and updates Chart.lock.

Install the Chart#

From charts/, install the application with the override file:

helm install lvc . \
  -f values-override.yaml \
  -n "$my_namespace"
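
The install command assumes that my_namespace is set and that the namespace exists. The wrapper below sketches one way to cover both, and renders the chart locally first so that values errors surface before anything is created in the cluster. The function name install_lvc is hypothetical; helm template and --create-namespace are standard Helm features.

```shell
# Hypothetical wrapper around the install command above.
install_lvc() {
  # $1 = namespace
  # Render locally first; a values error (for example an empty
  # global.models list) stops here with a non-zero status.
  helm template lvc . -f values-override.yaml -n "$1" > /dev/null || return 1
  # --create-namespace creates the namespace if it does not exist yet.
  helm install lvc . -f values-override.yaml -n "$1" --create-namespace
}
```

Usage from charts/: `my_namespace=lvc-demo; install_lvc "$my_namespace"`, where lvc-demo is a placeholder namespace name.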

Verify the Deployment#

Check the hook job, pods, services, and PVCs:

kubectl get jobs,pods,svc,pvc -n "$my_namespace"

The model downloader runs before the main workloads start. If the initial deployment seems to stall, inspect the job logs:

kubectl logs -n "$my_namespace" -l app.kubernetes.io/component=model-downloader

Before accessing the application, confirm the following:

  • The model download job has completed successfully.

  • All pods are in the Running state.

  • All containers report Ready.

  • The PVCs are bound.

The first deployment can take several minutes because the chart may download and export VLM models before starting the application pods.
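
Parts of the checklist above can be scripted. The helpers below are a sketch with hypothetical names; they report only pod phase and PVC phase, so container readiness still needs a look at the READY column of kubectl get pods.

```shell
# Hypothetical helpers for the checklist above; $1 is the namespace.

# Prints the name of every PVC whose phase is not Bound.
unbound_pvcs() {
  kubectl get pvc -n "$1" \
    -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.phase}{"\n"}{end}' |
    awk '$2 != "Bound" { print $1 }'
}

# Lists pods that are neither Running nor Succeeded (completed job pods).
pods_not_running() {
  kubectl get pods -n "$1" \
    --field-selector='status.phase!=Running,status.phase!=Succeeded'
}
```

Empty output from `unbound_pvcs "$my_namespace"` means every PVC in the namespace is bound.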

Access the Application#

By default, the chart exposes the following NodePort service:

  • Dashboard UI: http://<global.hostIP>:4173

If you changed the service ports in your override values, use those instead.
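
A quick reachability check from the machine that runs the browser can rule out network problems before you open the dashboard. This is a sketch that assumes curl is installed on that machine; the function name dashboard_reachable is hypothetical.

```shell
# Hypothetical helper: succeeds if the dashboard answers an HTTP request.
# $1 = value of global.hostIP, $2 = dashboard port (defaults to 4173)
dashboard_reachable() {
  curl -fsS -o /dev/null --max-time 5 "http://$1:${2:-4173}/"
}
```

For example, `dashboard_reachable 192.168.1.20 && echo reachable`. On failure, curl prints the reason and the function returns a non-zero status.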

To start captioning after deployment:

  1. Open the dashboard URL in your browser.

  2. Enter an RTSP stream URL, unless you preconfigured defaultRtspUrl.

  3. Select the model you downloaded into the models PVC.

  4. Adjust the prompt and generation parameters if needed.

  5. Start the stream.

Upgrade the Release#

If you modify the chart or subcharts, refresh dependencies first:

helm dependency update

Then upgrade the release:

helm upgrade lvc . \
  -f values-override.yaml \
  -n "$my_namespace"

Uninstall the Release#

helm uninstall lvc -n "$my_namespace"

Troubleshooting#

  • If pods remain Pending, check that global.nodeName matches the correct node name, that the selected node has the required hardware access, and that the requested StorageClass can provision the PVCs.

  • If the install fails before pods appear, inspect the model download logs and confirm that the selected model ID and Hugging Face credentials are valid. Note that global.models must contain at least one entry; the chart rejects an empty list at render time.

  • If the dashboard opens but video does not start, confirm that global.hostIP is reachable from the browser. If your worker nodes do not have external IPs, this usually means using the node INTERNAL-IP over a reachable LAN or VPN. Also confirm that the RTSP source is reachable from the Kubernetes node.

  • If WebRTC negotiation fails, verify that global.hostIP points to the same node that runs mediamtx and coturn, and that the required ports are allowed by your network policy or firewall.

  • If detection is enabled but the pipeline cannot start, ensure the detection models PVC contains the required OpenVINO detection model artifacts.

  • If the collector does not report metrics, confirm that the host path in collector.collectorSignalsHostPath exists on the selected node and that the pod is scheduled there.
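
For the Pending case in particular, the Events section of kubectl describe usually states why a pod could not be placed. The helper below is a sketch with a hypothetical name that describes every Pending pod in a namespace.

```shell
# Hypothetical helper: describes every Pending pod in namespace $1 so that
# the reason (for example an unbound PVC or a host port conflict) is visible
# in the Events section of each description.
describe_pending() {
  kubectl get pods -n "$1" --field-selector=status.phase=Pending -o name |
    while read -r pod; do
      kubectl describe -n "$1" "$pod"
    done
}
```

Usage: `describe_pending "$my_namespace"`. No output means no pod in the namespace is currently Pending.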