How to Deploy with Helm#
This guide provides step-by-step instructions for deploying the Smart Traffic Intersection Agent application using Helm.
Prerequisites#
Before you begin, ensure that you have the following prerequisites:
Kubernetes cluster set up and running. The cluster must support dynamic provisioning of Persistent Volumes (PVs). Refer to the Kubernetes Dynamic Provisioning Guide for more details.
`kubectl` installed on your system. Refer to the Installation Guide. Ensure access to the Kubernetes cluster.
Helm installed on your system: Installation Guide.
A running Smart Intersection deployment (provides the MQTT broker, camera pipelines, and scene analytics). See Step 4 below.
The SceneScape CA certificate file (`scenescape-ca.pem`) for TLS connections to the MQTT broker (created during the Smart Intersection installation).
(Optional) A Hugging Face API token if the VLM model requires authentication.
Storage Requirement: The VLM model cache PVC requests 20 GiB by default. Ensure the cluster has sufficient storage available.
(Optional — GPU inference) To run VLM inference on an Intel GPU:
An Intel integrated, Arc, or Data Center GPU must be available on at least one worker node.
The Intel GPU device plugin for Kubernetes must be installed so that GPU resources (e.g., `gpu.intel.com/i915` or `gpu.intel.com/xe`) are advertised to the scheduler. Verify by running:
kubectl describe node <gpu-node> | grep gpu.intel.com
The `/dev/dri/renderD*` device must be accessible inside containers. The Helm chart automatically adds the correct `supplementalGroups` entry for the render group.
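The device-plugin check above can be scripted. The sketch below pulls the `gpu.intel.com/*` resource key out of `kubectl describe node` output; the node-description excerpt is an assumed example, so on a real cluster pipe the live output into the function instead.

```shell
# Extract the Intel GPU resource key advertised by the device plugin.
# Real usage: kubectl describe node <gpu-node> | extract_gpu_resource
extract_gpu_resource() {
  grep -o 'gpu\.intel\.com/[a-z0-9]*' | sort -u
}

# Assumed sample excerpt of `kubectl describe node` output
sample_node_description='Capacity:
  cpu:                 16
  gpu.intel.com/i915:  1
Allocatable:
  gpu.intel.com/i915:  1'

printf '%s\n' "$sample_node_description" | extract_gpu_resource
# prints: gpu.intel.com/i915
```

If the command prints nothing on your cluster, the device plugin is not installed or the node has no supported GPU.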
Steps to Deploy with Helm#
The following steps walk through deploying the Smart Traffic Intersection Agent application using Helm. You can install from source code or pull the chart from a registry.
Steps 1 to 3 vary depending on whether you prefer to build or pull the Helm chart.
Option 1: Install from a Registry#
Step 1: Pull the Chart#
Use the following command to pull the Helm chart:
helm pull oci://registry-1.docker.io/intel/smart-traffic-intersection-agent --version 1.0.0-rc2-helm
Step 2: Extract the .tgz File#
After pulling the chart, extract the .tgz file:
tar -xvf smart-traffic-intersection-agent-1.0.0-rc2-helm.tgz
Navigate to the extracted directory:
cd smart-traffic-intersection-agent
Step 3: Configure the values.yaml File#
Edit the values.yaml file to set the necessary environment variables. Refer to the values reference table below.
Option 2: Install from Source#
Step 1: Clone the Repository#
Clone the repository containing the Helm chart:
# Clone the release branch
git clone https://github.com/open-edge-platform/edge-ai-suites.git -b release-2026.0.0
Step 2: Change to the Chart Directory#
Navigate to the chart directory:
cd edge-ai-suites/metro-ai-suite/smart-traffic-intersection-agent/chart
Step 3: Configure the values.yaml File#
Edit the values.yaml file located in the chart directory to set the necessary environment variables. Refer to the values reference table below.
Common Steps After Configuration#
Step 4: Deploy Smart Intersection#
The Smart Traffic Intersection Agent depends on a running Smart Intersection deployment, which includes SceneScape. It provides the MQTT broker, camera pipelines, and scene analytics that the Traffic Agent consumes.
Follow the Smart Intersection Helm Deployment Guide to deploy it. Once all Smart Intersection pods are running and the MQTT broker is reachable, proceed to the next step.
Step 5: Configure GPU Support (Optional)#
By default, the chart deploys VLM inference on an Intel GPU. To change or verify the GPU configuration, edit the following values in values.yaml:
| Value | Description | Default |
|---|---|---|
| `vlmServing.gpu.enabled` | Enable Intel GPU for VLM inference. When `false`, inference runs on CPU. | `true` |
| `vlmServing.gpu.resourceName` | Kubernetes GPU resource name exposed by the Intel device plugin. Use `gpu.intel.com/i915` for integrated/Arc GPUs or `gpu.intel.com/xe` for Data Center GPU Flex/Max. | `gpu.intel.com/i915` |
| `vlmServing.gpu.resourceLimit` | Number of GPU devices to request | `1` |
| `vlmServing.gpu.renderGroupIds` | List of render group GIDs for `/dev/dri` access | `[44, 109, 992]` |
| `vlmServing.gpu.nodeSelector` | Pin the VLM pod to nodes with GPUs (e.g., `intel.feature.node.kubernetes.io/gpu: "true"`) | |
Identify your cluster’s GPU resource key by running:
kubectl describe node <gpu-node> | grep gpu.intel.com
To deploy on CPU instead, set:
helm install stia . -n <your-namespace> --create-namespace \
--set vlmServing.gpu.enabled=false
Note: The `OV_CONFIG` environment variable is automatically set based on the device. When GPU is enabled, CPU-only options like `INFERENCE_NUM_THREADS` are excluded to avoid runtime errors.
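The device-based selection described in the note can be sketched in shell. This is illustrative only, not the chart's actual template logic, and the JSON values stand in for `vlmServing.env.ovConfigCpu` / `ovConfigGpu`; the `CACHE_DIR` path is an assumed placeholder.

```shell
# Sketch: choose an OV_CONFIG JSON based on whether GPU inference is enabled.
select_ov_config() {
  gpu_enabled="$1"
  if [ "$gpu_enabled" = "true" ]; then
    # GPU config: contains no CPU-only options such as INFERENCE_NUM_THREADS
    echo '{"CACHE_DIR": "/models/ov_cache"}'
  else
    # CPU config: CPU-only options are safe here
    echo '{"INFERENCE_NUM_THREADS": 8, "PERFORMANCE_HINT": "LATENCY"}'
  fi
}

select_ov_config true
select_ov_config false
```

The point of the split is that passing `INFERENCE_NUM_THREADS` to a GPU device triggers the "Option not found" error described in Troubleshooting.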
Step 6: Deploy the Helm Chart#
Deploy the Smart Traffic Intersection Agent Helm chart:
helm install stia . -n <your-namespace> --create-namespace
Note: The VLM OpenVINO Serving pod will download and convert the model on first startup. This may take several minutes depending on network speed and model size. To avoid re-downloading the model on every install cycle, set `vlmServing.persistence.keepOnUninstall` to `true` (the default). This tells Helm to retain the model cache PVC on uninstall.
Step 7: Verify the Deployment#
Check the status of the deployed resources to ensure everything is running correctly:
kubectl get pods -n <your-namespace>
kubectl get services -n <your-namespace>
You should see two pods:
| Pod | Description |
|---|---|
| `stia-traffic-agent` | The traffic intersection agent (backend + Gradio UI) |
| VLM OpenVINO Serving | The VLM inference server |
Wait until both pods show Running and READY 1/1:
kubectl wait --for=condition=ready pod -l app.kubernetes.io/instance=stia -n <your-namespace> --timeout=600s
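As a sketch of what the wait is checking, the READY and STATUS columns can be counted directly from `kubectl get pods` output. The pod names below are illustrative; real names depend on the release name.

```shell
# Count pods that report READY 1/1 and STATUS Running.
# Real usage: kubectl get pods -n <your-namespace> | count_ready
count_ready() {
  awk 'NR > 1 && $2 == "1/1" && $3 == "Running" { n++ } END { print n+0 }'
}

# Assumed sample of `kubectl get pods` output
sample_pods='NAME                         READY   STATUS    RESTARTS   AGE
stia-traffic-agent-abc123    1/1     Running   0          5m
stia-vlm-serving-def456      1/1     Running   0          5m'

printf '%s\n' "$sample_pods" | count_ready   # prints 2
```

For this chart the deployment is healthy when the count reaches 2.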
Step 8: Access the Application#
Using NodePort (default)#
The chart deploys services as NodePort by default. Retrieve the allocated ports and a node IP:
# Get the NodePort values
kubectl get svc stia-traffic-agent -n <your-namespace>
# Get the node IP
kubectl get nodes -o wide
# Use the INTERNAL-IP of any node
Then open your browser at:
http://<node-ip>:<backend-node-port> # Backend API (default NodePort: 30881)
http://<node-ip>:<ui-node-port> # Gradio UI (default NodePort: 30860)
Note: If you are behind a corporate proxy, make sure the node IPs are included in your `no_proxy` / browser proxy exceptions.
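Putting the two lookups together, the URLs can be composed like this. The node IP is a placeholder, and the ports are the chart's default NodePorts; substitute the values returned by the `kubectl` commands above.

```shell
# Placeholder values; obtain the real ones with:
#   kubectl get nodes -o wide                               # INTERNAL-IP column
#   kubectl get svc stia-traffic-agent -n <your-namespace>  # NodePort values
NODE_IP="192.168.1.10"    # placeholder node INTERNAL-IP
BACKEND_NODEPORT=30881    # chart default for the backend API
UI_NODEPORT=30860         # chart default for the Gradio UI

echo "Backend API: http://${NODE_IP}:${BACKEND_NODEPORT}"
echo "Gradio UI:   http://${NODE_IP}:${UI_NODEPORT}"
```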
Using Port-Forward (ClusterIP)#
If you changed the service type to ClusterIP in values.yaml:
# Traffic Agent Backend API
kubectl port-forward svc/stia-traffic-agent 8081:8081 -n <your-namespace> &
# Traffic Agent Gradio UI
kubectl port-forward svc/stia-traffic-agent 7860:7860 -n <your-namespace> &
Then open your browser at:
Backend API: http://127.0.0.1:8081/docs
Gradio UI: http://127.0.0.1:7860
Step 9: Uninstall the Helm Chart#
To uninstall the deployed Helm chart:
helm uninstall stia -n <your-namespace>
Note: When `vlmServing.persistence.keepOnUninstall` is `true` (the default), the VLM model cache PVC is retained after uninstall to avoid re-downloading the model. This is recommended during development and testing. To fully clean up all PVCs:
kubectl get pvc -n <your-namespace>
kubectl delete pvc <pvc-name> -n <your-namespace>
To have Helm delete the PVC automatically on uninstall, set `vlmServing.persistence.keepOnUninstall=false` before deploying.
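The cleanup steps above can be scripted by parsing the PVC listing. The PVC name in the sample is illustrative; the real name depends on the chart's templates and the release name.

```shell
# List PVC names from `kubectl get pvc` output so the retained model-cache
# claim can be deleted by name.
# Real usage: kubectl get pvc -n <your-namespace> | pvc_names
pvc_names() {
  awk 'NR > 1 { print $1 }'
}

# Assumed sample of `kubectl get pvc` output
sample_pvcs='NAME                  STATUS   VOLUME     CAPACITY   ACCESS MODES
stia-vlm-model-cache  Bound    pvc-1234   20Gi       RWO'

printf '%s\n' "$sample_pvcs" | pvc_names
# then: kubectl delete pvc <pvc-name> -n <your-namespace>
```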
values.yaml Reference#
Global Settings#
| Key | Description | Default |
|---|---|---|
| `global.proxy.httpProxy` | HTTP proxy URL | |
| `global.proxy.httpsProxy` | HTTPS proxy URL | |
| `global.proxy.noProxy` | Comma-separated no-proxy list | |
Traffic Agent Settings#
| Key | Description | Default |
|---|---|---|
| | Traffic agent container image repository | |
| | Image tag | |
| | Kubernetes service type (`NodePort` or `ClusterIP`) | `NodePort` |
| | Backend API port | `8081` |
| | NodePort for the backend API (only used when type is `NodePort`) | `30881` |
| | Gradio UI port | `7860` |
| | NodePort for the Gradio UI (only used when type is `NodePort`) | `30860` |
| `trafficAgent.intersection.name` | Unique intersection identifier | |
| `trafficAgent.intersection.latitude` | Intersection latitude | |
| `trafficAgent.intersection.longitude` | Intersection longitude | |
| | Application log level | |
| | Dashboard refresh interval (seconds) | |
| | Use mock weather data | |
| | Timeout for VLM inference requests (seconds) | |
| `trafficAgent.mqtt.host` | MQTT broker hostname (SceneScape K8s service name) | |
| | MQTT broker port | |
| | Object count for high-density classification | |
| | Object count for moderate-density classification | |
| | Traffic analysis buffer window | |
| | Enable persistent storage for agent data | |
| | PVC size for agent data | |
| | Storage class (empty = cluster default) | |
|
VLM OpenVINO Serving Settings#
| Key | Description | Default |
|---|---|---|
| | VLM serving container image repository | |
| | Image tag | |
| | Kubernetes service type (`NodePort` or `ClusterIP`) | `NodePort` |
| | VLM HTTP API port | |
| | NodePort for the VLM API (only used when type is `NodePort`) | |
| | Hugging Face model identifier | |
| | Model weight format | |
| | OpenVINO inference device when GPU is disabled (e.g., `CPU`) | |
| | Max tokens per completion | |
| | Number of serving workers | |
| | VLM serving log level | |
| | OpenVINO runtime log level | |
| | Access log file path | |
| | Random seed for reproducible inference | |
| `vlmServing.env.ovConfigCpu` | OpenVINO config JSON for CPU mode (supports CPU-only options such as `INFERENCE_NUM_THREADS`) | |
| `vlmServing.env.ovConfigGpu` | OpenVINO config JSON for GPU mode (includes the GPU model cache) | |
| `huggingfaceToken` | Hugging Face API token (stored as a Secret) | |
| `vlmServing.gpu.enabled` | Enable Intel GPU for VLM inference | `true` |
| `vlmServing.gpu.resourceName` | Kubernetes GPU resource name exposed by the Intel device plugin (`gpu.intel.com/i915` or `gpu.intel.com/xe`) | `gpu.intel.com/i915` |
| `vlmServing.gpu.resourceLimit` | Number of GPU devices to request | `1` |
| `vlmServing.gpu.renderGroupIds` | List of GIDs for the render group (`/dev/dri` access) | `[44, 109, 992]` |
| `vlmServing.gpu.nodeSelector` | Pin the VLM pod to GPU nodes (e.g., `intel.feature.node.kubernetes.io/gpu: "true"`) | |
| | Enable persistent storage for the model cache | |
| | PVC size for the model cache | `20Gi` |
| | Storage class (empty = cluster default) | |
| `vlmServing.persistence.keepOnUninstall` | Retain the model cache PVC on `helm uninstall` | `true` |
TLS / Secrets Settings#
| Key | Description | Default |
|---|---|---|
| `tls.caCert` | PEM-encoded CA certificate for the MQTT broker (base64-encoded in the Secret) | |
| `tls.caCertSecretName` | Name of an existing Secret containing the CA cert (overrides `tls.caCert`) | |
| | Key name inside the external Secret (required when `tls.caCertSecretName` is set) | |
|
Example: Minimal Deployment#
# values-override.yaml
global:
proxy:
httpProxy: "http://proxy.example.com:8080"
httpsProxy: "http://proxy.example.com:8080"
noProxy: "localhost,127.0.0.1,10.0.0.0/8,.example.com"
trafficAgent:
intersection:
name: "intersection_main_st"
latitude: "37.7749"
longitude: "-122.4194"
mqtt:
host: "smart-intersection-broker"
tls:
caCert: |
-----BEGIN CERTIFICATE-----
MIIDxTCCA...
-----END CERTIFICATE-----
helm install stia . -n traffic -f values-override.yaml --create-namespace
Example: GPU Deployment#
To deploy VLM inference on an Intel GPU (the default), ensure `vlmServing.gpu.enabled` is `true` and the GPU resource name matches your cluster:
# values-gpu-override.yaml
vlmServing:
gpu:
enabled: true
# Use "gpu.intel.com/i915" for integrated / Arc A-series
# Use "gpu.intel.com/xe" for Data Center GPU Flex / Max
resourceName: "gpu.intel.com/i915"
resourceLimit: 1
# All common render group GIDs included by default — works across distros
renderGroupIds:
- 44
- 109
- 992
# Optional: pin to GPU nodes
nodeSelector:
intel.feature.node.kubernetes.io/gpu: "true"
persistence:
keepOnUninstall: true
helm install stia . -n traffic -f values-override.yaml -f values-gpu-override.yaml --create-namespace
Example: CPU-Only Deployment#
To run VLM inference on CPU:
helm install stia . -n traffic -f values-override.yaml \
--set vlmServing.gpu.enabled=false \
--create-namespace
Verification#
Ensure that all pods are running and the services are accessible.
Access the Gradio UI and verify that it is showing the traffic intersection dashboard.
Check the backend API at `/docs` for the interactive Swagger documentation.
Verify that the traffic agent is receiving MQTT messages from SceneScape by checking the logs:
kubectl logs -l app=stia-traffic-agent -n <your-namespace> -f
Troubleshooting#
If you encounter any issues during the deployment process, check the Kubernetes logs for errors:
kubectl logs <pod-name> -n <your-namespace>
VLM pod stuck in CrashLoopBackOff: The model download may have failed. Check logs and verify proxy settings (`global.proxy.httpProxy` / `global.proxy.httpsProxy`) and `huggingfaceToken` if the model requires authentication.
VLM model download stuck or not progressing: Verify that proxy environment variables are correctly set inside the pod. A common cause is a mismatch between `values.yaml` key names and the template references (e.g., `http_proxy` vs `httpProxy`). Check with:
kubectl exec <vlm-pod-name> -n <your-namespace> -- env | grep -i proxy
`Option not found: INFERENCE_NUM_THREADS` error on GPU: This occurs when the `OV_CONFIG` contains CPU-only options while running on GPU. Ensure `vlmServing.env.ovConfigGpu` does not include `INFERENCE_NUM_THREADS`. The chart automatically selects the correct config (`ovConfigCpu` or `ovConfigGpu`) based on `vlmServing.gpu.enabled`.
GPU not detected / VLM pod Pending: Verify the Intel GPU device plugin is installed and the GPU resource is available:
kubectl describe node <gpu-node> | grep gpu.intel.com
If no GPU resource is listed, install the Intel GPU device plugin for Kubernetes. Also verify that `vlmServing.gpu.resourceName` matches the resource key reported by the device plugin (`gpu.intel.com/i915` for integrated/Arc, `gpu.intel.com/xe` for Data Center GPUs).
GPU permission denied (`/dev/dri` access): The chart includes all common render group GIDs (44, 109, 992) by default. If your distro uses a different GID, find it with `getent group render` on the node and override:
helm install stia . --set-json 'vlmServing.gpu.renderGroupIds=[<your-gid>]'
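The GID lookup can be sketched as a one-line parse of the `getent` output. The group entry below is an assumed sample; run `getent group render` on the actual GPU node.

```shell
# Extract the GID (third colon-separated field) from a group database entry.
# Real usage: getent group render | render_gid
render_gid() {
  cut -d: -f3
}

# Assumed sample entry; the GID varies by distro (44, 109, 992, ...)
sample_entry='render:x:109:someuser'
printf '%s\n' "$sample_entry" | render_gid   # prints 109
```

Pass the printed value to `vlmServing.gpu.renderGroupIds` via the `--set-json` override shown above.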
Traffic agent cannot connect to MQTT broker: Verify that the SceneScape deployment is reachable from the cluster, the `trafficAgent.mqtt.host` value is correct, and the CA certificate is provided via `tls.caCert` or `tls.caCertSecretName`.
PVC not cleaned up after uninstall: When `vlmServing.persistence.keepOnUninstall` is `true` (the default), the model cache PVC is intentionally retained. To reclaim storage, delete it manually:
# List the PVCs present in the given namespace
kubectl get pvc -n <your-namespace>
# Delete the required PVC from the namespace
kubectl delete pvc <pvc-name> -n <your-namespace>