How to deploy with Helm#

This guide provides step-by-step instructions for deploying the ChatQ&A Sample Application using Helm.

Prerequisites#

Before you begin, ensure that you have the following prerequisites:

  • A running Kubernetes cluster and kubectl configured to access it.

  • Helm installed on your client machine.

Steps to deploy with Helm#

Follow the steps below to deploy ChatQ&A using Helm. You can either install the chart from source code or pull it from Docker Hub.

Steps 1 to 3 vary depending on whether you pull the prebuilt chart from Docker Hub or build it from source.

Option 1: Install from Docker Hub#

Step 1: Pull the Specific Chart#

Use the following command to pull the Helm chart from Docker Hub:

helm pull oci://registry-1.docker.io/intel/chat-question-and-answer --version <version-no>

Refer to the release notes for details on the latest version number to use for the sample application.

Step 2: Extract the .tgz File#

After pulling the chart, extract the .tgz file:

tar -xvf chat-question-and-answer-<version-no>.tgz

This will create a directory named chat-question-and-answer containing the chart files. Navigate to the extracted directory.

cd chat-question-and-answer
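
You can sanity-check the extraction by listing the chart contents; expect the standard Helm layout (Chart.yaml, a templates/ directory) alongside the values*.yaml files described in the next step:

ls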

Step 3: Configure the values.yaml File#

Choose the appropriate values*.yaml file based on the model server you want to use:

  • For OVMS: Use values_ovms.yaml.

  • For vLLM: Use values_vllm.yaml.

  • For TGI: Use values_tgi.yaml.

Edit the selected values*.yaml file to set the necessary environment variables. Ensure you set global.huggingface.apiToken and the proxy settings as required. The available keys are listed in the table below, and an example excerpt follows the table.

| Key | Description | Example Value |
| --- | --- | --- |
| global.huggingface.apiToken | Your Hugging Face API token | <your-huggingface-token> |
| global.POSTGRES_USER | User name for the PGVector database | <your-postgres-user-id> |
| global.POSTGRES_PASSWORD | Password for the PGVector database | <your-postgres-password> |
| global.MINIO_ROOT_USER | MinIO server user name (at least 3 characters) | <your-minio-user-id> |
| global.MINIO_ROOT_PASSWORD | Password to connect to the MinIO server (at least 8 characters) | <your-minio-password> |
| global.OTLP_ENDPOINT | OTLP endpoint | |
| global.OTLP_ENDPOINT_TRACE | OTLP endpoint for traces | |
| global.teiEmbeddingService.enabled | Flag to enable the TEI embedding model server | false |
| global.ovmsEmbeddingService.enabled | Flag to enable the OVMS embedding model server | true |
| global.UI_NODEPORT | Static port for the UI (in the 30000–32767 range) | |
| Chatqna.name | Name of the ChatQnA application | chatqna |
| Chatqna.image.repository | Image repository URL | intel/chatqna |
| Chatqna.image.tag | Image tag | 1.1.2 |
| Chatqna.env.ENDPOINT_URL | Connection endpoint to the model server | |
| Chatqna.env.INDEX_NAME | Index name for PGVector | intel-rag |
| Chatqna.env.FETCH_K | Number of top-K results to fetch | 10 |
| Chatqna.global.EMBEDDING_MODEL_NAME | Embedding model name | BAAI/bge-small-en-v1.5 |
| Chatqna.env.PG_CONNECTION_STRING | PGVector connection string | postgresql+psycopg:// |
| Chatqna.env.LLM_MODEL | Model to be used with TGI/vLLM/OVMS | Intel/neural-chat-7b-v3-3 |
| Chatqna.env.RERANKER_MODEL | Reranker model to be used with TEI | BAAI/bge-reranker-base |
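
For reference, here is a minimal sketch of an edited global section, assuming the proxy settings use flat http_proxy/https_proxy/no_proxy keys; verify the exact key names against the chart's own values file:

global:
  huggingface:
    apiToken: <your-huggingface-token>
  # Proxy settings -- assumption: key names may differ in the actual chart.
  http_proxy: <your-http-proxy>
  https_proxy: <your-https-proxy>
  no_proxy: localhost,127.0.0.1
  POSTGRES_USER: <your-postgres-user-id>
  POSTGRES_PASSWORD: <your-postgres-password>
  MINIO_ROOT_USER: <your-minio-user-id>        # at least 3 characters
  MINIO_ROOT_PASSWORD: <your-minio-password>   # at least 8 characters
  UI_NODEPORT: 30080                           # any free port in the 30000-32767 range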

Option 2: Install from Source#

Step 1: Clone the Repository#

Clone the repository containing the Helm chart:

git clone <repository-url>

Step 2: Change to the Chart Directory#

Navigate to the chart directory:

cd <repository-name>/sample-applications/chat-question-and-answer/chart

Step 3: Configure the values*.yaml File#

Edit the values*.yaml file located in the chart directory to set the necessary environment variables. Refer to the table in Option 1, Step 3 for the list of keys and example values.

Step 4: Build Helm Dependencies#

From the chart directory, build the Helm dependencies using the following command:

helm dependency build
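
To confirm that the dependencies resolved correctly, you can list them and check their status:

helm dependency list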

Common Steps after configuration#

Step 5: Deploy the Helm Chart#
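
If the target namespace does not already exist, create it first:

kubectl create namespace <your-namespace>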

Deploy the OVMS Helm chart:

helm install chatqna . -f values_ovms.yaml -n <your-namespace>

Note: When deploying with OVMS, the OVMS service takes longer to start than the other model servers because of model conversion time.

Deploy the vLLM Helm chart:

helm install chatqna . -f values_vllm.yaml -n <your-namespace>

Deploy the TGI Helm chart:

helm install chatqna . -f values_tgi.yaml -n <your-namespace>

Step 6: Verify the Deployment#

Check the status of the deployed resources to ensure everything is running correctly:

kubectl get pods -n <your-namespace>
kubectl get services -n <your-namespace>
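
Because the OVMS service can take a while to start (see the note in Step 5), you may prefer to block until all pods report Ready; the 600-second timeout below is an arbitrary example:

kubectl wait --for=condition=Ready pods --all -n <your-namespace> --timeout=600s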

Step 7: Retrieve the Service Endpoint (NodePort and Node IP)#

Open the UI in a browser at http://<node-ip>:<ui-node-port>, where <ui-node-port> is the value set for global.UI_NODEPORT.
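
To find these values, query the cluster: the node address appears in the INTERNAL-IP (or EXTERNAL-IP) column, and the UI NodePort is the port in the 30000-32767 range shown for the UI service:

kubectl get nodes -o wide
kubectl get services -n <your-namespace>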

Step 8: Update Helm Dependencies#

If any changes are made to the subcharts, update the Helm dependencies using the following command:

helm dependency update
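
To roll the updated dependencies into a running release, upgrade it; a sketch assuming the OVMS values file and the release name from Step 5:

helm upgrade chatqna . -f values_ovms.yaml -n <your-namespace>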

Step 9: Uninstall the Helm Chart#

To uninstall the deployed Helm chart, use the following command:

helm uninstall <name> -n <your-namespace>
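
For example, with the release name used in Step 5:

helm uninstall chatqna -n <your-namespace>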

Verification#

  • Ensure that all pods are running and the services are accessible.

  • Access the application dashboard and verify that it is functioning as expected.

Troubleshooting#

  • If you encounter any issues during the deployment process, check the Kubernetes logs for errors:

    kubectl logs <pod-name> -n <your-namespace>
    
  • The PVC created during Helm chart deployment remains until explicitly deleted. Use the following command to delete it:

    kubectl delete pvc <pvc-name> -n <namespace>
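
  • If a pod stays in Pending or CrashLoopBackOff, describe it to see recent scheduling and image-pull events:

    kubectl describe pod <pod-name> -n <namespace>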