# How to deploy with Helm

This guide provides step-by-step instructions for deploying the ChatQ&A Sample Application using Helm.

## Prerequisites

Before you begin, ensure that you have the following prerequisites:

- A Kubernetes cluster set up and running.
- The cluster must support **dynamic provisioning of Persistent Volumes (PV)**. Refer to the [Kubernetes Dynamic Provisioning Guide](https://kubernetes.io/docs/concepts/storage/dynamic-provisioning/) for more details.
- `kubectl` installed on your system ([Installation Guide](https://kubernetes.io/docs/tasks/tools/install-kubectl/)), with access to the Kubernetes cluster.
- Helm installed on your system ([Installation Guide](https://helm.sh/docs/intro/install/)).

## Steps to deploy with Helm

Follow the steps below to deploy ChatQ&A using Helm. You can install from source code or pull the chart from Docker Hub.

**_Steps 1 to 3 vary depending on whether you prefer to pull the chart from Docker Hub or build it from source._**

### Option 1: Install from Docker Hub

#### Step 1: Pull the Specific Chart

Use the following command to pull the Helm chart from Docker Hub:

```bash
helm pull oci://registry-1.docker.io/intel/chat-question-and-answer --version <version-no>
```

Refer to the release notes for details on the latest version number to use for the sample application.

#### Step 2: Extract the `.tgz` File

After pulling the chart, extract the `.tgz` file:

```bash
tar -xvf chat-question-and-answer-<version-no>.tgz
```

This will create a directory named `chat-question-and-answer` containing the chart files. Navigate to the extracted directory:

```bash
cd chat-question-and-answer
```

#### Step 3: Configure the `values.yaml` File

Choose the appropriate `values*.yaml` file based on the model server you want to use:

- **For OVMS**: Use `values_ovms.yaml`.
- **For vLLM**: Use `values_vllm.yaml`.
- **For TGI**: Use `values_tgi.yaml`.

Edit the selected `values*.yaml` file to set the necessary environment variables. Ensure you set the `huggingface.apiToken` and proxy settings as required.
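For illustration, a minimal set of overrides might look like the sketch below. The file name `my-overrides.yaml` and all bracketed values are placeholders, and the key nesting is inferred from the table that follows; verify it against the chart's own values file.

```bash
# Illustrative only: write a small override file with placeholder values.
# Key paths mirror the table below; check them against the chart's values file.
cat > my-overrides.yaml <<'EOF'
global:
  huggingface:
    apiToken: <your-huggingface-token>
  POSTGRES_USER: <your-db-user>
  POSTGRES_PASSWORD: <your-db-password>
  MINIO_ROOT_USER: <minio-user>          # at least 3 characters
  MINIO_ROOT_PASSWORD: <minio-password>  # at least 8 characters
EOF
```

Because Helm merges multiple `-f` flags from left to right, an override file like this could be passed after the model-server values file in Step 5 (for example, `-f values_ovms.yaml -f my-overrides.yaml`) instead of editing the values file in place. The table below lists the configurable keys and example values: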
| Key | Description | Example Value |
| --- | ----------- | ------------- |
| `global.huggingface.apiToken` | Your Hugging Face API token | `<your-huggingface-token>` |
| `global.POSTGRES_USER` | User name for the PG Vector DB | `<your-db-user>` |
| `global.POSTGRES_PASSWORD` | Password for the PG Vector DB | `<your-db-password>` |
| `global.MINIO_ROOT_USER` | MinIO server user name | `<minio-user>` (user name must be at least 3 characters) |
| `global.MINIO_ROOT_PASSWORD` | Password to connect to the MinIO server | `<minio-password>` (password must be at least 8 characters) |
| `global.OTLP_ENDPOINT` | OTLP endpoint | |
| `global.OTLP_ENDPOINT_TRACE` | OTLP endpoint for trace | |
| `global.teiEmbeddingService.enabled` | Flag to enable the TEI embedding model server | `false` |
| `global.ovmsEmbeddingService.enabled` | Flag to enable the OVMS embedding model server | `true` |
| `global.UI_NODEPORT` | Sets the static port (in the 30000–32767 range) | |
| `Chatqna.name` | Name of the ChatQnA application | `chatqna` |
| `Chatqna.image.repository` | Image repository URL | `intel/chatqna` |
| `Chatqna.image.tag` | Latest image tag | `1.1.2` |
| `Chatqna.env.ENDPOINT_URL` | Connection endpoint to the model server | |
| `Chatqna.env.INDEX_NAME` | Index name for pgVector | `intel-rag` |
| `Chatqna.env.FETCH_K` | Number of top-K results to fetch | `10` |
| `Chatqna.global.EMBEDDING_MODEL_NAME` | Embedding model name | `BAAI/bge-small-en-v1.5` |
| `Chatqna.env.PG_CONNECTION_STRING` | pgvector connection string | `postgresql+psycopg://` |
| `Chatqna.env.LLM_MODEL` | Model to be used with TGI/vLLM/OVMS | `Intel/neural-chat-7b-v3-3` |
| `Chatqna.env.RERANKER_MODEL` | Model to be used with TEI | `BAAI/bge-reranker-base` |

### Option 2: Install from Source

#### Step 1: Clone the Repository

Clone the repository containing the Helm chart:

```bash
git clone <repository-url>
```

#### Step 2: Change to the Chart Directory

Navigate to the chart directory:

```bash
cd <repository-path>/sample-applications/chat-question-and-answer/chart
```

#### Step 3: Configure the `values*.yaml` File

Edit the `values*.yaml` file located in the chart directory to set the necessary environment variables. Refer to the table in **Option 1, Step 3** for the list of keys and example values.

#### Step 4: Build Helm Dependencies

Navigate to the chart directory and build the Helm dependencies using the following command:

```bash
helm dependency build
```

## Common Steps after configuration

### Step 5: Deploy the Helm Chart

Deploy the OVMS Helm chart:

```bash
helm install chatqna . -f values_ovms.yaml -n <namespace>
```

**Note:** When deploying OVMS, the OVMS service takes longer to become ready than the other model servers because of model conversion time.

Deploy the vLLM Helm chart:

```bash
helm install chatqna . -f values_vllm.yaml -n <namespace>
```

Deploy the TGI Helm chart:

```bash
helm install chatqna . -f values_tgi.yaml -n <namespace>
```

### Step 6: Verify the Deployment

Check the status of the deployed resources to ensure everything is running correctly:

```bash
kubectl get pods -n <namespace>
kubectl get services -n <namespace>
```

### Step 7: Retrieve the Service Endpoint (NodePort and NodeIP)

Open the UI in a browser at `http://<node-ip>:<node-port>` (one way to look up these values is sketched under **Verification** below).

### Step 8: Update Helm Dependencies

If any changes are made to the subcharts, update the Helm dependencies using the following command:

```bash
helm dependency update
```

### Step 9: Uninstall the Helm Chart

To uninstall the deployed Helm chart, use the following command:

```bash
helm uninstall chatqna -n <namespace>
```

## Verification

- Ensure that all pods are running and the services are accessible.
- Access the application dashboard and verify that it is functioning as expected.
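To find the `<node-ip>` and `<node-port>` values referenced in Step 7, a query along the following lines should work. The namespace and the UI service name here are placeholders; use the service name shown by `kubectl get services` in Step 6.

```bash
# Illustrative only: <namespace> and <ui-service-name> are placeholders.
# Node IP: the INTERNAL-IP column of any cluster node.
kubectl get nodes -o wide

# NodePort: shown in the PORT(S) column as <port>:<node-port>/TCP, or
# printed directly with a JSONPath query against the UI service.
kubectl get svc <ui-service-name> -n <namespace> \
  -o jsonpath='{.spec.ports[0].nodePort}'
```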
## Troubleshooting

- If you encounter any issues during the deployment process, check the Kubernetes logs for errors:

  ```bash
  kubectl logs <pod-name> -n <namespace>
  ```

- The _PVC_ created during Helm chart deployment will remain present until explicitly deleted. Use the command below to delete it:

  ```bash
  kubectl delete pvc <pvc-name> -n <namespace>
  ```

## Related links

- [How to Build from Source](./build-from-source.md)
- [How to Test Performance](./how-to-performance.md)
- [How to Benchmark](./benchmarks.md)