# Build and customize options

This guide covers building the microservice from source and the customization options available when building or deploying it.

## Manually set up environment variables

The following environment variables are required for setting up the service. It is recommended to use the `run.sh` runner script to set up the variables. Refer to this section only if any of the variables need to be changed.

### Project name

- **PROJECT_NAME:** Provides a common docker compose project name and a common container prefix for all services involved.

### MinIO related variables

- **MINIO_HOST:** Host name of the MinIO server. The Data Store service uses it to communicate with the MinIO server from inside its container.
- **MINIO_API_PORT:** Port on which the MinIO server's API service runs inside the container.
- **MINIO_API_HOST_PORT:** Port on which the MinIO server's API service is accessed outside the container, i.e. on the host.
- **MINIO_CONSOLE_PORT:** Port on which the MinIO UI Console runs inside the container.
- **MINIO_CONSOLE_HOST_PORT:** Port on which the MinIO UI Console is accessed on the host machine.
- **MINIO_MOUNT_PATH:** Mount point on the host machine for MinIO server object storage. This persists the objects stored on the MinIO server.
- **MINIO_ROOT_USER:** Username for the MinIO server. It is required when accessing the MinIO UI Console. To use a value other than the default, set the `MINIO_USER` variable in the shell.
- **MINIO_ROOT_PASSWORD:** Password for the MinIO server. It is required when accessing the MinIO UI Console. To use a value other than the default, set the `MINIO_PASSWD` variable in the shell.

### Embedding service related variables

Currently, TEI (Text Embeddings Inference) is used to host the embedding model. The following variables are required as a result.

- **TEI_HOST:** Host IP address or service name of the TEI embedding service, used by other services to connect to it.
- **TEI_HOST_PORT:** Port on the host machine on which the TEI embedding service is accessed outside the container.
- **EMBEDDING_ENDPOINT_URL:** API endpoint URL at which the TEI embedding service serves the model. Other services use this endpoint to get results from the embedding model server.
- **TEI_EMBEDDING_MODEL_NAME:** Name of the model served by the TEI embedding service.

### PGVector DB related variables

- **PGVECTOR_HOST:** Host IP address or service name of the PGVector DB service, used by other services to connect to it.
- **PGVECTOR_USER:** User name for PGVector DB. To use a value other than the default, set `PGDB_USER` in the shell.
- **PGVECTOR_PASSWORD:** Password for PGVector DB. To use a value other than the default, set `PGDB_PASSWD` in the shell.
- **PGVECTOR_DBNAME:** Name of the PGVector DB database, which contains the tables that store embeddings. To use a value other than the default, set `PGDB_NAME` in the shell.
- **INDEX_NAME:** Name of the index used for creating embeddings. It is referenced in several PGVector DB queries and defines a particular context for retrieval. To use a value other than the default, set `PGDB_INDEX` in the shell.
- **PG_CONNECTION_STRING:** Connection string derived from the PGVector DB values set above. Other services use it to connect to the database. Caution: override it only if you understand the consequences.

### Secrets and token variables

- **HUGGINGFACEHUB_API_TOKEN:** Token required for running Hugging Face based services and models. **It is mandatory to set it before running the `run.sh` script.** To set it, export the `HUGGINGFACEHUB_API_TOKEN` variable from the shell.
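
The shell overrides described in the sections above can be sketched as follows. This is a minimal illustration, not part of `run.sh`: every value is a hypothetical placeholder, and the check loop at the end is an added convenience that only reports which override variables are present in the environment. Invoke the runner script afterwards, in the same shell, as you normally would.

```bash
# All values below are hypothetical placeholders; replace them with your own.

# Mandatory: token for Hugging Face based services and models.
export HUGGINGFACEHUB_API_TOKEN="hf_example_token"

# Optional: MinIO credential overrides.
export MINIO_USER="minio-admin"
export MINIO_PASSWD="change-me-minio"

# Optional: PGVector DB overrides.
export PGDB_USER="pguser"
export PGDB_PASSWD="change-me-pg"
export PGDB_NAME="embeddings"
export PGDB_INDEX="my_index"

# Illustrative check (an assumption, not run.sh behavior):
# warn early if the mandatory token is empty.
if [ -z "${HUGGINGFACEHUB_API_TOKEN:-}" ]; then
  echo "HUGGINGFACEHUB_API_TOKEN must be set" >&2
fi

# Report which override variables are exported, without printing their values.
for name in MINIO_USER MINIO_PASSWD PGDB_USER PGDB_PASSWD PGDB_NAME PGDB_INDEX; do
  if printenv "$name" >/dev/null; then
    echo "${name}: set"
  else
    echo "${name}: not set (default will be used)"
  fi
done
```

Exporting the variables, rather than only assigning them, makes them visible to any child processes started later from the same shell.
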

## Build from source

If you want to build the dataprep image locally instead of using pre-built images:

```bash
# Build with default tag (uses current date in YYYYMMDD format)
source ./run.sh --build dataprep

# Build with custom tag
source ./run.sh --build dataprep my-registry/my-dataprep:v1.0
```

**Note**: All built images are automatically labeled for easy management and cleanup.

## Common Customizations

Customization options are currently provided in the context of the sample application. Refer to [Chat QnA sample application](https://docs.openedgeplatform.intel.com/2026.0/edge-ai-libraries/chat-question-and-answer/index.html) for details like Helm and customization options.