Run With Docker Compose#

docker compose up starts audio-analyzer, text-to-speech, rag-service, kiosk-core (REST API), and kiosk-ui (Gradio interface) as containers, using the prebuilt images published on Docker Hub.

Microphone audio is captured by the browser and uploaded to kiosk-core as a WAV file. No host audio device is passed into the containers.

To rebuild the images from source instead of pulling, see Build from Source. To run kiosk-core and the UI directly on the host, see Run On the Host.

Clone#

git clone https://github.com/intel-retail/voice-enabled-interactions.git
cd voice-enabled-interactions/smart-kiosk-assistant

Pull And Start#

From smart-kiosk-assistant/:

docker compose pull
docker compose up -d

docker compose pull fetches all five images from Docker Hub:

  • intel/audio-analyzer:${RELEASE_TAG}

  • intel/text-to-speech:${RELEASE_TAG}

  • intel/rag-service:${RELEASE_TAG}

  • intel/kiosk-core:${RELEASE_TAG}

  • intel/kiosk-ui:${RELEASE_TAG}

REGISTRY and RELEASE_TAG are read from .env (defaults: REGISTRY=intel, committed RELEASE_TAG pins the current release).

This starts five containers:

Container

Port

Purpose

audio-analyzer

8010

Speech-to-text

text-to-speech

8011

Speech synthesis

rag-service

8020

Knowledge-base retrieval

kiosk-core

8012

FastAPI session API

kiosk-ui

7860

Gradio voice UI

Containers run as non-root; every image is built with UID/GID 1000:1000 and the named volumes are initialized with that ownership, so no host UID/GID configuration is required.

Verify#

docker compose ps
curl --noproxy '*' http://127.0.0.1:8012/health   # {"status":"ok"}

Open http://127.0.0.1:7860 in a browser, click the microphone, and speak your question.

Logs#

docker compose logs -f kiosk-core
docker compose logs -f kiosk-ui

Restart / Stop#

docker compose restart            # after env var change
docker compose pull && docker compose up -d   # after a new release tag
docker compose down               # teardown

Notes#

  • The default Compose wiring connects kiosk-core and kiosk-ui to the internal audio-analyzer, rag-service, and text-to-speech containers. Override these URLs only when this stack must call services outside the local Compose network.

  • See Configuration for environment variables, model selection, and inference device, and API Reference for endpoint details.