Get Started#
The Audio Analyzer microservice lets developers generate speech transcripts from video files. This section provides step-by-step instructions to:
Set up the microservice using a pre-built Docker image for quick deployment.
Run predefined tasks to explore its functionality.
Learn how to modify basic configurations to suit specific requirements.
Prerequisites#
Before you begin, ensure the following:
System Requirements: Verify that your system meets the minimum requirements.
Docker Installed: Install Docker. Make sure the docker command can be run without sudo. For installation instructions, see Get Docker.
This guide assumes basic familiarity with Docker commands and terminal usage. If you are new to Docker, see Docker Documentation for an introduction.
Configurations#
Environment Variables#
The following environment variables can be configured:
UPLOAD_DIR: Directory for uploaded files (default: /tmp/audio-analyzer/uploads)
OUTPUT_DIR: Directory for transcription output (default: /tmp/audio-analyzer/transcripts)
ENABLED_WHISPER_MODELS: Comma-separated list of Whisper models to enable and download
DEFAULT_WHISPER_MODEL: Default Whisper model to use if a model name is not provided explicitly (default: tiny.en, or the first model in the ENABLED_WHISPER_MODELS list if tiny.en is not available)
GGML_MODEL_DIR: Directory for downloading GGML models (for CPU inference)
OPENVINO_MODEL_DIR: Directory for storing OpenVINO optimized models (for GPU inference)
LANGUAGE: Language code for transcription (default: None, auto-detect)
MAX_FILE_SIZE: Maximum allowed file size in bytes (default: 100MB)
DEFAULT_DEVICE: Device to use for transcription - ‘cpu’, ‘gpu’, or ‘auto’ (default: cpu)
USE_FP16: Use half-precision (FP16) for GPU inference (default: True)
STORAGE_BACKEND: Storage backend to use - ‘minio’ or ‘filesystem’.
MinIO Configuration
MINIO_ENDPOINT: MinIO server endpoint (default: minio:9000 in Docker setup script)
MINIO_ACCESS_KEY: MinIO access key used as login username
MINIO_SECRET_KEY: MinIO secret key used as login password
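The fallback rule described for DEFAULT_WHISPER_MODEL can be sketched in a few lines of Python. This is only an illustration of the documented behavior, not the service's own code: the function name resolve_default_model is hypothetical, and treating an empty model list as an error is an assumption of this sketch.

```python
def resolve_default_model(enabled_csv, explicit_default=None):
    """Pick the default Whisper model following the documented fallback rule."""
    # Split the comma-separated list, ignoring blanks and stray whitespace.
    enabled = [m.strip() for m in enabled_csv.split(",") if m.strip()]
    if explicit_default:
        # An explicitly configured DEFAULT_WHISPER_MODEL always wins.
        return explicit_default
    if not enabled:
        # Assumption of this sketch: an empty list is treated as an error.
        raise ValueError("ENABLED_WHISPER_MODELS is empty")
    # Documented rule: tiny.en if available, else the first enabled model.
    return "tiny.en" if "tiny.en" in enabled else enabled[0]


print(resolve_default_model("small.en,medium.en"))  # prints small.en
```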
Set Up the Storage Backends#
The service supports two storage backends for source video files and transcript output:
MinIO: Store transcripts in a MinIO bucket. (Default when the Docker setup script is used.)
Filesystem: Store transcripts on the local filesystem. The API service then has no external storage dependency. (Default when the application runs in standalone mode.)
The Docker setup script setup_docker.sh uses minio as the default storage backend. You can override the default by setting the STORAGE_BACKEND environment variable:
For MinIO storage:
export STORAGE_BACKEND=minio
For local filesystem storage:
export STORAGE_BACKEND=local
The host setup script setup_host.sh, by contrast, supports only the local filesystem storage backend.
MinIO integration#
The service supports MinIO object storage integration for:
Video Source: Fetch videos from a MinIO bucket instead of direct uploads
Transcript Storage: Store transcription outputs (SRT/TXT) in a MinIO bucket
MinIO Configuration#
To use MinIO integration, you need to configure the following environment variables:
# MinIO server connection
export MINIO_ACCESS_KEY=<your-minio-username>
export MINIO_SECRET_KEY=<your-minio-password>
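Before starting the service with the MinIO backend, it can help to confirm that the credentials are actually exported. The following is a minimal sketch; the helper name missing_minio_settings is hypothetical, and the variable names are the ones listed in this guide.

```python
import os

# Variable names taken from the configuration list in this guide.
REQUIRED_MINIO_VARS = ("MINIO_ACCESS_KEY", "MINIO_SECRET_KEY")


def missing_minio_settings(env):
    """Return the required MinIO variables that are unset or empty in env."""
    return [name for name in REQUIRED_MINIO_VARS if not env.get(name)]


missing = missing_minio_settings(os.environ)
if missing:
    print("Set these variables before starting the service:", ", ".join(missing))
else:
    print("MinIO credentials are set.")
```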
Models Selection#
Refer to supported models for the list of models that can be used for transcription. You can specify which models to enable through the ENABLED_WHISPER_MODELS environment variable.
Quick Start#
You can start and use the application in any of the following ways:
Build the image and run using the Docker script. The Docker script builds images for the application and any required dependencies, then deploys the application. The default storage backend is minio, but it can be changed to the local storage backend.
Use the pre-built image for a standalone setup. The standalone setup has no external dependencies. Default and recommended storage backend: local.
Build and set up on the host using the setup script. Only storage backend available: local.
Build and set up on the host manually. The default storage backend is local, but it can be configured to use the minio storage backend.
Standalone Setup in Docker Container#
Set the registry and tag for the public image to be pulled.
export REGISTRY=intel/
export TAG=1.3.1
Pull public image for Audio Analyzer Microservice:
docker pull ${REGISTRY}audio-analyzer:${TAG:-latest}
Set the required environment variables:
export ENABLED_WHISPER_MODELS=small.en,tiny.en,medium.en
Set and create the directory in filesystem where transcripts will be stored:
export AUDIO_ANALYZER_DIR=~/audio_analyzer_data
mkdir -p "$AUDIO_ANALYZER_DIR"
Stop any existing Audio-Analyzer container (if any):
docker stop audioanalyzer
Run the Audio-Analyzer Microservice:
# Run the Audio Analyzer application container, exposed on a randomly assigned port
docker run --rm -d -P -v $AUDIO_ANALYZER_DIR:/data -e http_proxy -e https_proxy -e ENABLED_WHISPER_MODELS -e DEFAULT_WHISPER_MODEL --name audioanalyzer ${REGISTRY}audio-analyzer:${TAG:-latest}
Access the Audio-Analyzer API in a web browser on the URL given by this command:
host=$(ip route get 1 | awk '{print $7}')
port=$(docker port audioanalyzer 8000 | head -1 | cut -d ':' -f 2)
echo http://${host}:${port}/docs
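The same port lookup can be done from a script. The sketch below parses the text printed by docker port and builds the docs URL; the function names are hypothetical, and the sample string assumes the usual address:port output format.

```python
def parse_docker_port(output):
    """Extract the host port from the first line of `docker port` output."""
    lines = output.strip().splitlines()
    if not lines:
        raise ValueError("no port mapping found")
    # Lines look like "0.0.0.0:32768"; the port follows the last colon.
    return int(lines[0].rsplit(":", 1)[1])


def docs_url(host, port):
    """Build the Swagger UI URL for the given host and port."""
    return f"http://{host}:{port}/docs"


# Sample output for illustration; the real port is assigned by Docker.
sample = "0.0.0.0:32768\n[::]:32768\n"
print(docs_url("localhost", parse_docker_port(sample)))
# prints http://localhost:32768/docs
```

To use it against a running container, pass in the stdout of subprocess.run(["docker", "port", "audioanalyzer", "8000"], capture_output=True, text=True).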
API Usage#
The following examples show how to call the API from the command line with curl. They use the port variable set in the standalone Docker setup above.
Health Check#
curl "http://localhost:$port/api/v1/health"
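Because models may still be downloading when the container starts, a script may want to poll the health endpoint before sending work. This is a rough sketch using only the standard library; the function names are hypothetical, and it assumes that an HTTP 200 response means the service is ready.

```python
import time
import urllib.error
import urllib.request


def http_ok(url, timeout=5.0):
    """Return True if a GET request to url answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or an HTTP error status.
        return False


def wait_until_healthy(url, attempts=30, delay=2.0, probe=http_ok):
    """Call probe(url) until it succeeds or the attempts run out."""
    for attempt in range(attempts):
        if probe(url):
            return True
        if attempt < attempts - 1:
            time.sleep(delay)
    return False
```

A typical call would be wait_until_healthy(f"http://localhost:{port}/api/v1/health"), with port taken from the docker port command.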
Get Available Models#
curl "http://localhost:$port/api/v1/models"
Filesystem Storage Examples#
Upload a Video File for Transcription#
curl -X POST "http://localhost:$port/api/v1/transcriptions" \
-H "Content-Type: multipart/form-data" \
-F "file=@/path/to/your/video.mp4" \
-F "include_timestamps=true" \
-F "device=cpu" \
-F "model_name=small.en"
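The upload can also be scripted in Python. The sketch below builds the multipart/form-data body by hand with the standard library and sends the same form fields as the curl command. The function names are hypothetical, and because the response format is not described here, the function simply returns the raw response text.

```python
import os
import urllib.request
import uuid


def build_multipart(fields, filename, file_bytes, boundary=None):
    """Encode form fields and one file part as multipart/form-data."""
    boundary = boundary or uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        part = (
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
            f"{value}\r\n"
        )
        parts.append(part.encode())
    header = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    )
    parts.append(header.encode() + file_bytes + b"\r\n")
    parts.append(f"--{boundary}--\r\n".encode())
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def transcribe(base_url, video_path, model_name="small.en", device="cpu"):
    """POST a video to the transcription endpoint; return the raw response text."""
    with open(video_path, "rb") as handle:
        data = handle.read()
    fields = {"include_timestamps": "true", "device": device, "model_name": model_name}
    body, content_type = build_multipart(fields, os.path.basename(video_path), data)
    request = urllib.request.Request(
        f"{base_url}/api/v1/transcriptions",
        data=body,
        headers={"Content-Type": content_type},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")
```

For example, transcribe(f"http://localhost:{port}", "/path/to/your/video.mp4") sends the same request as the curl command above. The whole file is read into memory, which is acceptable for a sketch given the documented MAX_FILE_SIZE limit.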
Get Transcripts from Local Filesystem#
Once the transcription process is completed, the transcript files will be available in the directory set by AUDIO_ANALYZER_DIR variable. We can check the transcripts as follows:
ls $AUDIO_ANALYZER_DIR/transcript
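To pick up finished transcripts from a script, the directory can be scanned for the SRT and TXT outputs mentioned earlier. This is a small sketch; the helper name list_transcripts is hypothetical, and the transcript subdirectory name follows the ls command above.

```python
import os
from pathlib import Path


def list_transcripts(directory, suffixes=(".srt", ".txt")):
    """Return transcript files in directory, sorted by name."""
    root = Path(directory).expanduser()
    if not root.is_dir():
        return []
    return sorted(
        p for p in root.iterdir() if p.is_file() and p.suffix.lower() in suffixes
    )


# Falls back to the directory used in this guide if the variable is not set.
base = os.environ.get("AUDIO_ANALYZER_DIR", "~/audio_analyzer_data")
for path in list_transcripts(Path(base) / "transcript"):
    print(path.name)
```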
Transcription Performance and Optimization on CPU#
The service uses pywhispercpp with the following optimizations for CPU transcription:
Multithreading: Automatically uses the optimal number of threads based on your CPU cores
Parallel Processing: Utilizes multiple CPU cores for audio processing
Greedy Decoding: Faster inference by using greedy decoding instead of beam search
OpenVINO IR Models: Can download and use OpenVINO IR models for even faster CPU inference
Manual Host Setup using Poetry#
NOTE: This is an advanced setup and is recommended for development and contribution only. For an alternative way to set up on the host, see setting up on host using setup script. When setting up on the host, the default storage backend is the local filesystem. Make sure STORAGE_BACKEND is not overridden to minio unless you explicitly want to use the MinIO backend.
Clone the repository and change directory to the audio-analyzer microservice:
# Clone the latest on mainline
git clone https://github.com/open-edge-platform/edge-ai-libraries.git edge-ai-libraries
# Alternatively, clone a specific release branch
git clone https://github.com/open-edge-platform/edge-ai-libraries.git edge-ai-libraries -b <release-tag>
# Access the code
cd edge-ai-libraries/microservices/audio-analyzer
Install Poetry if not already installed.
pip install poetry==1.8.3
Configure poetry to create a local virtual environment.
poetry config virtualenvs.create true
poetry config virtualenvs.in-project true
Install dependencies:
poetry lock --no-update
poetry install
Set the comma-separated list of Whisper models to enable:
export ENABLED_WHISPER_MODELS=small.en,tiny.en,medium.en
Set directories on host where models will be downloaded:
export GGML_MODEL_DIR=/tmp/audio_analyzer_model/ggml
export OPENVINO_MODEL_DIR=/tmp/audio_analyzer_model/openvino
Run the service:
DEBUG=True poetry run uvicorn audio_analyzer.main:app --host 0.0.0.0 --port 8000 --reload
(Optional): To run the service with the MinIO storage backend, make sure a MinIO server is running. See Running a Local MinIO Server. You might need to update the MINIO_ENDPOINT environment variable depending on where the MinIO server is running (if not set, the default value is localhost:9000).
export MINIO_ENDPOINT="<minio_host>:<minio_port>"
Run the Audio Analyzer application on host:
STORAGE_BACKEND=minio DEBUG=True poetry run uvicorn audio_analyzer.main:app --host 0.0.0.0 --port 8000 --reload
Running Tests#
To run the unit tests and generate a coverage report, run the following commands in the application’s directory (microservices/audio-analyzer) in the cloned repo:
poetry lock --no-update
poetry install --with dev
# Set the required environment variable for the model name (required due to a compliance issue)
export ENABLED_WHISPER_MODELS=tiny.en
# Run tests
poetry run coverage run -m pytest ./tests
# Generate Coverage report
poetry run coverage report -m
API Documentation#
When running the service, you can access the Swagger UI documentation at:
http://localhost:8000/docs
Advanced Setup Options#
Manually Running a Local MinIO Server#
If you’re not using the bundled Docker setup script setup_docker.sh and still want to use the application with MinIO storage, you can manually run a local MinIO server using:
docker run -d -p 9000:9000 -p 9001:9001 --name minio \
-e MINIO_ROOT_USER=${MINIO_ACCESS_KEY} \
-e MINIO_ROOT_PASSWORD=${MINIO_SECRET_KEY} \
-v minio_data:/data \
minio/minio server /data --console-address ':9001'
You can then access the MinIO Console at http://localhost:9001 with these credentials:
Username: <MINIO_ACCESS_KEY>
Password: <MINIO_SECRET_KEY>
When to use Filesystem vs. MinIO backend#
Use Filesystem backend when (Default for standalone setup on host):
Running in a simple, single-node deployment
No need for distributed/scalable storage
No integration with other services that might need to access transcripts
Running in resource-constrained environments
Use MinIO backend when (Default for setup using Docker script):
Running in a containerized/cloud environment
Need for scalable, distributed object storage
Integration with other services that need to access transcripts
Building a clustered/distributed system
Need for better data organization and retention policies
Next Steps#
Troubleshooting#
Docker Container Fails to Start:
Run docker logs {{container-name}} to identify the issue.
Check if the required port is available.
Cannot Access the Microservice:
Confirm the container is running:
docker ps
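One way to check whether a port is available is to try binding to it. The sketch below does this with a plain TCP socket; the function name port_is_free is hypothetical, and port 8000 is used because it is the port the service listens on in this guide.

```python
import socket


def port_is_free(port, host="0.0.0.0"):
    """Return True if a TCP socket can currently bind to host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        try:
            probe.bind((host, port))
        except OSError:
            # Typically "address already in use" or a permission error.
            return False
    return True


print("Port 8000 free:", port_is_free(8000))
```

The result only reflects the moment of the check; another process can still claim the port before the service starts.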
Supporting Resources#
Overview