Release Notes: Audio Analyzer#
This page tracks releases of the Audio Analyzer microservice. The most recent release is listed first; older entries are preserved for history.
Version 1.4.0#
First release of the Audio Analyzer as a self-contained, OpenAI API-compatible speech-to-text microservice with optional voice sentiment analysis, built for edge deployment on Intel hardware.
June 17, 2026
New
OpenAI-compatible transcription API (
POST /v1/audio/transcriptions) and a streaming NDJSON variant (/stream).Multi-backend ASR:
openai(PyTorch Whisper),openvino(Intel-optimized), andwhispercpp(CPU-only).Full Whisper model family supported (
tiny→large).Optional voice sentiment analysis with session-level aggregation (
openvinoorpytorchprovider).FFmpeg-based preprocessing: chunking, silence detection, optional RNNoise denoising.
Session continuation via
session_id(returned inX-Session-ID).Health (
/health) and ALSA device listing (/devices) endpoints.New User Guide doc set, including: overview, get-started, how-it-works configuration, api-reference, and troubleshooting Markdown files, plus an architecture diagram and a restructured README.md.
Improved
OpenVINO CPU/GPU acceleration on Intel hardware; models warm-loaded once per process.
Layered config (
config.yaml, env overrides viaAUDIO_ANALYZER__...) and Docker Compose deployment on port8010.Container now runs as a non-root user (UID 1000).
Known issues
The
promptform field is accepted for API compatibility but currently ignored.Compatibility with the Video Search and Summarization sample application will be added in a subsequent release.
v1.3.1#
Released as part of
release-2026.0.0.Supported features based on the requirements of the Video Search and Summarization sample application. Refer to that sample’s release notes for details on this microservice at that version.