# Release Notes: Audio Analyzer

This page tracks releases of the Audio Analyzer microservice. The most
recent release is listed first; older entries are preserved for history.

## v1.4.0

First release of the Audio Analyzer as a self-contained,
OpenAI API-compatible speech-to-text microservice with optional voice
sentiment analysis, built for edge deployment on Intel hardware.

**New**

- OpenAI-compatible transcription API (`POST /v1/audio/transcriptions`)
  and a streaming NDJSON variant (`/stream`).
- Multi-backend ASR: `openai` (PyTorch Whisper) and `openvino`
  (Intel-optimized); `whispercpp` planned for a follow-up release.
- Full Whisper model family supported (`tiny` → `large`).
- Optional voice sentiment analysis with session-level aggregation
  (`openvino` or `pytorch` provider).
- FFmpeg-based preprocessing: chunking, silence detection, optional
  RNNoise denoising.
- Session continuation via `session_id` (returned in `X-Session-ID`).
- Health (`/health`) and ALSA device listing (`/devices`) endpoints.
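
As a rough illustration of the transcription and session-continuation items above, the sketch below builds a multipart request for `POST /v1/audio/transcriptions` and decodes an NDJSON stream using only the Python standard library. The `file` and `model` form-field names follow the OpenAI convention and are assumptions here, as are the helper names `build_transcription_request` and `parse_ndjson`. The shape of the streamed events is not documented in these notes, so the parser just yields whatever JSON objects arrive.

```python
import json
import urllib.request
import uuid


def build_transcription_request(base_url, audio_bytes, filename,
                                model="tiny", session_id=None):
    """Build (but do not send) a multipart POST to /v1/audio/transcriptions.

    Field names `file` and `model` are assumed from the OpenAI API shape.
    """
    boundary = uuid.uuid4().hex
    fields = {"model": model}
    if session_id is not None:
        fields["session_id"] = session_id  # continue an existing session

    parts = []
    for name, value in fields.items():
        parts.append(
            (f"--{boundary}\r\n"
             f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
             f"{value}\r\n").encode("utf-8")
        )
    # The audio payload goes last, as a file part.
    parts.append(
        (f"--{boundary}\r\n"
         f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
         "Content-Type: application/octet-stream\r\n\r\n").encode("utf-8")
        + audio_bytes + b"\r\n"
    )
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))

    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/audio/transcriptions",
        data=b"".join(parts),
        method="POST",
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
    )


def parse_ndjson(lines):
    """Yield one decoded JSON object per non-empty line (bytes or str)."""
    for raw in lines:
        line = raw.decode("utf-8") if isinstance(raw, bytes) else raw
        line = line.strip()
        if line:
            yield json.loads(line)


# Sending the request against a running service (port 8010 per these notes):
#
#   req = build_transcription_request("http://localhost:8010", data, "clip.wav")
#   with urllib.request.urlopen(req) as resp:
#       session_id = resp.headers.get("X-Session-ID")
#       result = json.load(resp)
```

Passing the returned `X-Session-ID` value back as `session_id` on the next call is how a client would continue the same session. For the streaming variant, the HTTP response object can be handed to `parse_ndjson` directly, since it iterates line by line.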

**Improved**

- OpenVINO CPU/GPU acceleration on Intel hardware; models warm-loaded
  once per process.
- Layered config (`config.yaml`, env overrides via
  `AUDIO_ANALYZER__...`) and Docker Compose deployment on port `8010`.
- Container now runs as a non-root user (UID 1000).
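
To make the layered-config item more concrete, here is a minimal sketch of how `AUDIO_ANALYZER__`-prefixed environment variables could be merged over values loaded from `config.yaml`. It assumes that double underscores separate nesting levels and that keys are lowercased, which is a common convention but is not confirmed by these notes. The key names (`asr.backend`, `server.port`) and the function `apply_env_overrides` are hypothetical.

```python
import copy
import os

PREFIX = "AUDIO_ANALYZER__"  # prefix taken from the release notes


def apply_env_overrides(config, environ=None, prefix=PREFIX):
    """Return a copy of `config` with prefixed env vars merged in.

    Assumption: "__" separates nesting levels and keys are lowercased.
    Values stay as strings; type coercion is left to the caller.
    """
    environ = os.environ if environ is None else environ
    merged = copy.deepcopy(config)  # never mutate the base layer

    for name, value in environ.items():
        if not name.startswith(prefix):
            continue
        path = [part.lower() for part in name[len(prefix):].split("__") if part]
        if not path:
            continue

        # Walk (or create) intermediate dicts, then set the leaf value.
        node = merged
        for key in path[:-1]:
            child = node.get(key)
            if not isinstance(child, dict):
                child = {}
                node[key] = child
            node = child
        node[path[-1]] = value

    return merged


# Hypothetical keys, for illustration only.
base = {"server": {"port": 8010}, "asr": {"backend": "openai"}}
env = {"AUDIO_ANALYZER__ASR__BACKEND": "openvino"}
effective = apply_env_overrides(base, env)
```

In this sketch the environment layer always wins over the file layer, and unrelated variables are ignored.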

**Known issues**

- `whispercpp` backend is wired into configuration but not yet
  enabled at runtime.
- The `prompt` form field is accepted for API compatibility but
  currently ignored.
- Compatibility with the Video Search and Summarization sample
  application will be added in a subsequent release.

## v1.3.1

- Released as part of `release-2026.0.0`.
- Feature set was scoped to the requirements of the Video Search and
  Summarization sample application. Refer to that sample's release notes
  for details on this microservice at that version.