# How It Works This section provides a high-level view of how the application processes audio input and integrates with a modular backend architecture. ![High-Level System Diagram](./images/education-ai-suite-smart-class-backend-service-layer.drawio.svg) ## Inputs **Audio Files** You can upload audio recordings through the *Web-based UI layer*, which supports: - Audio upload - Viewing transcription, summaries, and performance metrics - Localisation options (English/Chinese) The uploaded audio is passed to the Backend API, which acts as the gateway to the backend service layer and provides similar capabilities. **Processing** - **Audio Pre-processing** Cleans and formats audio data for downstream tasks. - **ASR Component (Automatic Speech Recognition)** Converts audio into text using integrated ASR providers: - FunASR - OpenVINO - OpenAI - **Summariser Component** Generates concise summaries of transcribed text using LLM providers: - iPexLLM - OpenVINO - **Metrics Collector** Monitors and collects: - xPU utilisation for hardware performance - LLM metrics for summarisation efficiency ## Outputs - Transcriptions and summaries can be accessed from the Web-based UI and file system. The path for file system is **///**. For example, /storage/chapter-10/ - Performance metrics (e.g., utilisation, model efficiency) are displayed for monitoring. - Localisation ensures outputs are available in multiple languages (English/Chinese). ## Learn More - [System Requirements](system-requirements.md): Check the hardware and software requirements for deploying the application. - [Get Started](get-started.md): Follow step-by-step instructions to set up the application.