# System Design Document: Document Summarization Application

## Overview

The Document Summarization Sample Application provides an end-to-end pipeline for summarizing documents using advanced AI models. It exposes a REST API and a user-friendly web UI, leveraging containerized microservices for scalability and maintainability.

## Architecture Diagram
The following figure shows the microservices required to implement the Document Summarization Sample Application.

![Technical Architecture Diagram of Document Summarization](./images/DocSum-Arch.png)

## Components

### 1. NGINX Web Server-based Reverse Proxy
- **Role:** Routes external requests to the appropriate backend service (API or UI).
- **Configuration:** `nginx.conf`
- **Port:** `8101` (external)

### 2. Gradio UI-based UI (`docsum-ui`)
- **Role:** Provides a web interface for users to upload documents and view summaries.
- **Implementation:** `ui/gradio_app.py`
- **Port:** `9998` (external, mapped to `7860` in container)
- **Depends on:** `docsum-api`

### 3. DocSum Backend Platform (`docsum-api`)
- **Role:** Handles REST API requests for document summarization and file uploads; and orchestrates the summarization pipeline.
- **Implementation:** `app/server.py`
- **Port:** `8090` (external, mapped to 8000 in container)
- **Depends on:** OVMS Model Server (for LLM inference)

### 4. LLM on OpenVINO™ Model Server (`ovms-service`)
- **Role:** Serves AI models (e.g., LLMs) for inference.
- **Configuration:** Loads models from a mounted volume.
- **Ports:** `9300` (gRPC), `8300` (REST)

## Data Flow

1. **User uploads a document** through the UI.
2. **UI sends the document** to the DocSum backend platform (`docsum-api`) through REST API.
3. **Backend processes the document** (e.g., chunking and pre-processing).
4. **Backend sends inference requests** to the OpenVINO™ model server for summarization.
5. **Summary is returned** to the backend platform, which then sends it to the UI for display.

The following figure shows the data flow:

![API call sequence](./images/DocSum-flow.png)

## Deployment

- All services are containerized and orchestrated through `docker-compose.yaml`.
- Services communicate over a shared Docker bridge network (`my_network`).
- Environment variables are used for configuration and proxy settings.


## Key Files

- `docker-compose.yaml`: Service orchestration.
- `app/server.py`: FastAPI backend.
- `ui/gradio_app.py`: Gradio UI.
- `nginx.conf`: NGINX web server configuration.

## Extensibility

- **Model Flexibility:** OpenVINO™ model server can serve different models by updating the model volume and configuration.
- **UI Customization:** Gradio UI provides rich set of capabilities to customize the UI as per user preferences.
- **API Expansion:** You can extend the FastAPI backend for more endpoints or pre and post-processing logic.

## Security and Observability
- **Security:** You can configure the NGINX web server for SSL and TLS authentication.
- **Logging:** Each service logs to stdout and stderr for Docker log aggregation.
- **Healthchecks:** OpenVINO™ model server and API services have healthchecks defined in Docker Compose tool.

## References

- README.md
- config.py (for environment/config management)