# Get Started

The **Multi-level Video Understanding Microservice** enables developers to create video summaries from video files.

This section provides step-by-step instructions to:

- Set up the dependent GenAI model serving microservices, including large language models (LLMs) and vision language models (VLMs).
- Set up the microservice using a pre-built Docker image for quick deployment.
- Run predefined tasks to explore its functionality.
- Learn how to modify basic configurations to suit specific requirements.

## Prerequisites

Before you begin, ensure the following:

- **System Requirements**: Verify that your system meets the [minimum requirements](./system-requirements.md).
- **Docker Installed**: Install Docker. For installation instructions, see [Get Docker](https://docs.docker.com/get-docker/).

This guide assumes basic familiarity with Docker commands and terminal usage. If you are new to Docker, see [Docker Documentation](https://docs.docker.com/) for an introduction.

## Setup GenAI Model Servings for VLM and LLM

This microservice is designed to work effortlessly with GenAI model servings that provide OpenAI-compatible APIs. We recommend **vLLM-IPEX** and use it as the example here; it is primarily used for inference on single or multiple Intel GPUs and is optimized for Intel® Arc™ Pro B60 Graphics.

First, prepare GenAIComps from the Open Platform for Enterprise AI (OPEA):

```bash
git clone https://github.com/opea-project/GenAIComps.git
cd GenAIComps
```

### Start model serving for VLM

**Key Configuration**

- `MAX_MODEL_LEN`: maximum model length, constrained by the available GPU memory.
- `LLM_MODEL_ID`: Hugging Face model ID.
- `LOAD_QUANTIZATION`: model precision.
- `VLLM_PORT`: VLM model serving port.
- `ONEAPI_DEVICE_SELECTOR`: device IDs; use `export ONEAPI_DEVICE_SELECTOR=level_zero:[gpu_id];level_zero:[gpu_id]` to select devices before executing your command.
- `TENSOR_PARALLEL_SIZE`: tensor parallel size.

**Deployment Steps**

1. Pull the official Docker image first.

   ```bash
   docker pull intel/llm-scaler-vllm:0.10.0-b4
   ```

2. Export the required environment variables.

   ```bash
   # Use image: intel/llm-scaler-vllm:0.10.0-b4
   export REGISTRY=intel
   export TAG=0.10.0-b4
   export VIDEO_GROUP_ID=$(getent group video | awk -F: '{printf "%s\n", $3}')
   export RENDER_GROUP_ID=$(getent group render | awk -F: '{printf "%s\n", $3}')
   HF_HOME=${HF_HOME:=~/.cache/huggingface}
   export HF_HOME
   export MAX_MODEL_LEN=20000
   export LLM_MODEL_ID=Qwen/Qwen2.5-VL-7B-Instruct
   export LOAD_QUANTIZATION=fp8
   export VLLM_PORT=41091
   export ONEAPI_DEVICE_SELECTOR="level_zero:0;level_zero:1"
   export TENSOR_PARALLEL_SIZE=2
   ```

3. Navigate to the Docker Compose directory and start the services:

   ```bash
   cd comps/lvms/deployment/docker_compose/
   docker compose up lvm-vllm-ipex-service -d
   ```

Then, check that the serving is up:

```bash
docker logs -f lvm-vllm-ipex-service
...
INFO: Started server process [411]
INFO: Waiting for application startup.
INFO: Application startup complete.
```

> Note: Please wait for a while, since it takes some time to load models, especially when deploying a new model for the first time. Resources will be downloaded from the Hugging Face endpoint.
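Optionally, you can also query the serving's OpenAI-compatible API directly. The following is a minimal sanity check, assuming the serving is reachable on the local host at the `VLLM_PORT` value (41091) exported above:

```bash
# List the models exposed by the OpenAI-compatible endpoint;
# the response should include the LLM_MODEL_ID exported above.
curl -X GET "http://localhost:41091/v1/models"
```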
If you would like to uninstall the model serving, run the following command in the same environment where you performed the installation:

```bash
docker compose down lvm-vllm-ipex-service
```

More details can be found in [LVM Microservice with vLLM on Intel XPU](https://opea-project.github.io/latest/GenAIComps/comps/lvms/src/README_vllm_ipex.html).

### Start model serving for LLM

**Key Configuration**

- `MAX_MODEL_LEN`: maximum model length, constrained by the available GPU memory.
- `LLM_MODEL_ID`: Hugging Face model ID.
- `LOAD_QUANTIZATION`: model precision.
- `VLLM_PORT`: LLM model serving port.
- `ONEAPI_DEVICE_SELECTOR`: device IDs; use `export ONEAPI_DEVICE_SELECTOR=level_zero:[gpu_id];level_zero:[gpu_id]` to select devices before executing your command.
- `TENSOR_PARALLEL_SIZE`: tensor parallel size.

**Deployment Steps**

1. Pull the official Docker image first.

   ```bash
   docker pull intel/llm-scaler-vllm:0.10.0-b4
   ```

2. Export the required environment variables.

   ```bash
   # Use image: intel/llm-scaler-vllm:0.10.0-b4
   export REGISTRY=intel
   export TAG=0.10.0-b4
   export VIDEO_GROUP_ID=$(getent group video | awk -F: '{printf "%s\n", $3}')
   export RENDER_GROUP_ID=$(getent group render | awk -F: '{printf "%s\n", $3}')
   HF_HOME=${HF_HOME:=~/.cache/huggingface}
   export HF_HOME
   export MAX_MODEL_LEN=20000
   export LLM_MODEL_ID=Qwen/Qwen3-32B-AWQ
   export LOAD_QUANTIZATION=awq
   export VLLM_PORT=41090
   export ONEAPI_DEVICE_SELECTOR="level_zero:2;level_zero:3"
   export TENSOR_PARALLEL_SIZE=2
   ```

3. Navigate to the Docker Compose directory and start the services:

   ```bash
   cd comps/llms/deployment/docker_compose/
   docker compose -f compose_text-generation.yaml up textgen-vllm-ipex-service -d
   ```

Then, check that the serving is up:

```bash
docker logs -f textgen-vllm-ipex-service
...
INFO: Started server process [411]
INFO: Waiting for application startup.
INFO: Application startup complete.
```

> Note: Please refer to [validated models](./Overview.md#validated-models) for the list of models that have been verified for video summarization.

If you would like to uninstall the model serving, run the following command in the same environment where you performed the installation:

```bash
docker compose -f compose_text-generation.yaml down textgen-vllm-ipex-service
```

More details can be found in [LLM Microservice with vLLM on Intel XPU](https://opea-project.github.io/latest/GenAIComps/comps/llms/src/text-generation/README_vllm_ipex.html).
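Optionally, before wiring the endpoints into the microservice, you can confirm that the LLM serving responds to OpenAI-compatible requests. This is a minimal sketch that assumes the `VLLM_PORT` (41090) and `LLM_MODEL_ID` values exported above:

```bash
# Send a minimal chat completion request to the LLM serving;
# a short JSON response indicates the model is loaded and serving.
curl -X POST "http://localhost:41090/v1/chat/completions" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "Qwen/Qwen3-32B-AWQ",
    "messages": [{"role": "user", "content": "Reply with the word ready."}],
    "max_tokens": 16
  }'
```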
## Quick Start with Docker

**step1.** Prepare the Docker image

Before launching the service as documented below, users need to prepare the Docker images:

- **Option1.** [Build the docker images](./how-to-build-from-source.md#steps-to-build)
- **Option2.** Download the prebuilt images from Docker Hub ([intel/multilevel-video-understanding](https://hub.docker.com/r/intel/multilevel-video-understanding))

  ```bash
  docker pull intel/multilevel-video-understanding:latest
  ```

Then, use the following commands to set up the `multilevel-video-understanding` microservice.

**step2.** Set up environment variables

The following environment variables can be configured:

**Basic configuration**

- `REGISTRY_URL`: Docker image registry URL
- `TAG`: Docker image tag (default: latest)
- `SERVICE_PORT`: Multi-level Video Understanding Microservice port (default: 8192)
- `MAX_CONCURRENT_REQUESTS`: Max concurrent requests for this microservice (default: 6)
- `DEBUG`: Enable debug mode (default: False)

**Model configuration**

- `VLM_MODEL_NAME`: Vision-language model (VLM); this must match the model serving's `model` field.
- `VLM_BASE_URL`: Model serving's base URL for the VLM (e.g., `http://localhost:41091/v1`).
- `LLM_MODEL_NAME`: Large language model (LLM); this must match the model serving's `model` field.
- `LLM_BASE_URL`: Model serving's base URL for the LLM (e.g., `http://localhost:41090/v1`).

**Example of minimum required environment variables**

```bash
export REGISTRY_URL=intel/
export TAG=latest
export VLM_BASE_URL="http://<host-ip>:41091/v1"
export LLM_BASE_URL="http://<host-ip>:41090/v1"
export VLM_MODEL_NAME=Qwen/Qwen2.5-VL-7B-Instruct
export LLM_MODEL_NAME=Qwen/Qwen3-32B-AWQ
export SERVICE_PORT=8192
```

> **Note:**
>
> - Please remember to change `REGISTRY_URL` and `TAG` as needed.
> - If `REGISTRY_URL` is provided, the final image name will be: `${REGISTRY_URL}/multilevel-video-understanding:${TAG}`.
> - If `REGISTRY_URL` is not provided, the image name will be: `multilevel-video-understanding:${TAG}`.
> - Make sure `VLM_MODEL_NAME` is consistent with the model used in sec. [Start model serving for VLM](#start-model-serving-for-vlm).
> - Make sure `LLM_MODEL_NAME` is consistent with the model used in sec. [Start model serving for LLM](#start-model-serving-for-llm).

**step3.** Launch the microservice

```bash
git clone https://github.com/open-edge-platform/edge-ai-libraries.git edge-ai-libraries
cd edge-ai-libraries/microservices/multilevel-video-understanding
chmod +x ./setup_docker.sh
./setup_docker.sh
```

Once the service is up, you can check the log:

```bash
$ docker ps
CONTAINER ID   IMAGE                                         PORTS                                         NAMES
6f00712bf4b6   intel/multilevel-video-understanding:latest   0.0.0.0:8192->8000/tcp, [::]:8192->8000/tcp   docker-multilevel-video-understanding-1
# the container name may differ depending on your runtime

$ docker logs -f docker-multilevel-video-understanding-1
INFO: Started server process [1]
INFO: Waiting for application startup.
INFO: Application startup complete.
INFO: Uvicorn running on http://0.0.0.0:8000 (Press CTRL+C to quit)
```

> **Note**: Please ensure that the dependent VLM and LLM model servings have been successfully set up, and that the `VLM_MODEL_NAME`, `LLM_MODEL_NAME`, `VLM_BASE_URL`, and `LLM_BASE_URL` variables are correctly set. Users can refer to [Setup GenAI Model Servings for VLM and LLM](#setup-genai-model-servings-for-vlm-and-llm).

## Microservice Usage Examples

Below are examples of how to use the API with curl.

### Health Check

Health check endpoint.

**Returns:** A response indicating the service status, version, and a descriptive message.

```bash
curl -X GET "http://localhost:8192/v1/health"
```

### Get Available Models

Get a list of available model variants that are configured for summarization.

**Returns:** A response with the list of available models, their details, and the default model.

```bash
curl -X GET "http://localhost:8192/v1/models"
```

### Request video summarization

Generate a summary text from a video file to describe its content.

**Returns:** A response with the processing status and summary output.

```bash
curl http://localhost:8192/v1/summary \
  -H "Content-Type: application/json" \
  -d '{
    "video": "https://videos.pexels.com/video-files/5992517/5992517-hd_1920_1080_30fps.mp4",
    "method": "USE_ALL_T-1",
    "processor_kwargs": {"levels": 4, "level_sizes": [1,6,8,-1], "process_fps": 1}
  }'
```

Response example:

```json
{
  "status": "completed",
  "summary": "The video presents xxx",
  "job_id": "37a09a31",
  "video_name": "https://videos.pexels.com/video-files/5992517/5992517-hd_1920_1080_30fps.mp4",
  "video_duration": 55.6
}
```

This API endpoint returns a video summary, job ID, and other details once the summarization is done.
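If you only need the summary text itself, one option is to pipe the JSON response through `jq`. This is an optional convenience sketch that assumes `jq` is installed on the host; it reuses the request payload from the example above:

```bash
# Request a summary and print only the "summary" field of the response.
curl -s http://localhost:8192/v1/summary \
  -H "Content-Type: application/json" \
  -d '{
    "video": "https://videos.pexels.com/video-files/5992517/5992517-hd_1920_1080_30fps.mp4",
    "method": "USE_ALL_T-1",
    "processor_kwargs": {"levels": 4, "level_sizes": [1,6,8,-1], "process_fps": 1}
  }' | jq -r '.summary'
```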
## API Documentation

When running the service, you can access the Swagger UI documentation at:

```bash
http://localhost:8192/docs
```

## Manual Host Setup using Poetry

1. Clone the repository and change directory to the `multilevel-video-understanding` microservice:

   ```bash
   git clone https://github.com/open-edge-platform/edge-ai-libraries.git edge-ai-libraries
   cd edge-ai-libraries/microservices/multilevel-video-understanding
   ```

2. Install Poetry if it is not already installed:

   ```bash
   python3 -m venv .venv
   source .venv/bin/activate
   pip install poetry==1.8.3
   ```

3. Install dependencies:

   ```bash
   poetry lock --no-update
   poetry install
   ```

   > Note: Sometimes `poetry install` may take a long time; in that case, an alternative way to install the packages is:
   >
   > ```bash
   > poetry export -f requirements.txt > requirements.txt
   > pip install -r requirements.txt
   > ```

4. Install video-chunking-utils from the OEP/EAL source:

   ```bash
   pip install ../../libraries/video-chunking-utils/
   ```

5. Set the environment variables as needed:

   ```bash
   export VLM_BASE_URL="http://<host-ip>:41091/v1"
   export LLM_BASE_URL="http://<host-ip>:41090/v1"
   export VLM_MODEL_NAME=Qwen/Qwen2.5-VL-7B-Instruct
   export LLM_MODEL_NAME=Qwen/Qwen3-32B-AWQ
   export SERVICE_PORT=8192
   ```

   > **Note:**
   > - Make sure `VLM_MODEL_NAME` is consistent with the model used in sec. [Start model serving for VLM](#start-model-serving-for-vlm).
   > - Make sure `LLM_MODEL_NAME` is consistent with the model used in sec. [Start model serving for LLM](#start-model-serving-for-llm).

6. Run the service:

   ```bash
   DEBUG=True poetry run uvicorn video_analyzer.main:app --host 0.0.0.0 --port ${SERVICE_PORT} --reload
   ```

## Supporting Resources

- [Overview](Overview.md)
- [API Reference](api-reference.md)
- [System Requirements](system-requirements.md)