# System Requirements

This section lists the hardware, software, and network requirements for running the Store-wide Loss Prevention application.

## Host Operating System

- Ubuntu 22.04 LTS (recommended and validated).
- Other recent 64-bit Linux distributions may work, but are not fully validated.

## Hardware Requirements

- **CPU:**
  - 8 physical cores (16 threads) or more recommended.
  - x86_64 architecture with support for AVX2.
- **System Memory (RAM):**
  - Minimum: 16 GB.
  - Recommended: 32 GB or more for smoother multi-service operation and headroom for the VLM.
- **Storage:**
  - Minimum free disk space: 30 GB.
  - Recommended: 60 GB+ to accommodate Docker images, OpenVINO™ models, the VLM weights (Qwen2.5-VL is several GB), sample video, and frame storage for behavioral analysis.
- **Graphics / Accelerators:**
  - Required: Intel CPU.
  - Optional (recommended for full experience):
    - Intel integrated or discrete GPU supported by Intel® Graphics Compute Runtime; used for person detection, re-identification, pose estimation, and VLM inference.
    - Intel NPU supported by the `linux-npu-driver` stack; recommended for VLM inference (see [Release Notes](../release-notes.md) for a known issue on systems without NPU).
  - The host must expose GPU and NPU devices to Docker, for example:
    - `/dev/dri` (GPU)
    - `/dev/accel/accel0` (NPU)
- Cameras: at least one RTSP source or a sample video file replayed via the bundled `lp-video` container.

## Software Requirements

- **Docker and Container Runtime:**
  - Docker Engine 24.x or newer.
  - Docker Compose v2 (integrated as `docker compose`) or compatible compose plugin.
  - Ability to run containers with:
    - Device mappings for GPU/NPU (for the swlp-service, behavioral-analysis, and DL Streamer pipeline server).
    - Bind mounts for sample video and generated TLS certificates.
- **Python (for helper scripts and tools):**
  - Python 3.10 or newer recommended.
  - Used primarily for asset preparation scripts (`download_models`) and local tooling; application containers include their own Python runtimes.
- **Git and Make:**
  - `git` for cloning the repository.
  - `make` to run provided automation targets (for example, `make demo`, `make download-models`, `make clean`).

## AI Models and Workloads

The application bundles several AI workloads, each with its own model, inputs, and outputs:

- **Person Detection (SceneScape DL Streamer pipeline):**
  - **Model:** `person-detection-retail-0013` from Open Model Zoo, converted to OpenVINO IR.
  - **Input:** Camera frames (BGR) from the RTSP source or replayed video.
  - **Output:** Per-frame bounding boxes used by SceneScape's tracker.
  - **Target devices:** Intel CPU or GPU via OpenVINO (`DETECTION_DEVICE`).
- **Person Re-Identification (SceneScape DL Streamer pipeline):**
  - **Model:** `person-reidentification-retail-0277` from Open Model Zoo, converted to OpenVINO IR.
  - **Input:** Cropped person patches from the detector.
  - **Output:** Embedding vectors used by SceneScape's controller to assign a persistent `object_id` across cameras and time.
  - **Target devices:** Intel CPU or GPU via OpenVINO (`REID_DEVICE`).
- **Pose Estimation (Behavioral Analysis pre-filter):**
  - **Model:** YOLO pose model, converted to OpenVINO IR (`/models/yolo_models/`).
  - **Input:** Cropped person frames from a HIGH_VALUE-zone visit.
  - **Output:** 2D keypoints used to detect hand-near-body or pocket-region interactions; non-suspicious frames short-circuit and emit a `no_match` result without invoking the VLM.
  - **Target devices:** Intel CPU or GPU via OpenVINO (`POSE_DEVICE`).
- **VLM Concealment Confirmation (Behavioral Analysis):**
  - **Model:** `Qwen/Qwen2.5-VL-7B-Instruct` Vision Language Model (`/models/vlm_models/`).
  - **Input:** A small batch of cropped person frames flagged as candidates by the pose pre-filter, with a structured prompt.
  - **Output:** A natural-language justification, a `status` (`suspicious`, `no_match`, or `no_enough_data`), a `confidence`, and a `last_frame_ts`. Results are published on the `ba/results` MQTT topic.
  - **Target devices:** Intel CPU, GPU, or NPU via OpenVINO (`VLM_DEVICE`). NPU is recommended where available.

## Network and Proxy

- **Network Access:**
  - Local network connectivity to access the LP REST API (`http://:8082`), the Gradio dashboard (`http://:7860`), and the SceneScape UI (`https://`).
  - Optional outbound internet access to download Docker base images, OpenVINO models, and Qwen2.5-VL weights (if not pre-cached).
- **Proxy Support (optional):**
  - If your environment uses HTTP/HTTPS proxies, set `HTTP_PROXY`, `HTTPS_PROXY`, and `NO_PROXY` in the shell before running `make`.

## Permissions

- Ability to run Docker as a user in the `docker` group or with `sudo`.
- Sufficient permissions to access device nodes for GPU and NPU (typically via membership in groups such as `video` or `render`, or via explicit `devices` configuration in Docker Compose).

## Browser Requirements

- Modern web browser (Chrome, Edge, or Firefox) to access the Gradio dashboard and SceneScape UI.
- JavaScript enabled.

These requirements are intended for development and evaluation environments. For any production-like deployment, also consider factors such as security hardening, monitoring, backup, retention of evidence frames in object storage, and resource isolation.
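
The hardware minimums and device nodes listed in this section can be spot-checked before running `make demo`. The following is a minimal sketch, not part of the repository: the function names, the 10% RAM tolerance, and the choice to read `/proc/cpuinfo` and `/proc/meminfo` are assumptions made for illustration. The GPU and NPU nodes are optional, so a missing node is informational rather than a failure.

```python
import os
import shutil

# Thresholds taken from the minimums listed in this section.
MIN_RAM_GB = 16
MIN_DISK_GB = 30
# Assumption: allow 10% slack, because a nominal 16 GB host reports
# slightly less usable memory in /proc/meminfo.
RAM_TOLERANCE = 0.9
# Device nodes named in the Hardware Requirements list (both optional).
DEVICE_NODES = {"GPU": "/dev/dri", "NPU": "/dev/accel/accel0"}


def read_text(path):
    """Return file contents, or an empty string if the file is unreadable."""
    try:
        with open(path, encoding="utf-8") as handle:
            return handle.read()
    except OSError:
        return ""


def has_avx2(cpuinfo_text):
    """True if the first 'flags' line of /proc/cpuinfo-style text lists avx2."""
    for line in cpuinfo_text.splitlines():
        if line.startswith("flags") and ":" in line:
            return "avx2" in line.split(":", 1)[1].split()
    return False


def mem_total_gib(meminfo_text):
    """Parse MemTotal (reported in kB) from /proc/meminfo-style text into GiB."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemTotal:"):
            return int(line.split()[1]) / (1024 ** 2)
    return 0.0


def check_host(cpuinfo_text, meminfo_text, free_disk_gib, existing_paths):
    """Return a dict mapping each check name to True or False."""
    results = {
        "avx2": has_avx2(cpuinfo_text),
        "ram": mem_total_gib(meminfo_text) >= MIN_RAM_GB * RAM_TOLERANCE,
        "disk": free_disk_gib >= MIN_DISK_GB,
    }
    for name, node in DEVICE_NODES.items():
        results[name.lower() + "_node"] = node in existing_paths
    return results


if __name__ == "__main__":
    present = {node for node in DEVICE_NODES.values() if os.path.exists(node)}
    report = check_host(
        read_text("/proc/cpuinfo"),
        read_text("/proc/meminfo"),
        shutil.disk_usage(".").free / (1024 ** 3),
        present,
    )
    for name, ok in report.items():
        print(f"{name:10s} {'ok' if ok else 'missing or below minimum'}")
```

Run the script from the directory where Docker images and models will be stored so that the disk check measures the relevant filesystem. The parsing functions take text rather than file paths, which keeps them easy to test on a machine other than the target host.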