# Benchmarking and Stream Density This guide covers the performance benchmarking tools included with the Store-wide Loss Prevention application. The benchmark framework measures end-to-end latency as the number of camera/scene pipelines increases, helping you determine the maximum stream density your hardware can sustain within a given latency budget. ## Prerequisites Before running benchmarks, ensure the following setup steps are completed: ```bash make update-submodules # clones performance-tools submodule make download-sample-data # downloads video to scenescape/sample_data/ make download-models # downloads models to models/ (detect_models/, reid_models/, vlm_models/) ``` ## Quick Start ### Single-Scene Benchmark Run a one-shot benchmark with a single camera scene to measure baseline latency: ```bash make benchmark ``` This target: 1. Builds (or pulls) the benchmark Docker image. 2. Starts the full stack (`make up`). 3. Runs a single-scene measurement against the configured latency target. 4. Tears down the stack on completion. ### Stream Density (Multi-Scene) Iteratively add camera scenes until end-to-end latency exceeds the target threshold: ```bash make benchmark-stream-density ``` At each iteration the framework: 1. Updates `stream_density` in `configs/zone_config.json`. 2. Re-runs `init.sh` to regenerate `.env` and the DL Streamer pipeline config. 3. Generates `docker/docker-compose.cameras.yaml` with additional RTSP camera streams (`lp-cams-N`) for each new camera. 4. Restarts `scene-import`, `lp-video` (DL Streamer), and `swlp-service`. 5. Collects latency samples from `swlp-service` docker logs and `vlm_application_metrics` files. SceneScape core services (`web`, `controller`, `broker`), `ovms-vlm`, `behavioral-analysis`, `seaweedfs`, and `alert-service` remain running throughout. The benchmark stops when latency exceeds the target or the maximum iteration count is reached. ## Configuration All benchmark parameters can be set via `make` variables or environment variables: | Parameter | Make Variable | Default | Description | |-----------|---------------|---------|-------------| | Target latency | `BENCHMARK_TARGET_LATENCY_MS` or `TARGET_LATENCY_MS` | `10000` | Latency threshold in milliseconds | | Latency metric | `BENCHMARK_LATENCY_METRIC` or `LATENCY_METRIC` | `avg` | Which statistic to compare: `avg` or `max` | | Scene increment | `BENCHMARK_SCENE_INCREMENT` or `SCENE_INCREMENT` | `1` | Number of scenes to add per iteration | | Init duration | `BENCHMARK_INIT_DURATION` | `90` | Warm-up seconds after service restart | | Stabilise duration | `BENCHMARK_STABILISE_DURATION` | `30` | Extra wait for the pipeline to stabilise before collecting metrics | | Max iterations | `BENCHMARK_MAX_ITERATIONS` | `50` | Safety cap on the number of iterations | | Min throughput ratio | `BENCHMARK_MIN_THROUGHPUT_RATIO` | `0.5` | Minimum ratio of actual-to-expected BA samples (0–1) | | Results directory | `RESULTS_PATH` | `./results` | Where JSON and CSV results are written | Example with custom parameters: ```bash make benchmark-stream-density \ TARGET_LATENCY_MS=5000 \ LATENCY_METRIC=max \ SCENE_INCREMENT=2 \ BENCHMARK_INIT_DURATION=120 \ BENCHMARK_STABILISE_DURATION=60 ``` Set `REGISTRY=false` to force a local build when running `make benchmark` or `make benchmark-stream-density`. ### Device Profile Both `make benchmark` and `make benchmark-stream-density` use the `DEVICE` parameter to select an inference device profile (same as `make up`). The profile controls the DL Streamer decode chain, detection device, and re-identification device: ```bash make benchmark DEVICE=all-gpu-cpu.env # GPU detect + CPU re-id make benchmark-stream-density DEVICE=all-gpu.env # All GPU make benchmark-stream-density DEVICE=all-cpu.env # All CPU make benchmark-stream-density DEVICE=all-npu-cpu.env # NPU detect + CPU re-id (default) ``` Available profiles are in `configs/res/`. See [SceneScape Setup — Device Profiles](./scenescape-setup.md#device-profiles) for details. ## Results and Metrics Results are written to `RESULTS_PATH` (default `./results`) in both JSON and CSV formats: ``` results/ ├── swlp_stream_density_.json ├── swlp_stream_density_.csv └── consolidated_metrics.csv # after make consolidate-metrics ``` Each result file contains per-iteration data: | Field | Description | |-------|-------------| | `num_scenes` | Number of camera scenes in this iteration | | `latency_ms` | Measured latency (avg or max, per config) | | `passed` | Whether latency was within the target | | `throughput_ratio` | Actual / expected BA samples | | `actual_samples` | BA round-trip latency samples collected | | `samples_per_scene` | Samples per scene during the collection window | | `memory_percent` | Host memory utilisation | | `cpu_percent` | Host CPU utilisation | ### Consolidate Metrics After running one or more benchmarks, consolidate all result files into a single CSV: ```bash make consolidate-metrics ``` Output: `results/consolidated_metrics.csv` ### Environment Variables The script also reads configuration from environment variables (useful in CI): | Variable | Default | Description | |----------|---------|-------------| | `TARGET_LATENCY_MS` | `30000` | Latency threshold | | `LATENCY_METRIC` | `avg` | `avg` or `max` | | `SCENE_INCREMENT` | `1` | Scenes per iteration | | `INIT_DURATION` | `90` | Warm-up seconds | | `STABILISE_DURATION` | `30` | Collection window | | `RESULTS_DIR` | `./results` | Output directory | | `MAX_ITERATIONS` | `50` | Maximum iterations | ## Interpreting Results A typical stream-density result summary looks like: ``` STREAM DENSITY RESULTS ====================================================================== Target Latency: 8000ms Max Scenes: 3 Met Target: Yes Best Latency: 1842ms @ 3 scene(s) Scenes Latency Throughput Mem % CPU % Status ------------------------------------------------------------------ 1 620 100% 45.2 23.1 ✓ PASS 2 1205 95% 52.8 41.6 ✓ PASS 3 1842 87% 61.3 58.2 ✓ PASS 4 2450 72% 68.1 72.4 ✗ FAIL ====================================================================== ``` - **Max Scenes** is the highest scene count that stayed within the latency target. - **Throughput** shows how many BA (behavioral analysis) round-trip samples were collected versus the expected count. A ratio below `min_throughput_ratio` may indicate pipeline starvation. - The benchmark exits with code `0` if the target was met, `1` otherwise. ## Troubleshooting | Issue | Resolution | |-------|------------| | `NO DATA – no latency metrics collected` | The pipeline may not have produced detections during the collection window. Increase `BENCHMARK_INIT_DURATION` or `BENCHMARK_STABILISE_DURATION`. | | Memory threshold exceeded | The host is running low on RAM. Reduce scene count or use a machine with more memory. | | `init.sh not found` | Ensure the `scenescape` directory is at `../scenescape` relative to `suspicious-activity-detection`. | | Benchmark image pull fails | Check network connectivity and registry credentials. Use `REGISTRY=false` to build locally. | | Stale scenes prevent re-import | The framework auto-cleans cloned scenes. If issues persist, run `make clean-stream-density` and retry. |