Benchmarking and Stream Density

This guide covers the performance benchmarking tools included with the Store-wide Loss Prevention application. The benchmark framework measures end-to-end latency as the number of camera/scene pipelines increases, helping you determine the maximum stream density your hardware can sustain within a given latency budget.

Prerequisites

Before running benchmarks, ensure the following setup steps are completed:

make update-submodules      # clones performance-tools submodule
make download-sample-data   # downloads video to scenescape/sample_data/
make download-models        # downloads models to models/ (detect_models/, reid_models/, vlm_models/)

Quick Start

Single-Scene Benchmark

Run a one-shot benchmark with a single camera scene to measure baseline latency:

make benchmark

This target:

Builds (or pulls) the benchmark Docker image.
Starts the full stack (make up).
Runs a single-scene measurement against the configured latency target.
Tears down the stack on completion.

Stream Density (Multi-Scene)

Iteratively add camera scenes until end-to-end latency exceeds the target threshold:

make benchmark-stream-density

At each iteration the framework:

Updates stream_density in configs/zone_config.json.
Re-runs init.sh to regenerate .env and the DL Streamer pipeline config.
Generates docker/docker-compose.cameras.yaml with additional RTSP camera streams (lp-cams-N) for each new camera.
Restarts scene-import, lp-video (DL Streamer), and swlp-service.
Collects latency samples from the swlp-service Docker logs and the vlm_application_metrics files.

SceneScape core services (web, controller, broker), ovms-vlm, behavioral-analysis, seaweedfs, and alert-service remain running throughout.

The benchmark stops when latency exceeds the target or the maximum iteration count is reached.
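
The stopping rule above can be sketched as a simple loop. This is a minimal illustration, not the framework's actual code: `measure_latency_ms` is a hypothetical callback standing in for one full iteration (reconfigure, restart, warm up, collect), and starting from a single scene is an assumption.

```python
from typing import Callable


def find_max_scenes(
    measure_latency_ms: Callable[[int], float],  # hypothetical: runs one iteration
    target_latency_ms: float = 10000,
    scene_increment: int = 1,
    max_iterations: int = 50,
) -> int:
    """Return the highest scene count whose latency stayed within the target."""
    best = 0
    scenes = 1  # assumption: the first iteration uses one scene
    for _ in range(max_iterations):
        latency = measure_latency_ms(scenes)
        if latency > target_latency_ms:
            break  # the first failing iteration ends the run
        best = scenes
        scenes += scene_increment
    return best


if __name__ == "__main__":
    # Toy latency model (made up): 600 ms per scene.
    print(find_max_scenes(lambda n: 600.0 * n, target_latency_ms=2000))
```

A return value of 0 in this sketch means that even the first iteration missed the target.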

Configuration

All benchmark parameters can be set via make variables or environment variables:

Parameter	Make Variable	Default	Description
Target latency	`BENCHMARK_TARGET_LATENCY_MS` or `TARGET_LATENCY_MS`	`10000`	Latency threshold in milliseconds
Latency metric	`BENCHMARK_LATENCY_METRIC` or `LATENCY_METRIC`	`avg`	Which statistic to compare: `avg` or `max`
Scene increment	`BENCHMARK_SCENE_INCREMENT` or `SCENE_INCREMENT`	`1`	Number of scenes to add per iteration
Init duration	`BENCHMARK_INIT_DURATION`	`90`	Warm-up seconds after service restart
Stabilise duration	`BENCHMARK_STABILISE_DURATION`	`30`	Extra wait for the pipeline to stabilise before collecting metrics
Max iterations	`BENCHMARK_MAX_ITERATIONS`	`50`	Safety cap on the number of iterations
Min throughput ratio	`BENCHMARK_MIN_THROUGHPUT_RATIO`	`0.5`	Minimum ratio of actual to expected behavioral-analysis (BA) samples (0–1)
Results directory	`RESULTS_PATH`	`./results`	Where JSON and CSV results are written
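
To illustrate how the latency metric setting can change the outcome, the sketch below reduces a list of latency samples to the statistic that is compared against the target. The function name `latency_statistic` and the sample values are hypothetical; the framework's own aggregation may differ.

```python
from statistics import mean
from typing import Sequence


def latency_statistic(samples_ms: Sequence[float], metric: str = "avg") -> float:
    """Reduce latency samples to the configured statistic ('avg' or 'max')."""
    if not samples_ms:
        raise ValueError("no latency samples collected")
    if metric == "avg":
        return mean(samples_ms)
    if metric == "max":
        return max(samples_ms)
    raise ValueError(f"unsupported metric: {metric!r}")


if __name__ == "__main__":
    samples = [400, 450, 900]  # made-up samples in milliseconds
    target = 800
    for metric in ("avg", "max"):
        value = latency_statistic(samples, metric)
        print(metric, "PASS" if value <= target else "FAIL")
```

With these made-up samples, the average stays under an 800 ms target while the maximum exceeds it, so `max` is the stricter choice when occasional spikes matter.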

Example with custom parameters:

make benchmark-stream-density \
  TARGET_LATENCY_MS=5000 \
  LATENCY_METRIC=max \
  SCENE_INCREMENT=2 \
  BENCHMARK_INIT_DURATION=120 \
  BENCHMARK_STABILISE_DURATION=60

Set REGISTRY=false to force a local build when running make benchmark or make benchmark-stream-density.

Device Profile

Both make benchmark and make benchmark-stream-density use the DEVICE parameter to select an inference device profile (same as make up). The profile controls the DL Streamer decode chain, detection device, and re-identification device:

make benchmark DEVICE=all-gpu-cpu.env           # GPU detect + CPU re-id
make benchmark-stream-density DEVICE=all-gpu.env # All GPU
make benchmark-stream-density DEVICE=all-cpu.env # All CPU
make benchmark-stream-density DEVICE=all-npu-cpu.env  # NPU detect + CPU re-id (default)

Available profiles are in configs/res/. See SceneScape Setup — Device Profiles for details.

Results and Metrics

Results are written to RESULTS_PATH (default ./results) in both JSON and CSV formats:

results/
├── swlp_stream_density_<timestamp>.json
├── swlp_stream_density_<timestamp>.csv
└── consolidated_metrics.csv          # after make consolidate-metrics

Each result file contains per-iteration data:

Field	Description
`num_scenes`	Number of camera scenes in this iteration
`latency_ms`	Measured latency (avg or max, per config)
`passed`	Whether latency was within the target
`throughput_ratio`	Actual / expected BA samples
`actual_samples`	BA round-trip latency samples collected
`samples_per_scene`	Samples per scene during the collection window
`memory_percent`	Host memory utilisation
`cpu_percent`	Host CPU utilisation
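
As a rough sketch of how these fields might be consumed, the snippet below parses per-iteration rows and reports the highest passing scene count. It assumes the CSV uses the field names above as column headers and writes `passed` as `True`/`False`; neither detail is confirmed here, and the sample rows are made up.

```python
import csv
import io

# Made-up sample; the column layout and boolean format are assumptions.
SAMPLE_CSV = """num_scenes,latency_ms,passed,throughput_ratio
1,620,True,1.00
2,1205,True,0.95
3,1842,True,0.87
4,2450,False,0.72
"""


def load_iterations(text: str) -> list:
    """Parse per-iteration rows into typed dictionaries."""
    rows = []
    for row in csv.DictReader(io.StringIO(text)):
        rows.append({
            "num_scenes": int(row["num_scenes"]),
            "latency_ms": float(row["latency_ms"]),
            "passed": row["passed"].strip().lower() == "true",
            "throughput_ratio": float(row["throughput_ratio"]),
        })
    return rows


def max_passing_scenes(rows: list) -> int:
    """Highest scene count among passing iterations, or 0 if none passed."""
    return max((r["num_scenes"] for r in rows if r["passed"]), default=0)


if __name__ == "__main__":
    print(max_passing_scenes(load_iterations(SAMPLE_CSV)))
```

To read a real result file, pass its contents to `load_iterations` after checking the actual column names.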

Consolidate Metrics

After running one or more benchmarks, consolidate all result files into a single CSV:

make consolidate-metrics

Output: results/consolidated_metrics.csv
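
For readers who want to post-process results themselves, the sketch below shows one way several per-run CSV files could be merged into a single file. It is not the implementation behind make consolidate-metrics: the function name `consolidate`, the added `source_file` column, and the merge strategy are all assumptions for illustration.

```python
import csv
from pathlib import Path


def consolidate(results_dir, output_name="consolidated_metrics.csv"):
    """Merge swlp_stream_density_*.csv files into one CSV (illustrative only)."""
    results = Path(results_dir)
    rows, fieldnames = [], []
    for path in sorted(results.glob("swlp_stream_density_*.csv")):
        with path.open(newline="") as f:
            reader = csv.DictReader(f)
            # Keep the union of columns, in first-seen order.
            for name in reader.fieldnames or []:
                if name not in fieldnames:
                    fieldnames.append(name)
            for row in reader:
                row["source_file"] = path.name  # hypothetical provenance column
                rows.append(row)
    out_path = results / output_name
    with out_path.open("w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=["source_file"] + fieldnames)
        writer.writeheader()
        writer.writerows(rows)  # columns missing from a row are left empty
    return out_path
```

Recording the source file name per row keeps runs distinguishable after merging, which helps when comparing device profiles.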

Environment Variables

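The benchmark script also reads its configuration directly from environment variables, which is useful in CI. The defaults below are the script's own; note that TARGET_LATENCY_MS defaults to 30000 here, whereas the make default listed under Configuration is 10000.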

Variable	Default	Description
`TARGET_LATENCY_MS`	`30000`	Latency threshold
`LATENCY_METRIC`	`avg`	`avg` or `max`
`SCENE_INCREMENT`	`1`	Scenes per iteration
`INIT_DURATION`	`90`	Warm-up seconds
`STABILISE_DURATION`	`30`	Collection window
`RESULTS_DIR`	`./results`	Output directory
`MAX_ITERATIONS`	`50`	Maximum iterations
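
A minimal sketch of reading these variables with their documented defaults follows. The variable names and default values come from the table above; the `BenchmarkConfig` class and `load_config` function are hypothetical and do not reflect how the script is actually structured.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass
class BenchmarkConfig:
    target_latency_ms: int
    latency_metric: str
    scene_increment: int
    init_duration: int
    stabilise_duration: int
    results_dir: str
    max_iterations: int


def load_config(env: Optional[Mapping[str, str]] = None) -> BenchmarkConfig:
    """Build a config from environment variables, using the table's defaults."""
    env = os.environ if env is None else env
    return BenchmarkConfig(
        target_latency_ms=int(env.get("TARGET_LATENCY_MS", "30000")),
        latency_metric=env.get("LATENCY_METRIC", "avg"),
        scene_increment=int(env.get("SCENE_INCREMENT", "1")),
        init_duration=int(env.get("INIT_DURATION", "90")),
        stabilise_duration=int(env.get("STABILISE_DURATION", "30")),
        results_dir=env.get("RESULTS_DIR", "./results"),
        max_iterations=int(env.get("MAX_ITERATIONS", "50")),
    )


if __name__ == "__main__":
    print(load_config())
```

Passing a plain dictionary instead of relying on `os.environ` makes the function easy to exercise in tests or CI dry runs.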

Interpreting Results

A typical stream-density result summary looks like:

STREAM DENSITY RESULTS
======================================================================
  Target Latency:  8000ms
  Max Scenes:      3
  Met Target:      Yes
  Best Latency:    1842ms @ 3 scene(s)

Scenes    Latency     Throughput    Mem %     CPU %     Status
------------------------------------------------------------------
1         620         100%          45.2      23.1      ✓ PASS
2         1205        95%           52.8      41.6      ✓ PASS
3         1842        87%           61.3      58.2      ✓ PASS
4         2450        72%           68.1      72.4      ✗ FAIL
======================================================================

Max Scenes is the highest scene count that stayed within the latency target.
Throughput is the ratio of BA (behavioral analysis) round-trip samples collected to the number expected. A ratio below BENCHMARK_MIN_THROUGHPUT_RATIO (default 0.5) may indicate pipeline starvation.
The benchmark exits with code 0 if the target was met, 1 otherwise.
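
The throughput check and exit-code convention described above can be sketched as follows. The helper names are hypothetical, and how the framework derives the expected sample count is not covered here, so it is taken as an input.

```python
def throughput_ratio(actual_samples: int, expected_samples: int) -> float:
    """Ratio of collected BA samples to the expected count."""
    if expected_samples <= 0:
        return 0.0  # assumption: treat a missing expectation as zero throughput
    return actual_samples / expected_samples


def is_starved(ratio: float, min_ratio: float = 0.5) -> bool:
    """True when throughput falls below the configured minimum ratio."""
    return ratio < min_ratio


def exit_code(target_met: bool) -> int:
    """Mirror the documented convention: 0 if the target was met, 1 otherwise."""
    return 0 if target_met else 1


if __name__ == "__main__":
    ratio = throughput_ratio(actual_samples=87, expected_samples=100)
    print(ratio, is_starved(ratio), exit_code(True))
```

In CI, the process exit code alone is enough to gate a job, while the throughput ratio helps distinguish a genuinely fast pipeline from one that simply produced too few samples.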

Troubleshooting

Issue	Resolution
`NO DATA – no latency metrics collected`	The pipeline may not have produced detections during the collection window. Increase `BENCHMARK_INIT_DURATION` or `BENCHMARK_STABILISE_DURATION`.
Memory threshold exceeded	The host is running low on RAM. Reduce scene count or use a machine with more memory.
`init.sh not found`	Ensure the `scenescape` directory is at `../scenescape` relative to `suspicious-activity-detection`.
Benchmark image pull fails	Check network connectivity and registry credentials. Use `REGISTRY=false` to build locally.
Stale scenes prevent re-import	The framework auto-cleans cloned scenes. If issues persist, run `make clean-stream-density` and retry.