Benchmarking and Stream Density#

This guide covers the performance benchmarking tools included with the Store-wide Loss Prevention application. The benchmark framework measures end-to-end latency as the number of camera/scene pipelines increases, helping you determine the maximum stream density your hardware can sustain within a given latency budget.

Prerequisites#

Before running benchmarks, ensure the following setup steps are completed:

make update-submodules      # clones performance-tools submodule
make download-sample-data   # downloads video to scenescape/sample_data/
make download-models        # downloads models to models/ (detect_models/, reid_models/, vlm_models/)

Quick Start#

Single-Scene Benchmark#

Run a one-shot benchmark with a single camera scene to measure baseline latency:

make benchmark

This target:

  1. Builds (or pulls) the benchmark Docker image.

  2. Starts the full stack (make up).

  3. Runs a single-scene measurement against the configured latency target.

  4. Tears down the stack on completion.

Stream Density (Multi-Scene)#

Iteratively add camera scenes until end-to-end latency exceeds the target threshold:

make benchmark-stream-density

At each iteration the framework:

  1. Updates stream_density in configs/zone_config.json.

  2. Re-runs init.sh to regenerate .env and the DL Streamer pipeline config.

  3. Generates docker/docker-compose.cameras.yaml with additional RTSP camera streams (lp-cams-N) for each new camera.

  4. Restarts scene-import, lp-video (DL Streamer), and swlp-service.

  5. Collects latency samples from swlp-service docker logs and vlm_application_metrics files.

SceneScape core services (web, controller, broker), ovms-vlm, behavioral-analysis, seaweedfs, and alert-service remain running throughout.

The benchmark stops when latency exceeds the target or the maximum iteration count is reached.

Configuration#

All benchmark parameters can be set via make variables or environment variables:

Parameter

Make Variable

Default

Description

Target latency

BENCHMARK_TARGET_LATENCY_MS or TARGET_LATENCY_MS

10000

Latency threshold in milliseconds

Latency metric

BENCHMARK_LATENCY_METRIC or LATENCY_METRIC

avg

Which statistic to compare: avg or max

Scene increment

BENCHMARK_SCENE_INCREMENT or SCENE_INCREMENT

1

Number of scenes to add per iteration

Init duration

BENCHMARK_INIT_DURATION

90

Warm-up seconds after service restart

Stabilise duration

BENCHMARK_STABILISE_DURATION

30

Extra wait for the pipeline to stabilise before collecting metrics

Max iterations

BENCHMARK_MAX_ITERATIONS

50

Safety cap on the number of iterations

Min throughput ratio

BENCHMARK_MIN_THROUGHPUT_RATIO

0.5

Minimum ratio of actual-to-expected BA samples (0–1)

Results directory

RESULTS_PATH

./results

Where JSON and CSV results are written

Example with custom parameters:

make benchmark-stream-density \
  TARGET_LATENCY_MS=5000 \
  LATENCY_METRIC=max \
  SCENE_INCREMENT=2 \
  BENCHMARK_INIT_DURATION=120 \
  BENCHMARK_STABILISE_DURATION=60

Set REGISTRY=false to force a local build when running make benchmark or make benchmark-stream-density.

Device Profile#

Both make benchmark and make benchmark-stream-density use the DEVICE parameter to select an inference device profile (same as make up). The profile controls the DL Streamer decode chain, detection device, and re-identification device:

make benchmark DEVICE=all-gpu-cpu.env           # GPU detect + CPU re-id
make benchmark-stream-density DEVICE=all-gpu.env # All GPU
make benchmark-stream-density DEVICE=all-cpu.env # All CPU
make benchmark-stream-density DEVICE=all-npu-cpu.env  # NPU detect + CPU re-id (default)

Available profiles are in configs/res/. See SceneScape Setup — Device Profiles for details.

Results and Metrics#

Results are written to RESULTS_PATH (default ./results) in both JSON and CSV formats:

results/
├── swlp_stream_density_<timestamp>.json
├── swlp_stream_density_<timestamp>.csv
└── consolidated_metrics.csv          # after make consolidate-metrics

Each result file contains per-iteration data:

Field

Description

num_scenes

Number of camera scenes in this iteration

latency_ms

Measured latency (avg or max, per config)

passed

Whether latency was within the target

throughput_ratio

Actual / expected BA samples

actual_samples

BA round-trip latency samples collected

samples_per_scene

Samples per scene during the collection window

memory_percent

Host memory utilisation

cpu_percent

Host CPU utilisation

Consolidate Metrics#

After running one or more benchmarks, consolidate all result files into a single CSV:

make consolidate-metrics

Output: results/consolidated_metrics.csv

Environment Variables#

The script also reads configuration from environment variables (useful in CI):

Variable

Default

Description

TARGET_LATENCY_MS

30000

Latency threshold

LATENCY_METRIC

avg

avg or max

SCENE_INCREMENT

1

Scenes per iteration

INIT_DURATION

90

Warm-up seconds

STABILISE_DURATION

30

Collection window

RESULTS_DIR

./results

Output directory

MAX_ITERATIONS

50

Maximum iterations

Interpreting Results#

A typical stream-density result summary looks like:

STREAM DENSITY RESULTS
======================================================================
  Target Latency:  8000ms
  Max Scenes:      3
  Met Target:      Yes
  Best Latency:    1842ms @ 3 scene(s)

Scenes    Latency     Throughput    Mem %     CPU %     Status
------------------------------------------------------------------
1         620         100%          45.2      23.1      ✓ PASS
2         1205        95%           52.8      41.6      ✓ PASS
3         1842        87%           61.3      58.2      ✓ PASS
4         2450        72%           68.1      72.4      ✗ FAIL
======================================================================
  • Max Scenes is the highest scene count that stayed within the latency target.

  • Throughput shows how many BA (behavioral analysis) round-trip samples were collected versus the expected count. A ratio below min_throughput_ratio may indicate pipeline starvation.

  • The benchmark exits with code 0 if the target was met, 1 otherwise.

Troubleshooting#

Issue

Resolution

NO DATA no latency metrics collected

The pipeline may not have produced detections during the collection window. Increase BENCHMARK_INIT_DURATION or BENCHMARK_STABILISE_DURATION.

Memory threshold exceeded

The host is running low on RAM. Reduce scene count or use a machine with more memory.

init.sh not found

Ensure the scenescape directory is at ../scenescape relative to suspicious-activity-detection.

Benchmark image pull fails

Check network connectivity and registry credentials. Use REGISTRY=false to build locally.

Stale scenes prevent re-import

The framework auto-cleans cloned scenes. If issues persist, run make clean-stream-density and retry.