Edge Workloads and Benchmarks Guide#

Overview#

The Edge Workloads and Benchmarks suite validates Intel edge platform performance across four workload categories: vision AI inference, hardware-accelerated media processing, end-to-end video analytics pipelines, and generative AI. It measures throughput, latency, power consumption, and power efficiency across CPU, GPU, and NPU devices.

Use this guide after provisioning an edge node with the Infrastructure Blueprint to quantify platform performance and validate hardware acceleration readiness.

Benchmark Categories#

Category	What It Measures	Devices	Backend
Vision Benchmarks	AI model inference (detection, classification)	CPU, GPU, NPU	OpenVINO benchmark_app
Media Benchmarks	Hardware video decode throughput and stream density	GPU (VA-API)	GStreamer + VA-API
Edge AI Pipelines	End-to-end video analytics (decode + detect + track + classify)	GPU, NPU, GPU+NPU	DL Streamer
GenAI Benchmarks	LLM/VLM token generation (1st token latency, throughput)	CPU, GPU, NPU	OpenVINO GenAI

Prerequisites#

Edge Node Infrastructure Blueprint deployed.
During target system installation, set host_type=container in the config-file.
Network connectivity for model and media downloads.

Verify Hardware Readiness#

Confirm GPU and NPU are visible before proceeding:

# GPU — should list Intel render nodes
ls /dev/dri/render*

# NPU — present only on supported platforms
ls /dev/accel/accel*

# VA-API codec support
vainfo 2>/dev/null | grep -i "profile"

Setup#

Clone the Repository#

git clone https://github.com/open-edge-platform/edge-workloads-and-benchmarks.git
cd edge-workloads-and-benchmarks

Install Prerequisites and Download Collateral#

GPU and NPU drivers are already installed by the Infrastructure Blueprint provisioning step, so disable their installation by passing INCLUDE_GPU=False INCLUDE_NPU=False to make prereqs. Then download the benchmark collateral and validate the environment:

make prereqs INCLUDE_GPU=False INCLUDE_NPU=False
make collateral INCLUDE_GENAI=True
make check

The make collateral step downloads AI models and media files. Key variables:

Variable	Default	Description
`INCLUDE_GPU`	`True`	Install GPU compute drivers
`INCLUDE_NPU`	`True`	Install NPU drivers
`INCLUDE_VISION`	`True`	Download vision models (YOLOv11, ResNet-50, MobileNet-v2)
`INCLUDE_MEDIA`	`True`	Download and encode media files (H.264, H.265, 1080p, 4K)
`INCLUDE_GENAI`	`False`	Download GenAI models (requires Hugging Face token)

Note: GenAI models require significant storage for original Hugging Face weights plus INT8/INT4 quantized artifacts. After quantization, reclaim space by removing ~/.cache/huggingface/hub/ and temporary venvs in tools/genai-downloader/.

Hugging Face Token (GenAI only)#

Some GenAI models require authentication:

export HF_TOKEN=<your-hugging-face-token>

Running Benchmarks#

Vision Benchmarks#

Measures inference throughput (FPS), latency, and power efficiency for detection models (YOLOv11n/m, YOLOv5m) and classification models (ResNet-50, MobileNet-v2), all at INT8 precision.

cd workloads/vision-benchmarks && make benchmarks
cd ../..

Execution modes: tput (maximum throughput) and latency (single-inference). Batch sizes: 1, 8, 16. Supports GPU+NPU concurrent mode for aggregate platform throughput.

Media Benchmarks#

Measures hardware-accelerated video decode performance using VA-API across H.265 and H.264 codecs at 1080p and 4K, scaling from 1 to 8 parallel streams.

cd workloads/media-benchmarks && make benchmarks
cd ../..

Key metrics: decode throughput (FPS), maximum stream density at 30 FPS target, power consumption.

Edge AI Pipelines#

Measures end-to-end video analytics pipeline performance using DL Streamer. Each pipeline chains media decode, preprocessing, object detection, tracking, and classification over 1080p HEVC input.

cd workloads/edge-ai-pipelines && make benchmarks
cd ../..

Three intensity levels (Light, Medium, Heavy) with increasing model complexity. Device placement modes: GPU-only, NPU-only, GPU+NPU split, and GPU+NPU concurrent.

GenAI Benchmarks#

Measures generative AI inference for LLMs (Llama 3.2 3B, DeepSeek-R1-1.5B, Mistral 7B) and VLMs (Phi-4 Multimodal, Gemma 3 4B, MiniCPM-V 2.6) at INT8_ASYM and INT4_SYM_CW precisions.

cd workloads/genai-benchmarks && make benchmarks
cd ../..

Key metrics: 1st token latency (ms), 2nd token throughput (tokens/s), power consumption (W), and power efficiency (tokens/s/W).

Benchmark Execution Options#

Common parameters available across all workload categories:

Parameter	Description
`DRY_RUN=True`	List all test configurations without executing
`RESUME=True`	Skip tests that already have results
`DURATION=<seconds>`	Set test duration (default: 60-120s)
`POWER=True`	Enable power measurement (requires sudo)
`CORES=pcore`	Pin execution to performance cores
`CORES=ecore`	Pin execution to efficiency cores
`CORES=0-11`	Pin to specific core range
`CLEAR=True`	Remove previous results before running

Example — dry run to preview vision test matrix:

cd workloads/vision-benchmarks && make benchmarks DRY_RUN=True

Example — run media benchmarks with power measurement, resuming from prior results:

cd workloads/media-benchmarks && make benchmarks POWER=True RESUME=True

Generating Reports#

After running benchmarks, generate an interactive HTML dashboard:

make report
make serve

The report is accessible at http://localhost:8000 and includes per-model throughput and latency charts, device comparisons, power efficiency rankings, and stream density results.

Check which configurations have completed:

make status

Results are stored under each workload directory at workloads/<category>/results/ in JSON format, organized by model, device, mode, and batch size.

Cleanup#

make clean-results    # Remove benchmark results only
make clean-all        # Remove all generated content (models, media, results)

Troubleshooting#

Problem	Solution
`make check` reports missing GPU	Verify drivers: `sudo apt install intel-opencl-icd intel-media-va-driver-non-free`
NPU not detected	Check kernel module: `lsmod \| grep intel_vpu` and device nodes: `ls /dev/accel/`
GenAI download fails	Verify `HF_TOKEN` is set and has access to gated models
Low GPU throughput	Ensure no other workloads are using the GPU; check `intel_gpu_top`
Power measurement fails	`POWER=True` requires sudo access for RAPL/hwmon readings
Docker permission denied	Add user to docker group: `sudo usermod -aG docker $USER` and re-login
Insufficient storage	Run without GenAI (`INCLUDE_GENAI=False`) or remove HF cache after conversion

References#

Edge Workloads and Benchmarks Repository
OpenVINO Toolkit
DL Streamer Documentation
Container Device Interface Guide — CDI setup for GPU/NPU access
DL Streamer Pipelines Guide — Building custom pipelines
Platform Capabilities — Hardware and software stack details