gvamotiondetect
Performs lightweight motion detection on NV12 video frames and emits motion regions of interest (ROIs) as analytics metadata. Automatically uses a VA-API (GPU) accelerated path when VAMemory caps are negotiated; otherwise falls back to a system-memory (CPU) path. Designed for low-latency scene motion highlighting and downstream triggering without requiring a full inference model.
Pad Templates:
SINK template: 'sink'
Availability: Always
Capabilities:
video/x-raw(memory:VAMemory)
format: (string)NV12
width: [ 1, 2147483647 ]
height: [ 1, 2147483647 ]
framerate: [ 0/1, 2147483647/1 ]
video/x-raw
format: (string)NV12
width: [ 1, 2147483647 ]
height: [ 1, 2147483647 ]
framerate: [ 0/1, 2147483647/1 ]
SRC template: 'src'
Availability: Always
Capabilities:
video/x-raw(memory:VAMemory)
format: (string)NV12
width: [ 1, 2147483647 ]
height: [ 1, 2147483647 ]
framerate: [ 0/1, 2147483647/1 ]
video/x-raw
format: (string)NV12
width: [ 1, 2147483647 ]
height: [ 1, 2147483647 ]
framerate: [ 0/1, 2147483647/1 ]
Element has no clocking capabilities.
Element has no URI handling capabilities.
Pads:
SINK: 'sink'
Pad Template: 'sink'
SRC: 'src'
Pad Template: 'src'
Element Properties:
block-size : Full-resolution block size (pixels) used to build the grid for per-block motion ratio.
flags: readable, writable
Integer. Default: 64 (range 16..512)
motion-threshold : Per-block changed pixel ratio (0..1) required to flag motion in that block before temporal confirmation.
flags: readable, writable
Double. Default: 0.05
min-persistence : Frames a tracked ROI must persist (age) before being eligible for emission.
flags: readable, writable
Integer. Default: 2 (range 1..30)
max-miss : Grace frames allowed after last successful match before a tracked ROI is purged.
flags: readable, writable
Integer. Default: 1 (range 0..30)
iou-threshold : Intersection-over-Union threshold (0..1) used to match raw motion rectangles to existing tracked ROIs.
flags: readable, writable
Double. Default: 0.3
smooth-alpha : Exponential moving average smoothing factor (0..1) for ROI coordinates; higher values follow motion faster.
flags: readable, writable
Double. Default: 0.5
confirm-frames : Consecutive frames required to confirm a motion block. 1 = immediate single-frame detection (most sensitive).
flags: readable, writable
Integer. Default: 1 (range 1..10)
pixel-diff-threshold : Per-pixel absolute luma difference threshold (1..255) used when building the motion mask; lower values increase sensitivity.
flags: readable, writable
Integer. Default: 15
min-rel-area : Minimum relative frame area (0..0.25) a motion rectangle must cover to be considered (filters tiny noise boxes).
flags: readable, writable
Double. Default: 0.0005
name : The name of the element instance.
flags: readable, writable
String. Default: "gvamotiondetect0"
parent : The parent object.
flags: readable, writable
Object of type "GstObject"
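The documented ranges can be checked before a pipeline is launched. The following is a minimal sketch, not part of the element: the `PROPERTIES` table and the `element_description` helper are hypothetical names, the values are copied from the property list above, and the ranges are assumed to be inclusive at both ends.

```python
# Types, ranges, and defaults copied from the property list above.
# Assumption: documented ranges are inclusive at both ends.
PROPERTIES = {
    "block-size": (int, 16, 512, 64),
    "motion-threshold": (float, 0.0, 1.0, 0.05),
    "min-persistence": (int, 1, 30, 2),
    "max-miss": (int, 0, 30, 1),
    "iou-threshold": (float, 0.0, 1.0, 0.3),
    "smooth-alpha": (float, 0.0, 1.0, 0.5),
    "confirm-frames": (int, 1, 10, 1),
    "pixel-diff-threshold": (int, 1, 255, 15),
    "min-rel-area": (float, 0.0, 0.25, 0.0005),
}


def element_description(**overrides):
    """Build a 'gvamotiondetect key=value ...' fragment after range checks."""
    parts = ["gvamotiondetect"]
    for key, value in overrides.items():
        name = key.replace("_", "-")  # keyword names cannot contain '-'
        if name not in PROPERTIES:
            raise KeyError(f"unknown property: {name}")
        kind, low, high, _default = PROPERTIES[name]
        if kind is int and int(value) != value:
            raise TypeError(f"{name} expects an integer, got {value!r}")
        value = kind(value)
        if not low <= value <= high:
            raise ValueError(f"{name}={value} is outside {low}..{high}")
        parts.append(f"{name}={value}")
    return " ".join(parts)


print(element_description(confirm_frames=2, min_persistence=3,
                          pixel_diff_threshold=20))
# gvamotiondetect confirm-frames=2 min-persistence=3 pixel-diff-threshold=20
```

The returned fragment uses the usual `element property=value` pipeline-description syntax, so it can be placed between NV12 upstream and downstream elements in a `gst-launch-1.0` command line.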
Relationship between confirm-frames and min-persistence:
These two controls act at different stages and are not interchangeable:
- confirm-frames works at the raw per-block motion stage. It requires N consecutive frames of activity in the same grid block before that block contributes to a raw motion rectangle. It filters out flicker/noise early.
- min-persistence applies after rectangles are merged and tracked. A tracked ROI must be matched for N frames (its age) before it is emitted. It guards against publishing unstable, short-lived motion regions.
Effect: Raising confirm-frames reduces how many raw rectangles enter tracking (front-end suppression). Raising min-persistence delays emission of tracked ROIs (back-end stabilization). The two can be combined: for example, confirm-frames=2 with min-persistence=2 produces rectangles only after two agreeing frames and then requires two tracked frames before output (at least 2 frames in total, possibly 3 if initial confirmation overlaps). Keeping confirm-frames=1 with min-persistence>1 allows immediate raw detection but still waits for persistence before emission.
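The interaction of the two stages can be illustrated with a small simulation for a single block. This is a simplified sketch under stated assumptions, not the element's code: `first_emission_frame` is a hypothetical helper, a track's age is assumed to start at 1 on the first frame its rectangle appears, and the max-miss grace is ignored. The element's actual counting may differ by a frame.

```python
def first_emission_frame(active, confirm_frames=1, min_persistence=2):
    """Return the 1-based frame at which an ROI is first emitted, or None.

    active: per-frame booleans, True when the block exceeds motion-threshold.
    """
    streak = 0  # consecutive active frames seen for the block (front end)
    age = 0     # frames the tracked ROI has existed (back end)
    for frame, is_active in enumerate(active, start=1):
        streak = streak + 1 if is_active else 0
        if streak >= confirm_frames:
            age += 1  # block is confirmed, so its rectangle is tracked
        else:
            age = 0   # simplification: no max-miss grace, track is dropped
        if age >= min_persistence:
            return frame
    return None


steady = [True] * 6
flicker = [True, False, True, False, True, False]

print(first_emission_frame(steady, confirm_frames=1, min_persistence=1))   # 1
print(first_emission_frame(steady, confirm_frames=1, min_persistence=2))   # 2
print(first_emission_frame(steady, confirm_frames=2, min_persistence=2))   # 3
print(first_emission_frame(flicker, confirm_frames=2, min_persistence=1))  # None
```

Under these assumptions steady motion is delayed by roughly confirm-frames + min-persistence - 1 frames, while alternating single-frame flicker never passes the front-end stage once confirm-frames is 2 or more.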
Metadata Output:
- Each emitted motion ROI is attached as a `GstVideoRegionOfInterestMeta` with label "motion".
- A `GstAnalyticsRelationMeta` aggregates object detection metadata entries (type quark "motion") for all ROIs on the frame.
- ROI coordinates are normalized internally for analytics structures and rounded to 3 decimal places to reduce payload size.
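As an illustration of the normalization and rounding described above, the sketch below converts a pixel rectangle to normalized coordinates with 3 decimal places. It assumes normalization divides by the frame width and height; `normalize_roi` is a hypothetical helper, not an API of the element.

```python
def normalize_roi(x, y, w, h, frame_width, frame_height, digits=3):
    """Scale a pixel rectangle to the 0..1 range and round each value."""
    values = (x / frame_width, y / frame_height,
              w / frame_width, h / frame_height)
    return tuple(round(v, digits) for v in values)


print(normalize_roi(640, 360, 320, 180, 1920, 1080))
# (0.333, 0.333, 0.167, 0.167)
```

With 3 decimal places the normalized values move in steps of 0.001, which corresponds to about 1.9 pixels on a 1920-pixel-wide frame.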
Algorithm Summary:
1. Acquire current frame luma plane (VA surface fast path or system-memory conversion) and downscale to a working size.
2. Build motion mask: absdiff(previous, current) -> GaussianBlur -> threshold (pixel-diff-threshold) -> morphology (open + dilate).
3. Grid scan: accumulate changed pixels per block, compute ratio; mark blocks exceeding motion-threshold.
4. Temporal confirmation: if confirm-frames > 1 require consecutive active frames per block before rectangle creation.
5. Merge overlapping rectangles, track over time with IoU matching and exponential smoothing (smooth-alpha).
6. Apply persistence (min-persistence) and miss grace (max-miss), then emit stable ROIs with associated analytics metadata.
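Steps 3 and 5 can be sketched in plain Python. This is an illustrative approximation rather than the element's implementation: it takes the binary motion mask from step 2 as given, and the helper names (`active_blocks`, `iou`, `smooth`) and the `(x, y, w, h)` rectangle layout are assumptions made for the example.

```python
def active_blocks(mask, block_size, motion_threshold):
    """Step 3: return (x, y, w, h) blocks whose changed-pixel ratio
    exceeds motion_threshold. mask is a list of rows of 0/1 values."""
    height, width = len(mask), len(mask[0])
    blocks = []
    for by in range(0, height, block_size):
        for bx in range(0, width, block_size):
            bh = min(block_size, height - by)  # clip blocks at frame edges
            bw = min(block_size, width - bx)
            changed = sum(mask[y][x]
                          for y in range(by, by + bh)
                          for x in range(bx, bx + bw))
            if changed / (bw * bh) > motion_threshold:
                blocks.append((bx, by, bw, bh))
    return blocks


def iou(a, b):
    """Step 5: Intersection-over-Union of two (x, y, w, h) rectangles."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    ix = max(0, min(ax + aw, bx + bw) - max(ax, bx))
    iy = max(0, min(ay + ah, by + bh) - max(ay, by))
    inter = ix * iy
    union = aw * ah + bw * bh - inter
    return inter / union if union else 0.0


def smooth(previous, current, alpha):
    """Step 5: exponential moving average; higher alpha follows the
    current rectangle faster."""
    return tuple(alpha * c + (1 - alpha) * p
                 for p, c in zip(previous, current))


mask = [
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 1],
]
print(active_blocks(mask, block_size=2, motion_threshold=0.5))  # [(0, 0, 2, 2)]

tracked, detected = (0, 0, 10, 10), (5, 0, 10, 10)
if iou(tracked, detected) >= 0.3:           # default iou-threshold
    print(smooth(tracked, detected, 0.5))   # (2.5, 0.0, 10.0, 10.0)
```

In the example the two rectangles overlap with an IoU of 1/3, so they match at the default iou-threshold of 0.3, and the default smooth-alpha of 0.5 places the tracked box halfway between its previous and detected positions.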
Tuning Guidelines:
- Increase `pixel-diff-threshold` to reduce noise sensitivity (e.g., minor lighting flicker). Decrease for subtle motion.
- Increase `motion-threshold` to require more changed pixels in a block (filters small localized changes).
- Raise `confirm-frames` (e.g., 2-3) to suppress transient single-frame spikes; keep at 1 for maximal responsiveness.
- Adjust `min-persistence` if you want to delay ROI publication until sustained movement is observed.
- Lower `iou-threshold` if objects move rapidly and fail to match across frames; raise to avoid merging nearby independent motions.
- Set `smooth-alpha` closer to 1.0 for minimal smoothing (snappier boxes) or lower for steadier boxes.
- Raise `min-rel-area` to suppress very small (potentially noisy) motion rectangles; lower it to allow detection of tiny or distant objects. The default of 0.0005 equals 0.05% of the frame area.
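To make `min-rel-area` concrete, the relative value can be converted to a pixel area for common resolutions. This is a small arithmetic sketch; `min_area_pixels` is a hypothetical helper, and the area is assumed to be measured relative to the full frame.

```python
def min_area_pixels(frame_width, frame_height, min_rel_area=0.0005):
    """Smallest rectangle area, in pixels, implied by min-rel-area."""
    return min_rel_area * frame_width * frame_height


for width, height in [(1280, 720), (1920, 1080), (3840, 2160)]:
    area = min_area_pixels(width, height)
    side = area ** 0.5  # side of a square with the same area
    print(f"{width}x{height}: {area:.1f} px^2 (about {side:.0f}x{side:.0f})")
```

At 1920x1080 the default works out to roughly 1037 square pixels, about a 32x32 box, so doubling `min-rel-area` doubles the minimum area while raising the equivalent side length by only about 41%.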
Performance Notes:
- VA-API path uses hardware surface mapping and optional hardware downscale (via `vaBlitSurface` when available) to minimize memory bandwidth.
- System-memory path performs software resize; consider tuning `block-size` to balance grid granularity against CPU cost.
- Internal working resolution scales proportionally to input width when downscaling; large frames benefit more from confirmation and smoothing.
Limitations / Future Improvements:
- Only NV12 format is currently supported (both system memory and VAMemory).
- Block merging uses a simple O(n^2) combination; extremely dense motion may produce fewer, larger ROIs.
- No explicit per-ROI confidence beyond binary motion presence (confidence fixed to 1.0 in analytics metadata).
- Coordinate rounding (3 decimal places) applied for analytics metadata; raw ROI meta stores integer pixel coordinates.
```sh
Plugin Registration:
Name: gvamotiondetect
Classification: Filter/Video
Description: Automatically uses VA surface path when VAMemory caps negotiated; otherwise system memory path