Cluster Analytics Service#

The Cluster Analytics service provides object clustering and movement analysis for Intel® SceneScape. It uses the DBSCAN (Density-Based Spatial Clustering of Applications with Noise) algorithm, combined with geometric shape detection and velocity pattern classification.

This service processes real-time object detection data from Intel® SceneScape scenes, applies machine learning-based clustering algorithms, and provides comprehensive analytics including:

  • Spatial Clustering: Groups objects by proximity using the DBSCAN algorithm with user-configurable parameters

  • Cluster Tracking: Tracks clusters across frames with state-based lifecycle management (NEW → ACTIVE → STABLE → FADING → LOST)

  • Shape Analysis: Detects geometric patterns (circle, rectangle, line, irregular) with size measurements

  • Velocity Analysis: Classifies movement patterns and tracks cluster dynamics

Deployment#

Build from Source#

See Build from Source for instructions on building the service.

Architecture#

Note: Diagrams are currently best viewed in light color mode.

Data Flow Diagram#

        sequenceDiagram

    participant APP as Applications
    participant CA as Cluster Analytics
    participant MQTT as MQTT Broker
    participant SC as Scene Controller


    MQTT->>SC: Detections metadata
    Note over SC: Base analytics
    SC->>MQTT: Objects metadata
    MQTT->>CA: Objects metadata

    Note over CA: User-configurable DBSCAN clustering
    Note over CA: Cluster's shape and velocity analysis

    CA->>MQTT: Optimized clusters metadata
    Note over APP: Real-time cluster insights
    MQTT->>APP:
    

DBSCAN Clustering Configuration#

User-Configurable Parameters#

The config.json file allows customization of DBSCAN clustering parameters:

  • eps - Maximum distance (in meters) between objects to be considered in the same cluster

  • min_samples - Minimum number of objects required to form a cluster

These parameters can be configured globally (default) or per object category.

Configuration File Structure#

The service uses a config.json file located in the config/ directory:

{
  "dbscan": {
    "default": {
      "eps": 1,
      "min_samples": 3
    },
    "category_specific": {
      "person": {
        "eps": 2,
        "min_samples": 2
      },
      "vehicle": {
        "eps": 4.0,
        "min_samples": 2
      },
      "bicycle": {
        "eps": 1.5,
        "min_samples": 2
      },
      "motorcycle": {
        "eps": 2.5,
        "min_samples": 2
      },
      "truck": {
        "eps": 5.0,
        "min_samples": 2
      },
      "bus": {
        "eps": 6.0,
        "min_samples": 2
      }
    }
  }
}

Parameter Descriptions#

  • default: Fallback parameters for object categories not explicitly configured

  • category_specific: Per-category parameters optimized for different object types:

    • person - Optimized for people clustering (social distancing, queues)

    • vehicle - Optimized for vehicle parking, traffic clusters

    • bicycle - Optimized for bike racks, group riding

    • motorcycle - Moderate spacing for motorcycle clusters

    • truck - Large vehicle spacing requirements

    • bus - Bus stops, depot formations
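The fallback behavior described above can be sketched as follows. This is a minimal illustration, not the service's actual loader: the function name `resolve_dbscan_params` and the inline `CONFIG_TEXT` are hypothetical, and only the JSON structure comes from the configuration file shown earlier.

```python
import json

# Hypothetical, trimmed copy of config.json using the structure shown above.
CONFIG_TEXT = """
{
  "dbscan": {
    "default": {"eps": 1, "min_samples": 3},
    "category_specific": {
      "person": {"eps": 2, "min_samples": 2},
      "bus": {"eps": 6.0, "min_samples": 2}
    }
  }
}
"""

def resolve_dbscan_params(config: dict, category: str) -> dict:
    """Return the per-category parameters, falling back to the defaults."""
    dbscan = config["dbscan"]
    params = dbscan.get("category_specific", {}).get(category, dbscan["default"])
    # Normalize types so integer and float eps values behave the same.
    return {"eps": float(params["eps"]), "min_samples": int(params["min_samples"])}

config = json.loads(CONFIG_TEXT)
person_params = resolve_dbscan_params(config, "person")     # category-specific entry
unknown_params = resolve_dbscan_params(config, "forklift")  # not configured, uses default
print(person_params, unknown_params)
```

A category without its own entry (here the made-up `forklift`) resolves to the `default` block, which matches the fallback rule in the parameter descriptions.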

Shape Detection and Analysis#

  • ML-based Shape Classification: Detects geometric patterns using feature extraction

  • Size Calculations: Provides precise measurements for each detected shape type

  • Supported Shapes:

    • Circle: radius, diameter, area, circumference

    • Rectangle: width, height, area, perimeter, corner points

    • Line: length, endpoints, width spread

    • Irregular: bounding box dimensions, point spread

Shape Detection Logic#

        flowchart TD
    A[Cluster Points Input] --> B{Sufficient Points?}
    B -->|< 3 points| C[Insufficient Points]
    B -->|≥ 3 points| D[Calculate Features]

    D --> E[Extract Distance and Angle Features]
    E --> F[Calculate Centroid]
    F --> G[Measure Distance Variance]

    G --> H{Distance Variance < 0.5?}
    H -->|Yes| I[Circle Formation]
    H -->|No| J{Exactly 4 Points?}

    J -->|Yes| K[Check Quadrant Distribution]
    K --> L{≥ 3 Quadrants?}
    L -->|Yes| M[Rectangle Formation]
    L -->|No| N[Continue Analysis]

    J -->|No| O{≥ 5 Points?}
    O -->|Yes| P[Analyze Angle Distribution]
    P --> Q{Uniform Distribution?}
    Q -->|Yes| R[Large Circle Formation]
    Q -->|No| S[Check Linear Formation]

    S --> T{Low Triangle Areas?}
    T -->|Yes| U[Line Formation]
    T -->|No| V[Irregular Shape]

    O -->|No| N
    N --> S

    %% Shape calculations
    I --> I1[Calculate: radius, diameter, area, circumference]
    M --> M1[Calculate: width, height, area, perimeter, corners]
    R --> R1[Calculate: radius, diameter, area, circumference]
    U --> U1[Calculate: length, endpoints, width spread]
    V --> V1[Calculate: bounding box, point spread]
    

Velocity Analysis and Movement Patterns#

  • Movement Classification: 6 distinct movement patterns

  • Velocity Statistics: Comprehensive speed and direction analysis

  • Pattern Types:

    • stationary - Objects with minimal movement

    • coordinated_parallel - Synchronized movement in same direction

    • converging - Objects moving toward cluster center

    • diverging - Objects moving away from cluster center

    • loosely_coordinated - Some coordination but not highly synchronized

    • chaotic - Random or unpredictable movement patterns

Velocity Analysis Logic#

        graph TD
    A[Velocity Analysis] --> B{Speed Check}
    B -->|< 0.1 m/s| C[Stationary]
    B -->|> 0.1 m/s| D{Coherence Check}
    D -->|High Coherence| E[Coordinated Parallel]
    D -->|Low Coherence| F{Direction Analysis}
    F -->|Toward Center| G[Converging]
    F -->|Away from Center| H[Diverging]
    F -->|Mixed| I[Chaotic]
    

Category-Specific Clustering#

The service optimizes DBSCAN parameters based on object categories, providing more accurate clustering for different object types.

Benefits#

  • Optimized Parameters: Each object type uses clustering parameters optimized for its spatial characteristics

  • Better Accuracy: Improved clustering accuracy by considering object-specific grouping behaviors

  • Automatic Selection: Parameters are selected based on detected object category

  • Fallback Support: Unknown categories use sensible default parameters

Category Optimization Examples#

| Category | eps (meters) | min_samples | Rationale |
| --- | --- | --- | --- |
| person | 2.0 | 2 | Social distancing, queue formations |
| vehicle | 4.0 | 2 | Parking lots, traffic clusters |
| bicycle | 1.5 | 2 | Bike racks, tight group riding |
| motorcycle | 2.5 | 2 | Moderate spacing for motorcycle clusters |
| truck | 5.0 | 2 | Large vehicle spacing requirements |
| bus | 6.0 | 2 | Bus stops, depot formations |
| default | 1.0 | 3 | Fallback for unknown categories |

Usage in Analysis#

The service automatically applies appropriate parameters when processing each object category, with user customizations taking precedence:

# Dynamic parameter selection with user overrides
for category, objects in objects_by_category.items():
    # Get user-configured parameters for this scene and category
    dbscan_params = self.get_dbscan_params_for_category(category, scene_id)
    clustering = DBSCAN(eps=dbscan_params['eps'],
                       min_samples=dbscan_params['min_samples'])

Cluster Tracking System#

The service includes advanced temporal tracking with state transitions and confidence scoring. These parameters are currently hardcoded constants in the implementation and are not user-configurable through config.json.

State Transition Parameters (Hardcoded)#

| Parameter | Value | Description |
| --- | --- | --- |
| FRAMES_TO_ACTIVATE | 3 | Frames needed to transition NEW → ACTIVE |
| FRAMES_TO_STABLE | 20 | Frames needed for ACTIVE → STABLE |
| FRAMES_TO_FADE | 15 | Missed frames before FADING state |
| FRAMES_TO_LOST | 10 | Missed frames before LOST state |

Confidence Parameters (Hardcoded)#

| Parameter | Value | Description |
| --- | --- | --- |
| INITIAL_CONFIDENCE | 0.5 | Starting confidence for new clusters |
| ACTIVATION_THRESHOLD | 0.6 | Confidence needed for activation |
| STABILITY_THRESHOLD | 0.7 | Confidence needed for stable state |
| CONFIDENCE_MISS_PENALTY | 0.1 | Confidence penalty per missed frame |
| CONFIDENCE_MAX_MISS_PENALTY | 0.5 | Maximum cumulative miss penalty |
| CONFIDENCE_LONGEVITY_BONUS_MAX | 0.2 | Maximum bonus for long-term tracking |
| CONFIDENCE_LONGEVITY_FRAMES | 100 | Frames to reach max longevity bonus |

Archival Parameters (Hardcoded)#

| Parameter | Value | Description |
| --- | --- | --- |
| ARCHIVE_TIME_THRESHOLD | 5.0 | Seconds before archiving lost clusters |
| MAX_ARCHIVED_CLUSTERS | 50 | Maximum number of archived clusters |

Cluster Lifecycle States#

| State | Description | Transition Trigger |
| --- | --- | --- |
| NEW | Just detected, awaiting confirmation | Initial detection |
| ACTIVE | Confirmed and consistently detected | 3+ consecutive detections, confidence > 0.6 |
| STABLE | Long-term stable presence | 20+ frames detected, stability > 0.7 |
| FADING | Recently missed detections | 15+ consecutive missed frames |
| LOST | Not detected for extended period | 10+ consecutive missed frames |
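The lifecycle above can be read as a small state machine. The sketch below is one possible interpretation, not the service's implementation: the function `next_state` and its arguments are hypothetical, and it assumes that a FADING cluster becomes LOST once it has missed `FRAMES_TO_LOST` further frames after the `FRAMES_TO_FADE` frames that made it fade. The source does not say whether the miss counter restarts, so treat that detail as an assumption.

```python
from enum import Enum

class ClusterState(Enum):
    NEW = "new"
    ACTIVE = "active"
    STABLE = "stable"
    FADING = "fading"
    LOST = "lost"

# Values from the hardcoded parameter tables above.
FRAMES_TO_ACTIVATE = 3
FRAMES_TO_STABLE = 20
FRAMES_TO_FADE = 15
FRAMES_TO_LOST = 10
ACTIVATION_THRESHOLD = 0.6
STABILITY_THRESHOLD = 0.7

def next_state(state, detected, frames_detected, frames_missed, confidence, stability):
    """Return the state after one frame; frames_missed counts consecutive misses."""
    if detected:
        if state is ClusterState.FADING:
            return ClusterState.ACTIVE  # redetected
        if (state is ClusterState.NEW and frames_detected >= FRAMES_TO_ACTIVATE
                and confidence > ACTIVATION_THRESHOLD):
            return ClusterState.ACTIVE
        if (state is ClusterState.ACTIVE and frames_detected >= FRAMES_TO_STABLE
                and stability > STABILITY_THRESHOLD):
            return ClusterState.STABLE
        return state
    if state in (ClusterState.ACTIVE, ClusterState.STABLE) and frames_missed >= FRAMES_TO_FADE:
        return ClusterState.FADING
    # Assumption: the LOST threshold is counted on top of the FADING threshold.
    if state is ClusterState.FADING and frames_missed >= FRAMES_TO_FADE + FRAMES_TO_LOST:
        return ClusterState.LOST
    return state

print(next_state(ClusterState.NEW, True, 3, 0, 0.7, 0.0))
```

Keeping the transition logic in a pure function like this makes each rule in the table easy to test in isolation.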

Confidence Calculation#

Cluster tracking confidence is calculated using:

# Base confidence from detection ratio
base_confidence = frames_detected / total_frames

# Penalty for recent misses
miss_penalty = min(frames_missed * 0.1, 0.5)

# Bonus for long-term tracking
longevity_bonus = min(frames_detected / 100, 0.2)

# Final confidence (clamped 0-1)
confidence = clamp(base_confidence - miss_penalty + longevity_bonus, 0.0, 1.0)
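As a runnable version of the formula above, the following sketch spells out the clamp and uses the hardcoded constants from the confidence parameter table. The function name `tracking_confidence` and the zero-frame guard are assumptions added for illustration; the arithmetic follows the formula exactly as written.

```python
# Constants from the hardcoded confidence parameters listed earlier.
CONFIDENCE_MISS_PENALTY = 0.1
CONFIDENCE_MAX_MISS_PENALTY = 0.5
CONFIDENCE_LONGEVITY_BONUS_MAX = 0.2
CONFIDENCE_LONGEVITY_FRAMES = 100

def tracking_confidence(frames_detected: int, frames_missed: int, total_frames: int) -> float:
    """Detection ratio minus miss penalty plus longevity bonus, clamped to [0, 1]."""
    if total_frames <= 0:
        return 0.0  # assumption: no observations means no confidence
    base_confidence = frames_detected / total_frames
    miss_penalty = min(frames_missed * CONFIDENCE_MISS_PENALTY, CONFIDENCE_MAX_MISS_PENALTY)
    longevity_bonus = min(frames_detected / CONFIDENCE_LONGEVITY_FRAMES,
                          CONFIDENCE_LONGEVITY_BONUS_MAX)
    return max(0.0, min(1.0, base_confidence - miss_penalty + longevity_bonus))

# 8 of 10 frames detected, 2 recent misses: 0.8 - 0.2 + 0.08
print(round(tracking_confidence(8, 2, 10), 2))
```

With many misses the penalty saturates at 0.5, so a cluster seen in only 1 of 10 frames clamps to 0.0 rather than going negative.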

WebUI Features and Real-time Visualization#

The integrated WebUI provides a comprehensive interface for cluster analysis monitoring and configuration:

Interactive Visualization#

  • Real-time Canvas: Live updating visualization of objects and clusters

  • Pan and Zoom: Navigate through scene data with mouse controls

  • Object Display: Individual objects colored by cluster assignment

  • Cluster Shapes: Visual representation of detected cluster geometries

  • Movement Vectors: Optional display of cluster movement with adjustable scaling

  • Auto-fit: Automatic view adjustment to focus on current scene data

Dynamic Parameter Configuration#

  • Per-Category Controls: Independent parameter adjustment for each object category

  • Real-time Updates: Changes apply immediately with automatic re-clustering

  • Scene-Specific Settings: Each scene maintains its own parameter configuration

  • Reset to Defaults: Quick restoration of default parameters per category

  • Visual Feedback: Immediate visualization of parameter change effects

Scene Management#

  • Multi-Scene Support: Switch between available scenes dynamically

  • Auto-Discovery: Scenes are automatically discovered from MQTT traffic

  • Current Data Focus: Always displays current state without historic accumulation

  • Object Count Display: Real-time object and cluster statistics

Advanced Controls#

  • Refresh Rate: Configurable from real-time to custom intervals

  • Movement Vector Scaling: Adjustable visualization scale for velocity vectors

  • Connection Status: Live MQTT connection monitoring

  • Parameter Validation: Intelligent validation based on actual scene data

Insufficient Points Handling#

  • Individual Object Coloring: Objects are colored by category when clusters cannot be formed

  • Clear Messaging: Visual indication when clustering is not possible

  • Dynamic Thresholds: Uses user-configured min_samples rather than global defaults

MQTT Topics and Data Flow#

Input Topics#

  • Topic: scenescape/regulated/scene/{scene_id}

  • Purpose: Receives object detection data from Intel® SceneScape scenes

  • Format: JSON with objects array and scene metadata

  • Contains: Scene name, timestamp, object detections with world coordinates

Output Topics#

  • Topic: scenescape/analytics/clusters/{scene_id}

  • Purpose: Publishes cluster analysis results

  • QoS: 1 (at least once delivery)

  • Optimized Structure: Contains only cluster data without redundant scene metadata

Topic Structure Changes#

Recent Optimization: Scene identification is now derived from topic structure rather than payload content:

  • Scene ID: Extracted from topic path ({scene_id} component)

  • Scene Name: Retrieved from DATA_REGULATED topic

  • Cluster Data: Published to ANALYTICS_CLUSTERS contains only analysis results
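Deriving the scene ID from the topic path can be sketched as below. The helper names `scene_id_from_topic` and `clusters_topic` are hypothetical; only the two topic prefixes are taken from the Input Topics and Output Topics sections.

```python
from typing import Optional

# Topic prefixes from the Input Topics and Output Topics sections.
REGULATED_PREFIX = "scenescape/regulated/scene/"
CLUSTERS_PREFIX = "scenescape/analytics/clusters/"

def scene_id_from_topic(topic: str, prefix: str = REGULATED_PREFIX) -> Optional[str]:
    """Return the trailing {scene_id} component, or None if the topic does not match."""
    if not topic.startswith(prefix):
        return None
    scene_id = topic[len(prefix):]
    # Reject empty IDs and deeper sub-topics.
    if not scene_id or "/" in scene_id:
        return None
    return scene_id

def clusters_topic(scene_id: str) -> str:
    """Build the output topic for a scene."""
    return f"{CLUSTERS_PREFIX}{scene_id}"

incoming = "scenescape/regulated/scene/3bc091c7-e449-46a0-9540-29c499bca18c"
scene_id = scene_id_from_topic(incoming)
print(scene_id, clusters_topic(scene_id))
```

Returning `None` for non-matching topics lets a message handler ignore unrelated traffic without raising.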

Output Data Structure#

The Cluster Analytics service publishes optimized cluster metadata in batch format. Note: Scene identification is extracted from topic structure, not payload content.

Cluster Batch Format#

{
  "scene_id": "3bc091c7-e449-46a0-9540-29c499bca18c",
  "scene_name": "Retail",
  "timestamp": "2025-10-21T09:16:41.377Z",
  "total_clusters": 2,
  "clusters": [
    {
      "cluster_id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
      "category": "person",
      "objects_in_cluster": 8,
      "cluster_center": {
        "x": 4.291512867202579,
        "y": 4.934464049998539
      },
      "shape_analysis": {
        "shape": "circle",
        "size": {
          "radius": 0.38788961696255303,
          "diameter": 0.7757792339251061,
          "area": 0.4726788625738194,
          "circumference": 2.437182342106631
        }
      },
      "velocity_analysis": {
        "movement_type": "chaotic",
        "average_velocity": [-0.19217192568910546, -0.0763952946379476, 0.0],
        "velocity_magnitude": 0.20680012104899237,
        "movement_direction_degrees": -158.32038869788497,
        "velocity_coherence": 0.0
      },
      "object_ids": [
        "69de7c1c-21da-45bc-ae45-2f1d3d16d5b2",
        "5baec5fa-c961-4dc0-a254-f1f614292619",
        "bf1923d8-ac12-4042-9e76-9b57b351efcb",
        "e6333708-3793-4e44-9b29-e1b7e0e7977c",
        "d9b6d6a9-d390-47a4-a9b8-95af121103ca",
        "9be324af-c0a5-4495-bae6-33d251e88366",
        "166ba387-9b4e-406d-b236-a30bb274a800",
        "71a1b1f6-8e14-4a22-a656-011fa4405c43"
      ],
      "dbscan_params": {
        "eps": 0.5,
        "min_samples": 3,
        "category": "person"
      },
      "tracking": {
        "tracking_id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
        "state": "active",
        "confidence": 0.875,
        "stability_score": 0.623,
        "frames_detected": 15,
        "frames_missed": 0,
        "age_seconds": 2.5,
        "time_since_last_seen": 0.03,
        "first_seen": 1729501599.234,
        "last_seen": 1729501601.734,
        "predicted_position": {
          "x": 4.32,
          "y": 4.91
        }
      }
    }
  ],
  "summary": {
    "categories": ["person"],
    "total_objects_in_clusters": 8
  },
  "tracking_statistics": {
    "active_clusters": 2,
    "archived_clusters": 5,
    "clusters_by_state": {
      "new": 0,
      "active": 1,
      "stable": 1,
      "fading": 0,
      "lost": 0
    },
    "tracked_scenes": 2,
    "tracked_categories": 1
  }
}

Field Descriptions#

Batch-Level Fields#

| Field | Type | Description |
| --- | --- | --- |
| scene_id | String | Unique scene identifier (UUID) |
| scene_name | String | Human-readable scene name |
| timestamp | String | ISO 8601 timestamp when clusters were detected |
| total_clusters | Integer | Total number of clusters in this batch |
| clusters | Array | Array of individual cluster objects |
| summary.categories | Array | List of object categories that formed clusters |
| summary.total_objects_in_clusters | Integer | Total objects across all clusters |
| tracking_statistics | Object | Global tracking system statistics |

Individual Cluster Fields#

| Field | Type | Description |
| --- | --- | --- |
| cluster_id | String | Unique persistent cluster UUID |
| category | String | Object detection category (person, vehicle, etc.) |
| objects_in_cluster | Integer | Number of objects forming the cluster |
| object_ids | Array | List of object UUIDs that form this cluster |
| dbscan_params | Object | User-configured DBSCAN parameters used |
| tracking | Object | Temporal tracking metadata (see below) |

Spatial Information#

| Field | Type | Description |
| --- | --- | --- |
| cluster_center.x | Float | X coordinate of cluster centroid (world coordinates) |
| cluster_center.y | Float | Y coordinate of cluster centroid (world coordinates) |

Shape Analysis#

| Field | Type | Description |
| --- | --- | --- |
| shape_analysis.shape | String | Detected shape type: circle, rectangle, line, irregular |
| shape_analysis.size | Object | Shape-specific measurements (varies by shape type) |

Shape-Specific Size Fields#

Circle:

  • radius - Circle radius in meters

  • diameter - Circle diameter in meters

  • area - Circle area in square meters

  • circumference - Circle circumference in meters

Rectangle:

  • width - Rectangle width in meters

  • height - Rectangle height in meters

  • area - Rectangle area in square meters

  • perimeter - Rectangle perimeter in meters

  • corner_points - Array of [x,y] corner coordinates

Line:

  • length - Line length in meters

  • endpoints - Array of two [x,y] endpoint coordinates

  • width_spread - Standard deviation of perpendicular distances

Irregular:

  • bounding_width - Bounding box width in meters

  • bounding_height - Bounding box height in meters

  • bounding_area - Bounding box area in square meters

  • point_spread - Standard deviation of distances from centroid
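The size fields for two of these shapes can be computed as sketched below. The helper names `circle_size` and `irregular_size` are hypothetical, and the use of the population standard deviation for `point_spread` is an assumption. The circle formulas are standard geometry and reproduce the values in the sample payload shown earlier.

```python
import math
from statistics import pstdev

def circle_size(radius: float) -> dict:
    """Circle measurements in meters and square meters."""
    return {
        "radius": radius,
        "diameter": 2 * radius,
        "area": math.pi * radius ** 2,
        "circumference": 2 * math.pi * radius,
    }

def irregular_size(points: list) -> dict:
    """Axis-aligned bounding box and spread of (x, y) points around their centroid."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    width, height = max(xs) - min(xs), max(ys) - min(ys)
    cx, cy = sum(xs) / len(xs), sum(ys) / len(ys)
    distances = [math.hypot(x - cx, y - cy) for x, y in points]
    return {
        "bounding_width": width,
        "bounding_height": height,
        "bounding_area": width * height,
        "point_spread": pstdev(distances),  # assumption: population standard deviation
    }

sample_circle = circle_size(0.38788961696255303)  # radius from the sample payload
sample_box = irregular_size([(0, 0), (4, 0), (4, 2), (0, 2)])
print(sample_circle["area"], sample_box["bounding_area"])
```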

Velocity Analysis#

| Field | Type | Description |
| --- | --- | --- |
| movement_type | String | Classified movement pattern |
| average_velocity | Array[Float] | [vx, vy, vz] average velocity vector in m/s |
| velocity_magnitude | Float | Average speed magnitude in m/s |
| movement_direction_degrees | Float | Movement direction in degrees (-180 to 180) |
| velocity_coherence | Float | Movement synchronization measure (0-1) |
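The relationship between `average_velocity`, `velocity_magnitude`, and `movement_direction_degrees` can be illustrated with a short sketch. The function name `summarize_velocity` is hypothetical, and computing both values from the x and y components of the averaged vector is an assumption, although it is consistent with the numbers in the sample payload shown earlier.

```python
import math

def summarize_velocity(velocities: list) -> dict:
    """Average (vx, vy, vz) vectors, then derive planar speed and heading."""
    if not velocities:
        raise ValueError("at least one velocity vector is required")
    count = len(velocities)
    average = [sum(v[axis] for v in velocities) / count for axis in range(3)]
    return {
        "average_velocity": average,
        # Assumption: magnitude and direction use the ground-plane components only.
        "velocity_magnitude": math.hypot(average[0], average[1]),
        # atan2 yields an angle in (-180, 180] degrees, matching the documented range.
        "movement_direction_degrees": math.degrees(math.atan2(average[1], average[0])),
    }

# Average velocity taken from the sample cluster payload.
sample = summarize_velocity([(-0.19217192568910546, -0.0763952946379476, 0.0)])
print(sample["velocity_magnitude"], sample["movement_direction_degrees"])
```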

Tracking Metadata#

| Field | Type | Description |
| --- | --- | --- |
| tracking.tracking_id | String | Persistent cluster UUID (same as cluster_id) |
| tracking.state | String | Current lifecycle state (new/active/stable/fading/lost) |
| tracking.confidence | Float | Tracking confidence score (0-1) |
| tracking.stability_score | Float | Cluster stability metric (0-1) |
| tracking.frames_detected | Integer | Total frames where cluster was detected |
| tracking.frames_missed | Integer | Consecutive frames where cluster was not detected |
| tracking.age_seconds | Float | Time since first detection (seconds) |
| tracking.time_since_last_seen | Float | Time since last detection (seconds) |
| tracking.first_seen | Float | Unix timestamp of first detection |
| tracking.last_seen | Float | Unix timestamp of last detection |
| tracking.predicted_position.x | Float | Predicted X coordinate for next frame |
| tracking.predicted_position.y | Float | Predicted Y coordinate for next frame |

Tracking Statistics#

| Field | Type | Description |
| --- | --- | --- |
| tracking_statistics.active_clusters | Integer | Total active clusters across all scenes |
| tracking_statistics.archived_clusters | Integer | Total archived (lost) clusters |
| tracking_statistics.clusters_by_state | Object | Count of clusters in each lifecycle state |
| tracking_statistics.tracked_scenes | Integer | Number of scenes with active clusters |
| tracking_statistics.tracked_categories | Integer | Number of object categories being tracked |

Movement Pattern Classifications#

| Pattern | Description | Criteria |
| --- | --- | --- |
| stationary | Minimal movement | Average speed < 0.1 m/s |
| coordinated_parallel | Synchronized movement | Velocity coherence > 0.3 |
| converging | Moving toward center | >60% objects moving toward cluster center |
| diverging | Moving away from center | >60% objects moving away from cluster center |
| loosely_coordinated | Some coordination | Velocity coherence 0.2-0.3 |
| chaotic | Random movement | Low velocity coherence, mixed directions |
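One way to apply these criteria is sketched below. It is an illustration under stated assumptions rather than the service's classifier: the function `classify_movement` is hypothetical, the coherence value is taken as an input because its formula is not given here, "toward center" is decided by the sign of the dot product between an object's velocity and its offset from the centroid, and the order in which overlapping criteria are checked is a guess.

```python
import math

def classify_movement(positions: list, velocities: list, coherence: float) -> str:
    """Classify a cluster from per-object (x, y) positions and (vx, vy) velocities."""
    count = len(positions)
    mean_speed = sum(math.hypot(vx, vy) for vx, vy in velocities) / count
    if mean_speed < 0.1:
        return "stationary"
    if coherence > 0.3:
        return "coordinated_parallel"
    cx = sum(x for x, _ in positions) / count
    cy = sum(y for _, y in positions) / count
    toward = away = 0
    for (x, y), (vx, vy) in zip(positions, velocities):
        # Negative radial component: the object is closing in on the centroid.
        radial = (x - cx) * vx + (y - cy) * vy
        if radial < 0:
            toward += 1
        elif radial > 0:
            away += 1
    if toward / count > 0.6:
        return "converging"
    if away / count > 0.6:
        return "diverging"
    if coherence >= 0.2:
        return "loosely_coordinated"
    return "chaotic"

corners = [(1, 1), (-1, 1), (-1, -1), (1, -1)]
inward = [(-0.5, -0.5), (0.5, -0.5), (0.5, 0.5), (-0.5, 0.5)]
print(classify_movement(corners, inward, coherence=0.0))
```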

Administrative Fields#

| Field | Type | Description |
| --- | --- | --- |
| object_ids | Array[String] | List of individual object IDs in the cluster |
| dbscan_params.eps | Float | DBSCAN epsilon parameter used for this category |
| dbscan_params.min_samples | Integer | DBSCAN minimum samples parameter used for this category |
| dbscan_params.category | String | Object category for which parameters were optimized |

Production Data Analysis#

Real Deployment Performance#

The following observations come from a production deployment on broker.scenescape.intel.com:

  • Active Scenes: “Queuing” (302cf49a-97ec-402d-a324-c5077b280b7b), “Retail” (3bc091c7-e449-46a0-9540-29c499bca18c)

  • Object Volume: 62 person objects per frame in busy queuing scenarios

  • Cluster Formation: Typically 2 clusters formed (42-43 objects in main cluster, 4 objects in secondary cluster)

  • Noise Points: 15-17 unclustered objects (24-27% noise ratio)

  • Shape Patterns: 100% circle formations observed in production

  • Movement Types: Mix of “chaotic” (main clusters) and “stationary” (small clusters)

Performance Characteristics#

  • Processing Speed: Real-time analysis of 60+ objects per frame

  • Network Connectivity: Reliable MQTT connectivity to production broker

  • Shape Detection: Consistent circle detection with radius measurements 0.16-0.87 meters

  • Velocity Analysis: Accurate movement classification with coherence measurements

Usage Examples#

Real-time Monitoring#

Subscribe to the ANALYTICS_CLUSTERS topic to receive live cluster updates:

mosquitto_sub -h broker.scenescape.intel.com -t "scenescape/analytics/clusters/+" -v

Processing Cluster Data#

Example Python code to process cluster metadata with tracking information:

import json
import paho.mqtt.client as mqtt

def on_message(client, userdata, message):
    try:
        cluster_batch = json.loads(message.payload.decode())

        scene_name = cluster_batch['scene_name']
        scene_id = cluster_batch['scene_id']
        total_clusters = cluster_batch['total_clusters']

        print(f"\n=== Scene: {scene_name} ({scene_id}) ===")
        print(f"Total Clusters: {total_clusters}")

        # Process tracking statistics
        stats = cluster_batch.get('tracking_statistics', {})
        print(f"\nTracking Statistics:")
        print(f"  Active Clusters: {stats.get('active_clusters', 0)}")
        print(f"  Archived Clusters: {stats.get('archived_clusters', 0)}")

        state_counts = stats.get('clusters_by_state', {})
        print(f"  States: {state_counts}")

        # Process individual clusters
        for cluster in cluster_batch['clusters']:
            cluster_id = cluster['cluster_id']
            category = cluster['category']
            object_count = cluster['objects_in_cluster']

            # Tracking information
            tracking = cluster['tracking']
            state = tracking['state']
            confidence = tracking['confidence']
            stability = tracking['stability_score']
            age_seconds = tracking['age_seconds']

            print(f"\n--- Cluster {cluster_id[:8]}... ---")
            print(f"  Category: {category}")
            print(f"  Objects: {object_count}")
            print(f"  State: {state}")
            print(f"  Confidence: {confidence:.3f}")
            print(f"  Stability: {stability:.3f}")
            print(f"  Age: {age_seconds:.1f}s")
            print(f"  Frames Detected: {tracking['frames_detected']}")
            print(f"  Frames Missed: {tracking['frames_missed']}")

            # Movement and shape analysis
            movement_type = cluster['velocity_analysis']['movement_type']
            shape = cluster['shape_analysis']['shape']

            print(f"  Movement: {movement_type}")
            print(f"  Shape: {shape}")

            # Shape-specific measurements
            if shape == "circle":
                radius = cluster['shape_analysis']['size']['radius']
                print(f"  Circle radius: {radius:.2f}m")
            elif shape == "rectangle":
                width = cluster['shape_analysis']['size']['width']
                height = cluster['shape_analysis']['size']['height']
                print(f"  Rectangle: {width:.2f}m x {height:.2f}m")

            # Predicted position for next frame
            pred_pos = tracking['predicted_position']
            if pred_pos['x'] is not None:
                print(f"  Predicted Position: ({pred_pos['x']:.2f}, {pred_pos['y']:.2f})")

    except Exception as e:
        print(f"Error processing cluster data: {e}")
        import traceback
        traceback.print_exc()

client = mqtt.Client()
client.on_message = on_message
client.connect("broker.scenescape.intel.com", 1883, 60)
client.subscribe("scenescape/analytics/clusters/+")
client.loop_forever()

Cluster Tracking Algorithm#

Overview#

The Cluster Analytics service implements a cluster tracking system to maintain cluster identities across video frames. This enables long-term analysis of cluster behavior, movement patterns, and lifecycle dynamics.

Tracking Pipeline#

        graph TD
    A[New Frame Detection] --> B[Group by Category]
    B --> C[Get Existing Clusters]
    C --> D[Hungarian Matching]
    D --> E{Match Found?}
    E -->|Yes| F[Update Cluster]
    E -->|No| G[Create New Cluster]
    F --> H[Update Confidence]
    G --> I[Initialize with NEW state]
    H --> J[Update State Machine]
    I --> J
    J --> K[Update History]
    K --> L[Predict Next Position]
    L --> M{Check Unmatched Clusters}
    M --> N[Mark as Missed]
    N --> O[Reduce Confidence]
    O --> P[Update State]
    P --> Q[Archive if LOST]
    

Hungarian Matching Algorithm#

The system uses the Hungarian algorithm with a multi-feature cost matrix to optimally match new detections to existing tracked clusters:

Cost Calculation:

# Hard constraint: must be same category
if tracked.category != detection.category:
    return INFINITE_COST

# Multi-feature cost matrix (weighted)
position_cost = distance(predicted_position, detection_position) * 0.4
velocity_cost = distance(tracked_velocity, detection_velocity) * 0.3
size_cost = abs(tracked_size - detection_size) * 0.2
shape_cost = (1.0 if shapes_match else 2.0) * 0.1

total_cost = position_cost + velocity_cost + size_cost + shape_cost

Matching Process:

  1. Build cost matrix for all (cluster, detection) pairs

  2. Apply Hungarian algorithm for optimal assignment

  3. Filter matches by maximum distance threshold (default: 5.0 meters)

  4. Return valid matches with similarity scores
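The matching steps above can be sketched end to end with the standard library. To stay dependency-free, this sketch finds the optimal assignment by brute force over permutations, which gives the same result as the Hungarian algorithm for small inputs but scales far worse; in practice a solver such as SciPy's `linear_sum_assignment` is the usual choice. The dictionary keys and function names are hypothetical, and a large finite sentinel stands in for the infinite cost so that totals remain comparable.

```python
import math
from itertools import permutations

INFINITE_COST = 1e6       # finite stand-in for the category hard constraint
MAX_MATCH_DISTANCE = 5.0  # meters, the default threshold from step 3

def match_cost(tracked: dict, detection: dict) -> float:
    """Weighted multi-feature cost using the weights listed above."""
    if tracked["category"] != detection["category"]:
        return INFINITE_COST
    position_cost = math.dist(tracked["predicted_position"], detection["position"]) * 0.4
    velocity_cost = math.dist(tracked["velocity"], detection["velocity"]) * 0.3
    size_cost = abs(tracked["size"] - detection["size"]) * 0.2
    shape_cost = (1.0 if tracked["shape"] == detection["shape"] else 2.0) * 0.1
    return position_cost + velocity_cost + size_cost + shape_cost

def match_clusters(tracked: list, detections: list) -> list:
    """Return (tracked_index, detection_index) pairs with minimal total cost."""
    if not tracked or not detections:
        return []
    costs = [[match_cost(t, d) for d in detections] for t in tracked]
    if len(tracked) <= len(detections):
        candidates = (list(zip(range(len(tracked)), perm))
                      for perm in permutations(range(len(detections)), len(tracked)))
    else:
        candidates = (list(zip(perm, range(len(detections))))
                      for perm in permutations(range(len(tracked)), len(detections)))
    best = min(candidates, key=lambda pairs: sum(costs[i][j] for i, j in pairs))
    # Step 3: discard pairs that break the category or distance constraints.
    return [
        (i, j) for i, j in best
        if costs[i][j] < INFINITE_COST
        and math.dist(tracked[i]["predicted_position"],
                      detections[j]["position"]) <= MAX_MATCH_DISTANCE
    ]

def make(category, position, key):
    """Build a minimal hypothetical record for the demo."""
    return {"category": category, key: position, "velocity": (0.0, 0.0),
            "size": 4, "shape": "circle"}

tracked = [make("person", (0.0, 0.0), "predicted_position"),
           make("vehicle", (10.0, 10.0), "predicted_position")]
detections = [make("vehicle", (10.5, 10.0), "position"),
              make("person", (0.2, 0.0), "position")]
print(match_clusters(tracked, detections))
```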

State Machine Transitions#

        stateDiagram-v2
    [*] --> NEW: Detection
    NEW --> ACTIVE: 3+ frames detected<br/>confidence > 0.6
    ACTIVE --> STABLE: 20+ frames detected<br/>stability > 0.7
    ACTIVE --> FADING: 15+ frames missed
    STABLE --> FADING: 15+ frames missed
    FADING --> ACTIVE: Redetected
    FADING --> LOST: 10+ frames missed
    LOST --> [*]: Archive after 5s
    

Confidence Metrics#

Detection Consistency:

  • Base confidence = frames_detected / total_frames

  • Represents overall detection reliability

Miss Penalty:

  • Penalty = min(frames_missed * 0.1, 0.5)

  • Reduces confidence for recent detection failures

Longevity Bonus:

  • Bonus = min(frames_detected / 100, 0.2)

  • Rewards long-term stable tracking

Final Confidence:

confidence = clamp(base_confidence - miss_penalty + longevity_bonus, 0.0, 1.0)

Stability Score#

Measures cluster consistency based on recent history (last 10 observations):

Position Stability:

  • Low position variance indicates stable location

  • position_stability = 1.0 / (1.0 + position_variance)

Size Stability:

  • Consistent cluster size over time

  • size_stability = 1.0 / (1.0 + size_variance)

Shape Consistency:

  • Frequency of most common shape

  • shape_consistency = most_common_count / total_observations

Combined Score:

stability_score = (
    0.4 * position_stability +
    0.3 * size_stability +
    0.3 * shape_consistency
)
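The combined score can be computed from recent history as sketched below. The function `stability_score` is hypothetical, and two details are assumptions because the text does not define them: position variance is taken as the sum of the population variances of x and y, and an empty history scores 0.0.

```python
from collections import Counter
from statistics import pvariance

HISTORY_WINDOW = 10  # the last 10 observations, as stated above

def stability_score(positions: list, sizes: list, shapes: list) -> float:
    """Weighted blend of position, size, and shape consistency."""
    positions = positions[-HISTORY_WINDOW:]
    sizes = sizes[-HISTORY_WINDOW:]
    shapes = shapes[-HISTORY_WINDOW:]
    if not positions or not sizes or not shapes:
        return 0.0  # assumption: no history means no stability
    # Assumption: total variance is the sum of the per-axis population variances.
    position_variance = (pvariance([x for x, _ in positions])
                         + pvariance([y for _, y in positions]))
    size_variance = pvariance(sizes)
    position_stability = 1.0 / (1.0 + position_variance)
    size_stability = 1.0 / (1.0 + size_variance)
    most_common_count = Counter(shapes).most_common(1)[0][1]
    shape_consistency = most_common_count / len(shapes)
    return (0.4 * position_stability
            + 0.3 * size_stability
            + 0.3 * shape_consistency)

steady = stability_score([(1.0, 2.0)] * 4, [8] * 4, ["circle"] * 4)
print(steady)
```

A cluster that never moves, never changes size, and always has the same shape scores 1.0; alternating between two shapes halves the shape term.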

History Management#

Each tracked cluster maintains historical observations:

Stored Data:

  • Position history: (x, y, timestamp)

  • Velocity history: (vx, vy, timestamp)

  • Size history: object counts

  • Shape history: detected shapes

  • Timestamps: frame timestamps

Limits:

  • Maximum history size: 100 observations

  • Automatic truncation when limit exceeded

  • Maintains most recent observations
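A bounded history with automatic truncation maps naturally onto `collections.deque` with a `maxlen`. The class below is a hypothetical sketch of that idea, not the service's data structure; the field layout follows the stored data listed above.

```python
from collections import deque

MAX_HISTORY = 100  # maximum history size stated above

class ClusterHistory:
    """Keeps only the most recent observations of a tracked cluster."""

    def __init__(self, max_history: int = MAX_HISTORY):
        # A deque with maxlen silently drops the oldest entry when full.
        self.positions = deque(maxlen=max_history)
        self.velocities = deque(maxlen=max_history)
        self.sizes = deque(maxlen=max_history)
        self.shapes = deque(maxlen=max_history)

    def record(self, timestamp, position, velocity, size, shape):
        """Append one frame's observation to every history series."""
        self.positions.append((position[0], position[1], timestamp))
        self.velocities.append((velocity[0], velocity[1], timestamp))
        self.sizes.append(size)
        self.shapes.append(shape)

history = ClusterHistory()
for frame in range(150):
    history.record(float(frame), (frame * 0.1, 0.0), (0.1, 0.0), 8, "circle")
print(len(history.positions), history.positions[0][2])
```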

Prediction System#

Clusters use linear extrapolation for position prediction:

# Calculate average velocity from recent history (last 5 observations)
avg_velocity = mean(recent_velocities)

# Predict next position (assuming ~1 frame time delta)
predicted_position = current_position + avg_velocity

Benefits:

  • Improves matching accuracy for moving clusters

  • Handles temporary occlusions

  • Reduces false negatives in tracking
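The extrapolation step can be written as a small function. The name `predict_next_position` and the `frame_dt` parameter are hypothetical; `frame_dt` defaults to 1.0 to mirror the "~1 frame time delta" assumption in the pseudocode above, and returning the current position when no velocity history exists is an added assumption.

```python
def predict_next_position(current_position, velocity_history, frame_dt=1.0, window=5):
    """Linear extrapolation from the mean of the most recent velocities."""
    recent = velocity_history[-window:]
    if not recent:
        return current_position  # assumption: no history means no movement
    avg_vx = sum(vx for vx, _ in recent) / len(recent)
    avg_vy = sum(vy for _, vy in recent) / len(recent)
    x, y = current_position
    return (x + avg_vx * frame_dt, y + avg_vy * frame_dt)

predicted = predict_next_position((4.0, 5.0), [(0.25, 0.0), (0.75, 0.0)])
print(predicted)
```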

Archival System#

Archival Criteria:

  • Cluster state = LOST

  • Time since last seen > 5.0 seconds (the hardcoded ARCHIVE_TIME_THRESHOLD)

Archive Management:

  • Maximum 50 archived clusters (global limit)

  • Oldest archived clusters removed when limit exceeded

  • Preserves full history for analysis

Statistics Tracking:

  • Active clusters count

  • Archived clusters count

  • Clusters by state distribution

  • Tracked scenes and categories

DBSCAN Noise Point Explanation#

In the DBSCAN clustering algorithm, noise points are objects that do not belong to any cluster. Understanding noise points is important for interpreting analytics results in the Cluster Analytics microservice.

DBSCAN Algorithm Overview#

DBSCAN (Density-Based Spatial Clustering of Applications with Noise) classifies each data point as one of:

  • Core points: Have at least min_samples neighbors within eps distance.

  • Border points: Are within eps distance of a core point but do not have enough neighbors to be core points themselves.

  • Noise points: Are neither core nor border points—these are isolated from other points.

Noise Points in Cluster Analytics#

In this service, noise points are objects that:

  • Are farther than the configured eps distance (e.g., 1.5 meters) from any other object of the same category.

  • Do not have enough nearby neighbors to form a cluster (fewer than min_samples).

Example Scenarios:

  • Queuing Scene:

    • 5 people detected.

    • 3 people stand close together (within 1.5m): form 1 cluster.

    • 2 people stand alone, each more than 1.5m from others: these are noise points.

  • Retail Scene:

    • 4 people detected.

    • 2 people are near each other: form 1 cluster.

    • 2 people are isolated: noise points.

Code Representation#

In DBSCAN output, objects labeled with -1 are noise points. These represent people or objects that are spatially isolated and do not form meaningful groups with others of the same category.
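To make the -1 label concrete, here is a compact pure-Python DBSCAN applied to the queuing scenario above (three people close together, two standing alone). It is a teaching sketch, not the implementation used by the service. Following the scikit-learn convention, a point counts itself among its neighbors when compared with `min_samples`; the coordinates are made up.

```python
import math

NOISE = -1

def dbscan(points: list, eps: float, min_samples: int) -> list:
    """Return one label per point: a cluster index, or -1 for noise."""
    labels = [None] * len(points)

    def neighbors(index):
        # Includes the point itself, as scikit-learn does.
        return [j for j, other in enumerate(points)
                if math.dist(points[index], other) <= eps]

    cluster_id = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_samples:
            labels[i] = NOISE  # may be relabeled later as a border point
            continue
        cluster_id += 1
        labels[i] = cluster_id
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster_id  # border point: joins but does not expand
            if labels[j] is not None:
                continue
            labels[j] = cluster_id
            reachable = neighbors(j)
            if len(reachable) >= min_samples:
                queue.extend(reachable)  # core point: keep growing the cluster
    return labels

# Queuing scene: three people within 1.5 m of each other, two isolated.
people = [(0.0, 0.0), (1.0, 0.0), (0.5, 0.8), (5.0, 5.0), (9.0, 1.0)]
labels = dbscan(people, eps=1.5, min_samples=3)
print(labels)
```

The first three people share cluster 0, while the two isolated people keep the noise label.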

Why Noise Points Matter#

Identifying noise points helps distinguish between:

  • Clustered behavior: People or objects grouping together.

  • Individual behavior: People or objects standing alone or isolated.

This distinction is valuable for analytics, enabling insights into both group dynamics and solitary activity within a scene.

Logging Benefits#

  • Reduced Log Volume: Eliminates verbose JSON serialization in production

  • Performance: Avoids expensive string formatting when not needed

  • Operational: Clear cluster summaries for monitoring and alerting

  • Debugging: Full metadata available when debug logging is enabled

Contributing#

When contributing to the Cluster Analytics service:

  1. Algorithm Improvements: Enhance clustering accuracy or add new shape detection patterns

  2. Performance Optimization: Optimize processing speed for high-volume scenarios

  3. New Movement Patterns: Add additional velocity analysis classifications

  4. Testing: Include unit tests for clustering and shape detection algorithms

License#

This project is licensed under the Apache 2.0 License. See the LICENSE file for details.