# Cluster Analytics Service

The Cluster Analytics service provides advanced object clustering and movement analysis capabilities for Intel® SceneScape using the DBSCAN (Density-Based Spatial Clustering of Applications with Noise) algorithm combined with geometric shape detection and velocity pattern classification.

This service processes real-time object detection data from Intel® SceneScape scenes, applies machine learning-based clustering algorithms, and provides comprehensive analytics including:

- **Spatial Clustering**: Groups objects by proximity using the DBSCAN algorithm with user-configurable parameters
- **Cluster Tracking**: Tracks clusters across frames with state-based lifecycle management (NEW → ACTIVE → STABLE → FADING → LOST)
- **Shape Analysis**: Detects geometric patterns (circle, rectangle, line, irregular) with size measurements
- **Velocity Analysis**: Classifies movement patterns and tracks cluster dynamics

## Deployment

### Docker Deployment (Recommended)

The Cluster Analytics service is included in the extended Intel® SceneScape demo docker-compose stack:

```bash
SUPASS=admin123 make
SUPASS=admin123 make demo-all
```

### Build from Source

Alternatively, see how to [Build from Source](./get-started/build-from-source.md).

## Architecture

> **Note:** Diagrams are currently best viewed in light color mode.
### Data Flow Diagram

```mermaid
sequenceDiagram
    participant APP as Applications
    participant CA as Cluster Analytics
    participant MQTT as MQTT Broker
    participant SC as Scene Controller

    MQTT->>SC: Detections metadata
    Note over SC: Base analytics
    SC->>MQTT: Objects metadata
    MQTT->>CA: Objects metadata
    Note over CA: User-configurable DBSCAN clustering
    Note over CA: Cluster shape and velocity analysis
    CA->>MQTT: Optimized clusters metadata
    Note over APP: Real-time cluster insights
    MQTT->>APP: 
```

### **DBSCAN Clustering Configuration**

#### User-Configurable Parameters

The `config.json` file allows customization of DBSCAN clustering parameters:

- **`eps`** - Maximum distance (in meters) between objects to be considered part of the same cluster
- **`min_samples`** - Minimum number of objects required to form a cluster

These parameters can be configured globally (default) or per object category.

#### Configuration File Structure

The service uses a `config.json` file located in the `config/` directory:

```json
{
  "dbscan": {
    "default": {
      "eps": 1.0,
      "min_samples": 3
    },
    "category_specific": {
      "person": { "eps": 2.0, "min_samples": 2 },
      "vehicle": { "eps": 4.0, "min_samples": 2 },
      "bicycle": { "eps": 1.5, "min_samples": 2 },
      "motorcycle": { "eps": 2.5, "min_samples": 2 },
      "truck": { "eps": 5.0, "min_samples": 2 },
      "bus": { "eps": 6.0, "min_samples": 2 }
    }
  }
}
```

#### Parameter Descriptions

- **`default`**: Fallback parameters for object categories not explicitly configured
- **`category_specific`**: Per-category parameters optimized for different object types:
  - `person` - Optimized for people clustering (social distancing, queues)
  - `vehicle` - Optimized for vehicle parking and traffic clusters
  - `bicycle` - Optimized for bike racks and group riding
  - `motorcycle` - Moderate spacing for motorcycle clusters
  - `truck` - Large vehicle spacing requirements
  - `bus` - Bus stops and depot formations

### Shape Detection and Analysis

- **ML-based Shape Classification**: Detects geometric patterns
using feature extraction
- **Size Calculations**: Provides precise measurements for each detected shape type
- **Supported Shapes**:
  - **Circle**: radius, diameter, area, circumference
  - **Rectangle**: width, height, area, perimeter, corner points
  - **Line**: length, endpoints, width spread
  - **Irregular**: bounding box dimensions, point spread

#### Shape Detection Logic

```mermaid
flowchart TD
    A[Cluster Points Input] --> B{Sufficient Points?}
    B -->|< 3 points| C[Insufficient Points]
    B -->|≥ 3 points| D[Calculate Features]
    D --> E[Extract Distance and Angle Features]
    E --> F[Calculate Centroid]
    F --> G[Measure Distance Variance]
    G --> H{Distance Variance < 0.5?}
    H -->|Yes| I[Circle Formation]
    H -->|No| J{Exactly 4 Points?}
    J -->|Yes| K[Check Quadrant Distribution]
    K --> L{≥ 3 Quadrants?}
    L -->|Yes| M[Rectangle Formation]
    L -->|No| N[Continue Analysis]
    J -->|No| O{≥ 5 Points?}
    O -->|Yes| P[Analyze Angle Distribution]
    P --> Q{Uniform Distribution?}
    Q -->|Yes| R[Large Circle Formation]
    Q -->|No| S[Check Linear Formation]
    S --> T{Low Triangle Areas?}
    T -->|Yes| U[Line Formation]
    T -->|No| V[Irregular Shape]
    O -->|No| N
    N --> S

    %% Shape calculations
    I --> I1[Calculate: radius, diameter, area, circumference]
    M --> M1[Calculate: width, height, area, perimeter, corners]
    R --> R1[Calculate: radius, diameter, area, circumference]
    U --> U1[Calculate: length, endpoints, width spread]
    V --> V1[Calculate: bounding box, point spread]
```

### Velocity Analysis and Movement Patterns

- **Movement Classification**: 6 distinct movement patterns
- **Velocity Statistics**: Comprehensive speed and direction analysis
- **Pattern Types**:
  - `stationary` - Objects with minimal movement
  - `coordinated_parallel` - Synchronized movement in the same direction
  - `converging` - Objects moving toward the cluster center
  - `diverging` - Objects moving away from the cluster center
  - `loosely_coordinated` - Some coordination but not highly synchronized
  - `chaotic` - Random or unpredictable movement patterns

#### Velocity Analysis Logic

```mermaid
graph TD
    A[Velocity Analysis] --> B{Speed Check}
    B -->|< 0.1 m/s| C[Stationary]
    B -->|> 0.1 m/s| D{Coherence Check}
    D -->|High Coherence| E[Coordinated Parallel]
    D -->|Low Coherence| F{Direction Analysis}
    F -->|Toward Center| G[Converging]
    F -->|Away from Center| H[Diverging]
    F -->|Mixed| I[Chaotic]
```

## Category-Specific Clustering

The service optimizes DBSCAN parameters based on object categories, providing more accurate clustering for different object types:

### Benefits

- **Optimized Parameters**: Each object type uses clustering parameters optimized for its spatial characteristics
- **Better Accuracy**: Improved clustering accuracy by considering object-specific grouping behaviors
- **Automatic Selection**: Parameters are selected based on the detected object category
- **Fallback Support**: Unknown categories use sensible default parameters

### Category Optimization Examples

| Category     | eps (meters) | min_samples | Rationale                                |
| ------------ | ------------ | ----------- | ---------------------------------------- |
| `person`     | 2.0          | 2           | Social distancing, queue formations      |
| `vehicle`    | 4.0          | 2           | Parking lots, traffic clusters           |
| `bicycle`    | 1.5          | 2           | Bike racks, tight group riding           |
| `motorcycle` | 2.5          | 2           | Moderate spacing for motorcycle clusters |
| `truck`      | 5.0          | 2           | Large vehicle spacing requirements       |
| `bus`        | 6.0          | 2           | Bus stops, depot formations              |
| `default`    | 1.0          | 3           | Fallback for unknown categories          |

### Usage in Analysis

The service automatically applies the appropriate parameters when processing each object category, with user customizations taking precedence:

```python
# Dynamic parameter selection with user overrides
for category, objects in objects_by_category.items():
    # Get user-configured parameters for this scene and category
    dbscan_params = self.get_dbscan_params_for_category(category, scene_id)
    clustering = DBSCAN(eps=dbscan_params['eps'],
                        min_samples=dbscan_params['min_samples'])
```

### **Cluster Tracking System**

The service includes advanced temporal tracking with state transitions and confidence scoring. These parameters are currently **hardcoded constants** in the implementation and are not user-configurable through `config.json`.

#### State Transition Parameters (Hardcoded)

| Parameter            | Value | Description                              |
| -------------------- | ----- | ---------------------------------------- |
| `FRAMES_TO_ACTIVATE` | 3     | Frames needed to transition NEW → ACTIVE |
| `FRAMES_TO_STABLE`   | 20    | Frames needed for ACTIVE → STABLE        |
| `FRAMES_TO_FADE`     | 15    | Missed frames before FADING state        |
| `FRAMES_TO_LOST`     | 10    | Missed frames before LOST state          |

#### Confidence Parameters (Hardcoded)

| Parameter                        | Value | Description                          |
| -------------------------------- | ----- | ------------------------------------ |
| `INITIAL_CONFIDENCE`             | 0.5   | Starting confidence for new clusters |
| `ACTIVATION_THRESHOLD`           | 0.6   | Confidence needed for activation     |
| `STABILITY_THRESHOLD`            | 0.7   | Confidence needed for stable state   |
| `CONFIDENCE_MISS_PENALTY`        | 0.1   | Confidence penalty per missed frame  |
| `CONFIDENCE_MAX_MISS_PENALTY`    | 0.5   | Maximum cumulative miss penalty      |
| `CONFIDENCE_LONGEVITY_BONUS_MAX` | 0.2   | Maximum bonus for long-term tracking |
| `CONFIDENCE_LONGEVITY_FRAMES`    | 100   | Frames to reach max longevity bonus  |

#### Archival Parameters (Hardcoded)

| Parameter                | Value | Description                            |
| ------------------------ | ----- | -------------------------------------- |
| `ARCHIVE_TIME_THRESHOLD` | 5.0   | Seconds before archiving lost clusters |
| `MAX_ARCHIVED_CLUSTERS`  | 50    | Maximum number of archived clusters    |

#### Cluster Lifecycle States

| State    | Description                          | Transition Trigger                         |
| -------- | ------------------------------------ | ------------------------------------------ |
| `NEW`    | Just detected, awaiting confirmation | Initial detection                          |
| `ACTIVE` | Confirmed and consistently detected  | 3+ consecutive detections, confidence >0.6 |
| `STABLE` | Long-term stable presence            | 20+ frames detected, stability >0.7        |
| `FADING` | Recently missed detections           | 15+ consecutive missed frames              |
| `LOST`   | Not detected for extended period     | 10+ consecutive missed frames              |

#### Confidence Calculation

Cluster tracking confidence is calculated using:

```python
# Base confidence from detection ratio
base_confidence = frames_detected / total_frames

# Penalty for recent misses
miss_penalty = min(frames_missed * 0.1, 0.5)

# Bonus for long-term tracking
longevity_bonus = min(frames_detected / 100, 0.2)

# Final confidence (clamped 0-1)
confidence = clamp(base_confidence - miss_penalty + longevity_bonus, 0.0, 1.0)
```

## **WebUI Features and Real-time Visualization**

The integrated WebUI provides a comprehensive interface for cluster analysis monitoring and configuration:

### **Interactive Visualization**

- **Real-time Canvas**: Live updating visualization of objects and clusters
- **Pan and Zoom**: Navigate through scene data with mouse controls
- **Object Display**: Individual objects colored by cluster assignment
- **Cluster Shapes**: Visual representation of detected cluster geometries
- **Movement Vectors**: Optional display of cluster movement with adjustable scaling
- **Auto-fit**: Automatic view adjustment to focus on current scene data

### **Dynamic Parameter Configuration**

- **Per-Category Controls**: Independent parameter adjustment for each object category
- **Real-time Updates**: Changes apply immediately with automatic re-clustering
- **Scene-Specific Settings**: Each scene maintains its own parameter configuration
- **Reset to Defaults**: Quick restoration of default parameters per category
- **Visual Feedback**: Immediate visualization of parameter change effects

### **Scene Management**

- **Multi-Scene Support**: Switch between available scenes dynamically
- **Auto-Discovery**: Scenes are automatically discovered from MQTT traffic
- **Current Data Focus**: Always displays the current state without historic accumulation
- **Object
Count Display**: Real-time object and cluster statistics

### **Advanced Controls**

- **Refresh Rate**: Configurable from real-time to custom intervals
- **Movement Vector Scaling**: Adjustable visualization scale for velocity vectors
- **Connection Status**: Live MQTT connection monitoring
- **Parameter Validation**: Intelligent validation based on actual scene data

### **Insufficient Points Handling**

- **Individual Object Coloring**: Objects are colored by category when clusters cannot be formed
- **Clear Messaging**: Visual indication when clustering is not possible
- **Dynamic Thresholds**: Uses user-configured `min_samples` rather than global defaults

## MQTT Topics and Data Flow

### Input Topics

- **Topic**: `scenescape/regulated/scene/{scene_id}`
- **Purpose**: Receives object detection data from Intel® SceneScape scenes
- **Format**: JSON with objects array and scene metadata
- **Contains**: Scene name, timestamp, object detections with world coordinates

### Output Topics

- **Topic**: `scenescape/analytics/clusters/{scene_id}`
- **Purpose**: Publishes cluster analysis results
- **QoS**: 1 (at least once delivery)
- **Optimized Structure**: Contains only cluster data without redundant scene metadata

### Topic Structure Changes

**Recent Optimization**: Scene identification is now derived from topic structure rather than payload content:

- **Scene ID**: Extracted from topic path (`{scene_id}` component)
- **Scene Name**: Retrieved from DATA_REGULATED topic
- **Cluster Data**: Published to ANALYTICS_CLUSTERS contains only analysis results

## Output Data Structure

The Cluster Analytics service publishes optimized cluster metadata in batch format.

**Note**: Scene identification is extracted from topic structure, not payload content.
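Because the scene ID travels in the topic path rather than the payload, a subscriber can recover it by splitting the topic string. The sketch below illustrates this; the helper name `scene_id_from_topic` is illustrative, not part of the service API:

```python
def scene_id_from_topic(topic: str) -> str:
    """Extract the {scene_id} component from an analytics topic.

    Expects topics of the form scenescape/analytics/clusters/{scene_id}.
    """
    parts = topic.split("/")
    if len(parts) != 4 or parts[:3] != ["scenescape", "analytics", "clusters"]:
        raise ValueError(f"Unexpected topic structure: {topic}")
    return parts[3]

# The UUID is the last topic segment
print(scene_id_from_topic(
    "scenescape/analytics/clusters/3bc091c7-e449-46a0-9540-29c499bca18c"))
```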
### Cluster Batch Format

```json
{
  "scene_id": "3bc091c7-e449-46a0-9540-29c499bca18c",
  "scene_name": "Retail",
  "timestamp": "2025-10-21T09:16:41.377Z",
  "total_clusters": 2,
  "clusters": [
    {
      "cluster_id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
      "category": "person",
      "objects_in_cluster": 8,
      "cluster_center": {
        "x": 4.291512867202579,
        "y": 4.934464049998539
      },
      "shape_analysis": {
        "shape": "circle",
        "size": {
          "radius": 0.38788961696255303,
          "diameter": 0.7757792339251061,
          "area": 0.4726788625738194,
          "circumference": 2.437182342106631
        }
      },
      "velocity_analysis": {
        "movement_type": "chaotic",
        "average_velocity": [-0.19217192568910546, -0.0763952946379476, 0.0],
        "velocity_magnitude": 0.20680012104899237,
        "movement_direction_degrees": -158.32038869788497,
        "velocity_coherence": 0.0
      },
      "object_ids": [
        "69de7c1c-21da-45bc-ae45-2f1d3d16d5b2",
        "5baec5fa-c961-4dc0-a254-f1f614292619",
        "bf1923d8-ac12-4042-9e76-9b57b351efcb",
        "e6333708-3793-4e44-9b29-e1b7e0e7977c",
        "d9b6d6a9-d390-47a4-a9b8-95af121103ca",
        "9be324af-c0a5-4495-bae6-33d251e88366",
        "166ba387-9b4e-406d-b236-a30bb274a800",
        "71a1b1f6-8e14-4a22-a656-011fa4405c43"
      ],
      "dbscan_params": {
        "eps": 0.5,
        "min_samples": 3,
        "category": "person"
      },
      "tracking": {
        "tracking_id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
        "state": "active",
        "confidence": 0.875,
        "stability_score": 0.623,
        "frames_detected": 15,
        "frames_missed": 0,
        "age_seconds": 2.5,
        "time_since_last_seen": 0.03,
        "first_seen": 1729501599.234,
        "last_seen": 1729501601.734,
        "predicted_position": { "x": 4.32, "y": 4.91 }
      }
    }
  ],
  "summary": {
    "categories": ["person"],
    "total_objects_in_clusters": 8
  },
  "tracking_statistics": {
    "active_clusters": 2,
    "archived_clusters": 5,
    "clusters_by_state": {
      "new": 0,
      "active": 1,
      "stable": 1,
      "fading": 0,
      "lost": 0
    },
    "tracked_scenes": 2,
    "tracked_categories": 1
  }
}
```

## Field Descriptions

### Batch-Level Fields

| Field                               | Type    | Description                                    |
| ----------------------------------- | ------- | ---------------------------------------------- |
| `scene_id`                          | String  | Unique scene identifier (UUID)                 |
| `scene_name`                        | String  | Human-readable scene name                      |
| `timestamp`                         | String  | ISO 8601 timestamp when clusters were detected |
| `total_clusters`                    | Integer | Total number of clusters in this batch         |
| `clusters`                          | Array   | Array of individual cluster objects            |
| `summary.categories`                | Array   | List of object categories that formed clusters |
| `summary.total_objects_in_clusters` | Integer | Total objects across all clusters              |
| `tracking_statistics`               | Object  | Global tracking system statistics              |

### Individual Cluster Fields

| Field                | Type    | Description                                       |
| -------------------- | ------- | ------------------------------------------------- |
| `cluster_id`         | String  | Unique persistent cluster UUID                    |
| `category`           | String  | Object detection category (person, vehicle, etc.) |
| `objects_in_cluster` | Integer | Number of objects forming the cluster             |
| `object_ids`         | Array   | List of object UUIDs that form this cluster       |
| `dbscan_params`      | Object  | User-configured DBSCAN parameters used            |
| `tracking`           | Object  | Temporal tracking metadata (see below)            |

### Spatial Information

| Field              | Type  | Description                                          |
| ------------------ | ----- | ---------------------------------------------------- |
| `cluster_center.x` | Float | X coordinate of cluster centroid (world coordinates) |
| `cluster_center.y` | Float | Y coordinate of cluster centroid (world coordinates) |

### Shape Analysis

| Field                  | Type   | Description                                                     |
| ---------------------- | ------ | --------------------------------------------------------------- |
| `shape_analysis.shape` | String | Detected shape type: `circle`, `rectangle`, `line`, `irregular` |
| `shape_analysis.size`  | Object | Shape-specific measurements (varies by shape type)              |

#### Shape-Specific Size Fields

**Circle:**

- `radius` - Circle radius in meters
- `diameter` - Circle diameter in meters
- `area` - Circle area in square meters
- `circumference` - Circle circumference in meters
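The circle measurements are internally consistent: diameter, area, and circumference all derive from the radius. A quick cross-check against the `radius` in the sample payload above:

```python
import math

# Radius from the sample cluster payload
radius = 0.38788961696255303

diameter = 2 * radius                    # 0.7757792339251061
area = math.pi * radius ** 2             # ≈ 0.4726788625738194
circumference = 2 * math.pi * radius     # ≈ 2.437182342106631

print(diameter, area, circumference)
```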
**Rectangle:**

- `width` - Rectangle width in meters
- `height` - Rectangle height in meters
- `area` - Rectangle area in square meters
- `perimeter` - Rectangle perimeter in meters
- `corner_points` - Array of [x, y] corner coordinates

**Line:**

- `length` - Line length in meters
- `endpoints` - Array of two [x, y] endpoint coordinates
- `width_spread` - Standard deviation of perpendicular distances

**Irregular:**

- `bounding_width` - Bounding box width in meters
- `bounding_height` - Bounding box height in meters
- `bounding_area` - Bounding box area in square meters
- `point_spread` - Standard deviation of distances from centroid

### Velocity Analysis

| Field                        | Type         | Description                                 |
| ---------------------------- | ------------ | ------------------------------------------- |
| `movement_type`              | String       | Classified movement pattern                 |
| `average_velocity`           | Array[Float] | [vx, vy, vz] average velocity vector in m/s |
| `velocity_magnitude`         | Float        | Average speed magnitude in m/s              |
| `movement_direction_degrees` | Float        | Movement direction in degrees (-180 to 180) |
| `velocity_coherence`         | Float        | Movement synchronization measure (0-1)      |

### Tracking Metadata

| Field                           | Type    | Description                                             |
| ------------------------------- | ------- | ------------------------------------------------------- |
| `tracking.tracking_id`          | String  | Persistent cluster UUID (same as cluster_id)            |
| `tracking.state`                | String  | Current lifecycle state (new/active/stable/fading/lost) |
| `tracking.confidence`           | Float   | Tracking confidence score (0-1)                         |
| `tracking.stability_score`      | Float   | Cluster stability metric (0-1)                          |
| `tracking.frames_detected`      | Integer | Total frames where cluster was detected                 |
| `tracking.frames_missed`        | Integer | Consecutive frames where cluster was not detected       |
| `tracking.age_seconds`          | Float   | Time since first detection (seconds)                    |
| `tracking.time_since_last_seen` | Float   | Time since last detection (seconds)                     |
| `tracking.first_seen`           | Float   | Unix timestamp of first detection                       |
| `tracking.last_seen`            | Float   | Unix timestamp of last detection                        |
| `tracking.predicted_position.x` | Float   | Predicted X coordinate for next frame                   |
| `tracking.predicted_position.y` | Float   | Predicted Y coordinate for next frame                   |

### Tracking Statistics

| Field                                    | Type    | Description                               |
| ---------------------------------------- | ------- | ----------------------------------------- |
| `tracking_statistics.active_clusters`    | Integer | Total active clusters across all scenes   |
| `tracking_statistics.archived_clusters`  | Integer | Total archived (lost) clusters            |
| `tracking_statistics.clusters_by_state`  | Object  | Count of clusters in each lifecycle state |
| `tracking_statistics.tracked_scenes`     | Integer | Number of scenes with active clusters     |
| `tracking_statistics.tracked_categories` | Integer | Number of object categories being tracked |

### Movement Pattern Classifications

| Pattern                | Description             | Criteria                                     |
| ---------------------- | ----------------------- | -------------------------------------------- |
| `stationary`           | Minimal movement        | Average speed < 0.1 m/s                      |
| `coordinated_parallel` | Synchronized movement   | Velocity coherence > 0.3                     |
| `converging`           | Moving toward center    | >60% objects moving toward cluster center    |
| `diverging`            | Moving away from center | >60% objects moving away from cluster center |
| `loosely_coordinated`  | Some coordination       | Velocity coherence 0.2-0.3                   |
| `chaotic`              | Random movement         | Low velocity coherence, mixed directions     |

### Administrative Fields

| Field                       | Type          | Description                                             |
| --------------------------- | ------------- | ------------------------------------------------------- |
| `object_ids`                | Array[String] | List of individual object IDs in the cluster            |
| `dbscan_params.eps`         | Float         | DBSCAN epsilon parameter used for this category         |
| `dbscan_params.min_samples` | Integer       | DBSCAN minimum samples parameter used for this category |
| `dbscan_params.category`    | String        | Object category for which parameters were optimized     |

## Production Data Analysis

### Real Deployment Performance

Based on an actual production deployment on `broker.scenescape.intel.com`:

- **Active Scenes**: "Queuing" (`302cf49a-97ec-402d-a324-c5077b280b7b`), "Retail" (`3bc091c7-e449-46a0-9540-29c499bca18c`)
- **Object Volume**: 62 person objects per frame in busy queuing scenarios
- **Cluster Formation**: Typically 2 clusters formed (42-43 objects in the main cluster, 4 objects in the secondary cluster)
- **Noise Points**: 15-17 unclustered objects (24-27% noise ratio)
- **Shape Patterns**: 100% circle formations observed in production
- **Movement Types**: Mix of "chaotic" (main clusters) and "stationary" (small clusters)

### Performance Characteristics

- **Processing Speed**: Real-time analysis of 60+ objects per frame
- **Network Connectivity**: Reliable MQTT connectivity to the production broker
- **Shape Detection**: Consistent circle detection with radius measurements of 0.16-0.87 meters
- **Velocity Analysis**: Accurate movement classification with coherence measurements

## Usage Examples

### Real-time Monitoring

Subscribe to the ANALYTICS_CLUSTERS topic to receive live cluster updates:

```bash
mosquitto_sub -h broker.scenescape.intel.com -t "scenescape/analytics/clusters/+" -v
```

### Processing Cluster Data

Example Python code to process cluster metadata with tracking information:

```python
import json
import traceback

import paho.mqtt.client as mqtt


def on_message(client, userdata, message):
    try:
        cluster_batch = json.loads(message.payload.decode())

        scene_name = cluster_batch['scene_name']
        scene_id = cluster_batch['scene_id']
        total_clusters = cluster_batch['total_clusters']

        print(f"\n=== Scene: {scene_name} ({scene_id}) ===")
        print(f"Total Clusters: {total_clusters}")

        # Process tracking statistics
        stats = cluster_batch.get('tracking_statistics', {})
        print("\nTracking Statistics:")
        print(f"  Active Clusters: {stats.get('active_clusters', 0)}")
        print(f"  Archived Clusters: {stats.get('archived_clusters', 0)}")

        state_counts = stats.get('clusters_by_state', {})
        print(f"  States: {state_counts}")

        # Process individual clusters
        for cluster in cluster_batch['clusters']:
            cluster_id = cluster['cluster_id']
            category = cluster['category']
            object_count = cluster['objects_in_cluster']

            # Tracking information
            tracking = cluster['tracking']
            state = tracking['state']
            confidence = tracking['confidence']
            stability = tracking['stability_score']
            age_seconds = tracking['age_seconds']

            print(f"\n--- Cluster {cluster_id[:8]}... ---")
            print(f"  Category: {category}")
            print(f"  Objects: {object_count}")
            print(f"  State: {state}")
            print(f"  Confidence: {confidence:.3f}")
            print(f"  Stability: {stability:.3f}")
            print(f"  Age: {age_seconds:.1f}s")
            print(f"  Frames Detected: {tracking['frames_detected']}")
            print(f"  Frames Missed: {tracking['frames_missed']}")

            # Movement and shape analysis
            movement_type = cluster['velocity_analysis']['movement_type']
            shape = cluster['shape_analysis']['shape']
            print(f"  Movement: {movement_type}")
            print(f"  Shape: {shape}")

            # Shape-specific measurements
            if shape == "circle":
                radius = cluster['shape_analysis']['size']['radius']
                print(f"  Circle radius: {radius:.2f}m")
            elif shape == "rectangle":
                width = cluster['shape_analysis']['size']['width']
                height = cluster['shape_analysis']['size']['height']
                print(f"  Rectangle: {width:.2f}m x {height:.2f}m")

            # Predicted position for next frame
            pred_pos = tracking['predicted_position']
            if pred_pos['x'] is not None:
                print(f"  Predicted Position: ({pred_pos['x']:.2f}, {pred_pos['y']:.2f})")
    except Exception as e:
        print(f"Error processing cluster data: {e}")
        traceback.print_exc()


client = mqtt.Client()
client.on_message = on_message
client.connect("broker.scenescape.intel.com", 1883, 60)
client.subscribe("scenescape/analytics/clusters/+")
client.loop_forever()
```

## **Cluster Tracking Algorithm**

### Overview

The Cluster Analytics service implements a cluster tracking system to maintain cluster
identities across video frames. This enables long-term analysis of cluster behavior, movement patterns, and lifecycle dynamics.

### Tracking Pipeline

```mermaid
graph TD
    A[New Frame Detection] --> B[Group by Category]
    B --> C[Get Existing Clusters]
    C --> D[Hungarian Matching]
    D --> E{Match Found?}
    E -->|Yes| F[Update Cluster]
    E -->|No| G[Create New Cluster]
    F --> H[Update Confidence]
    G --> I[Initialize with NEW state]
    H --> J[Update State Machine]
    I --> J
    J --> K[Update History]
    K --> L[Predict Next Position]
    L --> M{Check Unmatched Clusters}
    M --> N[Mark as Missed]
    N --> O[Reduce Confidence]
    O --> P[Update State]
    P --> Q[Archive if LOST]
```

### Hungarian Matching Algorithm

The system uses the Hungarian algorithm with a multi-feature cost matrix to optimally match new detections to existing tracked clusters:

**Cost Calculation:**

```python
# Hard constraint: must be the same category
if tracked.category != detection.category:
    return INFINITE_COST

# Multi-feature cost matrix (weighted)
position_cost = distance(predicted_position, detection_position) * 0.4
velocity_cost = distance(tracked_velocity, detection_velocity) * 0.3
size_cost = abs(tracked_size - detection_size) * 0.2
shape_cost = (1.0 if shapes_match else 2.0) * 0.1

total_cost = position_cost + velocity_cost + size_cost + shape_cost
```

**Matching Process:**

1. Build a cost matrix for all (cluster, detection) pairs
2. Apply the Hungarian algorithm for optimal assignment
3. Filter matches by the maximum distance threshold (default: 5.0 meters)
4. Return valid matches with similarity scores

### State Machine Transitions

```mermaid
stateDiagram-v2
    [*] --> NEW: Detection
    NEW --> ACTIVE: 3+ frames detected, confidence > 0.6
    ACTIVE --> STABLE: 20+ frames detected, stability > 0.7
    ACTIVE --> FADING: 15+ frames missed
    STABLE --> FADING: 15+ frames missed
    FADING --> ACTIVE: Redetected
    FADING --> LOST: 10+ frames missed
    LOST --> [*]: Archive after 5s
```

### Confidence Metrics

**Detection Consistency:**

- Base confidence = frames_detected / total_frames
- Represents overall detection reliability

**Miss Penalty:**

- Penalty = min(frames_missed * 0.1, 0.5)
- Reduces confidence for recent detection failures

**Longevity Bonus:**

- Bonus = min(frames_detected / 100, 0.2)
- Rewards long-term stable tracking

**Final Confidence:**

```python
confidence = clamp(base_confidence - miss_penalty + longevity_bonus, 0.0, 1.0)
```

### Stability Score

Measures cluster consistency based on recent history (last 10 observations):

**Position Stability:**

- Low position variance indicates a stable location
- `position_stability = 1.0 / (1.0 + position_variance)`

**Size Stability:**

- Consistent cluster size over time
- `size_stability = 1.0 / (1.0 + size_variance)`

**Shape Consistency:**

- Frequency of the most common shape
- `shape_consistency = most_common_count / total_observations`

**Combined Score:**

```python
stability_score = (
    0.4 * position_stability +
    0.3 * size_stability +
    0.3 * shape_consistency
)
```

### History Management

Each tracked cluster maintains historical observations:

**Stored Data:**

- Position history: (x, y, timestamp)
- Velocity history: (vx, vy, timestamp)
- Size history: object counts
- Shape history: detected shapes
- Timestamps: frame timestamps

**Limits:**

- Maximum history size: 100 observations
- Automatic truncation when the limit is exceeded
- Maintains the most recent observations

### Prediction System

Clusters use linear extrapolation for position prediction:

```python
# Calculate average velocity from recent history (last 5 observations)
avg_velocity = mean(recent_velocities)

# Predict the next position (assuming ~1 frame time delta)
predicted_position = current_position + avg_velocity
```

**Benefits:**

- Improves matching accuracy for moving clusters
- Handles temporary occlusions
- Reduces false negatives in tracking

### Archival System

**Archival Criteria:**

- Cluster state = LOST
- Time since last seen > 5.0 seconds (`ARCHIVE_TIME_THRESHOLD`)

**Archive Management:**

- Maximum 50 archived clusters (global limit)
- Oldest archived clusters removed when the limit is exceeded
- Preserves full history for analysis

**Statistics Tracking:**

- Active clusters count
- Archived clusters count
- Clusters by state distribution
- Tracked scenes and categories

## DBSCAN Noise Point Explanation

In the DBSCAN clustering algorithm, **noise points** are objects that do not belong to any cluster. Understanding noise points is important for interpreting analytics results in the Cluster Analytics microservice.

### DBSCAN Algorithm Overview

DBSCAN (Density-Based Spatial Clustering of Applications with Noise) classifies each data point as one of:

- **Core points**: Have at least `min_samples` neighbors within `eps` distance.
- **Border points**: Are within `eps` distance of a core point but do not have enough neighbors to be core points themselves.
- **Noise points**: Are neither core nor border points—these are isolated from other points.

### Noise Points in Cluster Analytics

In this service, noise points are objects that:

- Are farther than the configured `eps` distance (e.g., 1.5 meters) from any other object of the same category.
- Do not have enough nearby neighbors to form a cluster (fewer than `min_samples`).

**Example Scenarios:**

- **Queuing Scene**:
  - 5 people detected.
  - 3 people stand close together (within 1.5m): they form 1 cluster.
  - 2 people stand alone, each more than 1.5m from the others: these are noise points.
- **Retail Scene**:
  - 4 people detected.
  - 2 people are near each other: they form 1 cluster.
  - 2 people are isolated: noise points.

### Code Representation

In DBSCAN output, objects labeled with `-1` are noise points.
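A minimal sketch of this labeling, using scikit-learn's `DBSCAN` (assumed to be available; the coordinates are made up to mirror the queuing scenario above):

```python
import numpy as np
from sklearn.cluster import DBSCAN

# Hypothetical world coordinates (meters) for five detected people:
# three standing close together, two isolated.
points = np.array([
    [0.0, 0.0],
    [0.5, 0.3],
    [0.4, -0.2],   # these three are within eps of each other -> one cluster
    [5.0, 5.0],    # isolated -> noise
    [-4.0, 3.0],   # isolated -> noise
])

labels = DBSCAN(eps=1.5, min_samples=3).fit_predict(points)
print(labels)  # cluster members share label 0; the two isolated points get -1
```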
These represent people or objects that are spatially isolated and do not form meaningful groups with others of the same category.

### Why Noise Points Matter

Identifying noise points helps distinguish between:

- **Clustered behavior**: People or objects grouping together.
- **Individual behavior**: People or objects standing alone or isolated.

This distinction is valuable for analytics, enabling insights into both group dynamics and solitary activity within a scene.

### Logging Benefits

- **Reduced Log Volume**: Eliminates verbose JSON serialization in production
- **Performance**: Avoids expensive string formatting when not needed
- **Operational**: Clear cluster summaries for monitoring and alerting
- **Debugging**: Full metadata available when debug logging is enabled

## Contributing

When contributing to the Cluster Analytics service:

1. **Algorithm Improvements**: Enhance clustering accuracy or add new shape detection patterns
2. **Performance Optimization**: Optimize processing speed for high-volume scenarios
3. **New Movement Patterns**: Add additional velocity analysis classifications
4. **Testing**: Include unit tests for clustering and shape detection algorithms

## License

This project is licensed under the Apache 2.0 License. See the LICENSE file for details.

:::{toctree}
:hidden:

get-started.md
:::