# Cluster Analytics Service
The Cluster Analytics service provides object clustering and movement analysis for Intel® SceneScape. It uses the DBSCAN (Density-Based Spatial Clustering of Applications with Noise) algorithm combined with geometric shape detection and velocity pattern classification.
This service processes real-time object detection data from Intel® SceneScape scenes, applies machine learning-based clustering algorithms, and provides comprehensive analytics including:
- **Spatial Clustering**: Groups objects by proximity using DBSCAN algorithm with user-configurable parameters
- **Cluster Tracking**: Tracks clusters across frames with state-based lifecycle management (NEW → ACTIVE → STABLE → FADING → LOST)
- **Shape Analysis**: Detects geometric patterns (circle, rectangle, line, irregular) with size measurements
- **Velocity Analysis**: Classifies movement patterns and tracks cluster dynamics
## Deployment
### Docker Deployment (Recommended)
The cluster analytics service is included in the extended Intel® SceneScape demo docker-compose stack:
```bash
SUPASS=admin123 make
SUPASS=admin123 make demo-all
```
### Build from Source
Alternatively, see how to [Build from Source](./get-started/build-from-source.md).
## Architecture
> **Note:** Diagrams are currently best viewed in light color mode.
### Data Flow Diagram
```mermaid
sequenceDiagram
participant APP as Applications
participant CA as Cluster Analytics
participant MQTT as MQTT Broker
participant SC as Scene Controller
MQTT->>SC: Detections metadata
Note over SC: Base analytics
SC->>MQTT: Objects metadata
MQTT->>CA: Objects metadata
Note over CA: User-configurable DBSCAN clustering
Note over CA: Cluster's shape and velocity analysis
CA->>MQTT: Optimized clusters metadata
Note over APP: Real-time cluster insights
MQTT->>APP: Optimized clusters metadata
```
### **DBSCAN Clustering Configuration**
#### User-Configurable Parameters
The `config.json` file allows customization of DBSCAN clustering parameters:
- **`eps`** - Maximum distance (in meters) between objects to be considered in the same cluster
- **`min_samples`** - Minimum number of objects required to form a cluster
These parameters can be configured globally (default) or per object category.
#### Configuration File Structure
The service uses a `config.json` file located in the `config/` directory:
```json
{
"dbscan": {
"default": {
"eps": 1,
"min_samples": 3
},
"category_specific": {
"person": {
"eps": 2,
"min_samples": 2
},
"vehicle": {
"eps": 4.0,
"min_samples": 2
},
"bicycle": {
"eps": 1.5,
"min_samples": 2
},
"motorcycle": {
"eps": 2.5,
"min_samples": 2
},
"truck": {
"eps": 5.0,
"min_samples": 2
},
"bus": {
"eps": 6.0,
"min_samples": 2
}
}
}
}
```
#### Parameter Descriptions
- **`default`**: Fallback parameters for object categories not explicitly configured
- **`category_specific`**: Per-category parameters optimized for different object types:
  - `person` - Optimized for people clustering (social distancing, queues)
  - `vehicle` - Optimized for vehicle parking, traffic clusters
  - `bicycle` - Optimized for bike racks, group riding
  - `motorcycle` - Moderate spacing for motorcycle clusters
  - `truck` - Large vehicle spacing requirements
  - `bus` - Bus stops, depot formations
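
The fallback from `category_specific` to `default` can be sketched as a small lookup. This is only an illustration under stated assumptions: the helper name `params_for_category` is hypothetical, and the embedded JSON is a trimmed copy of the `config.json` shown above, not the full file.

```python
import json

# Trimmed copy of the config.json structure shown above.
CONFIG_TEXT = """
{
  "dbscan": {
    "default": {"eps": 1, "min_samples": 3},
    "category_specific": {
      "person": {"eps": 2, "min_samples": 2},
      "vehicle": {"eps": 4.0, "min_samples": 2}
    }
  }
}
"""


def params_for_category(config, category):
    """Return DBSCAN parameters for a category, falling back to the defaults.

    Hypothetical helper for illustration; the service's own lookup may differ.
    """
    dbscan = config["dbscan"]
    return dbscan.get("category_specific", {}).get(category, dbscan["default"])


config = json.loads(CONFIG_TEXT)
print(params_for_category(config, "person"))    # category-specific entry
print(params_for_category(config, "forklift"))  # not configured, defaults apply
```

A category that is missing from `category_specific` simply receives the `default` block, which matches the fallback behavior described above.
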
### Shape Detection and Analysis
- **ML-based Shape Classification**: Detects geometric patterns using feature extraction
- **Size Calculations**: Provides precise measurements for each detected shape type
- **Supported Shapes**:
  - **Circle**: radius, diameter, area, circumference
  - **Rectangle**: width, height, area, perimeter, corner points
  - **Line**: length, endpoints, width spread
  - **Irregular**: bounding box dimensions, point spread
#### Shape Detection Logic
```mermaid
flowchart TD
A[Cluster Points Input] --> B{Sufficient Points?}
B -->|< 3 points| C[Insufficient Points]
B -->|≥ 3 points| D[Calculate Features]
D --> E[Extract Distance and Angle Features]
E --> F[Calculate Centroid]
F --> G[Measure Distance Variance]
G --> H{Distance Variance < 0.5?}
H -->|Yes| I[Circle Formation]
H -->|No| J{Exactly 4 Points?}
J -->|Yes| K[Check Quadrant Distribution]
K --> L{≥ 3 Quadrants?}
L -->|Yes| M[Rectangle Formation]
L -->|No| N[Continue Analysis]
J -->|No| O{≥ 5 Points?}
O -->|Yes| P[Analyze Angle Distribution]
P --> Q{Uniform Distribution?}
Q -->|Yes| R[Large Circle Formation]
Q -->|No| S[Check Linear Formation]
S --> T{Low Triangle Areas?}
T -->|Yes| U[Line Formation]
T -->|No| V[Irregular Shape]
O -->|No| N
N --> S
%% Shape calculations
I --> I1[Calculate: radius, diameter, area, circumference]
M --> M1[Calculate: width, height, area, perimeter, corners]
R --> R1[Calculate: radius, diameter, area, circumference]
U --> U1[Calculate: length, endpoints, width spread]
V --> V1[Calculate: bounding box, point spread]
```
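
The first decision in the flowchart (centroid, distance variance, circle check) can be sketched as follows. This is a minimal sketch, not the service's implementation: the function names are hypothetical, and it assumes the variance is the population variance of raw centroid distances in meters, which the diagram does not specify.

```python
import math
from statistics import pvariance

CIRCLE_VARIANCE_THRESHOLD = 0.5  # threshold taken from the flowchart above
MIN_POINTS = 3                   # fewer points are reported as insufficient


def centroid(points):
    """Mean x and mean y of a list of (x, y) points."""
    xs, ys = zip(*points)
    return (sum(xs) / len(xs), sum(ys) / len(ys))


def distance_variance(points):
    """Variance of each point's distance to the centroid (assumed definition)."""
    cx, cy = centroid(points)
    distances = [math.hypot(x - cx, y - cy) for x, y in points]
    return pvariance(distances)


def looks_like_circle(points):
    """Apply the 'Distance Variance < 0.5' branch of the flowchart."""
    if len(points) < MIN_POINTS:
        return False
    return distance_variance(points) < CIRCLE_VARIANCE_THRESHOLD


# Four points on a ring of radius 2 m, and five points spread along a line.
ring = [(2 * math.cos(a), 2 * math.sin(a))
        for a in (0, math.pi / 2, math.pi, 3 * math.pi / 2)]
row = [(0, 0), (5, 0), (10, 0), (15, 0), (20, 0)]
print(looks_like_circle(ring), looks_like_circle(row))
```

Points on a ring are all roughly the same distance from the centroid, so their distance variance is near zero, while points spread along a line produce a large variance and fall through to the later branches.
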
### Velocity Analysis and Movement Patterns
- **Movement Classification**: 6 distinct movement patterns
- **Velocity Statistics**: Comprehensive speed and direction analysis
- **Pattern Types**:
  - `stationary` - Objects with minimal movement
  - `coordinated_parallel` - Synchronized movement in same direction
  - `converging` - Objects moving toward cluster center
  - `diverging` - Objects moving away from cluster center
  - `loosely_coordinated` - Some coordination but not highly synchronized
  - `chaotic` - Random or unpredictable movement patterns
#### Velocity Analysis Logic
```mermaid
graph TD
A[Velocity Analysis] --> B{Speed Check}
B -->|< 0.1 m/s| C[Stationary]
B -->|≥ 0.1 m/s| D{Coherence Check}
D -->|High Coherence| E[Coordinated Parallel]
D -->|Low Coherence| F{Direction Analysis}
F -->|Toward Center| G[Converging]
F -->|Away from Center| H[Diverging]
F -->|Mixed| I[Chaotic]
```
## Category-Specific Clustering
The service optimizes DBSCAN parameters based on object categories, providing more accurate clustering for different object types:
### Benefits
- **Optimized Parameters**: Each object type uses clustering parameters optimized for its spatial characteristics
- **Better Accuracy**: Improved clustering accuracy by considering object-specific grouping behaviors
- **Automatic Selection**: Parameters are selected based on detected object category
- **Fallback Support**: Unknown categories use sensible default parameters
### Category Optimization Examples
| Category | eps (meters) | min_samples | Rationale |
| ------------ | ------------ | ----------- | ---------------------------------------- |
| `person` | 2.0 | 2 | Social distancing, queue formations |
| `vehicle` | 4.0 | 2 | Parking lots, traffic clusters |
| `bicycle` | 1.5 | 2 | Bike racks, tight group riding |
| `motorcycle` | 2.5 | 2 | Moderate spacing for motorcycle clusters |
| `truck` | 5.0 | 2 | Large vehicle spacing requirements |
| `bus` | 6.0 | 2 | Bus stops, depot formations |
| `default` | 1.0 | 3 | Fallback for unknown categories |
### Usage in Analysis
The service automatically applies appropriate parameters when processing each object category, with user customizations taking precedence:
```python
# Dynamic parameter selection with user overrides
for category, objects in objects_by_category.items():
    # Get user-configured parameters for this scene and category
    dbscan_params = self.get_dbscan_params_for_category(category, scene_id)
    clustering = DBSCAN(eps=dbscan_params['eps'],
                        min_samples=dbscan_params['min_samples'])
```
### **Cluster Tracking System**
The service includes advanced temporal tracking with state transitions and confidence scoring. These parameters are currently **hardcoded constants** in the implementation and are not user-configurable through `config.json`.
#### State Transition Parameters (Hardcoded)
| Parameter | Value | Description |
| -------------------- | ----- | ---------------------------------------- |
| `FRAMES_TO_ACTIVATE` | 3 | Frames needed to transition NEW → ACTIVE |
| `FRAMES_TO_STABLE` | 20 | Frames needed for ACTIVE → STABLE |
| `FRAMES_TO_FADE` | 15 | Missed frames before FADING state |
| `FRAMES_TO_LOST` | 10 | Missed frames before LOST state |
#### Confidence Parameters (Hardcoded)
| Parameter | Value | Description |
| -------------------------------- | ----- | ------------------------------------ |
| `INITIAL_CONFIDENCE` | 0.5 | Starting confidence for new clusters |
| `ACTIVATION_THRESHOLD` | 0.6 | Confidence needed for activation |
| `STABILITY_THRESHOLD` | 0.7 | Confidence needed for stable state |
| `CONFIDENCE_MISS_PENALTY` | 0.1 | Confidence penalty per missed frame |
| `CONFIDENCE_MAX_MISS_PENALTY` | 0.5 | Maximum cumulative miss penalty |
| `CONFIDENCE_LONGEVITY_BONUS_MAX` | 0.2 | Maximum bonus for long-term tracking |
| `CONFIDENCE_LONGEVITY_FRAMES` | 100 | Frames to reach max longevity bonus |
#### Archival Parameters (Hardcoded)
| Parameter | Value | Description |
| ------------------------ | ----- | -------------------------------------- |
| `ARCHIVE_TIME_THRESHOLD` | 5.0 | Seconds before archiving lost clusters |
| `MAX_ARCHIVED_CLUSTERS` | 50 | Maximum number of archived clusters |
#### Cluster Lifecycle States
| State | Description | Transition Trigger |
| -------- | ------------------------------------ | ------------------------------------------ |
| `NEW` | Just detected, awaiting confirmation | Initial detection |
| `ACTIVE` | Confirmed and consistently detected | 3+ consecutive detections, confidence >0.6 |
| `STABLE` | Long-term stable presence | 20+ frames detected, stability >0.7 |
| `FADING` | Recently missed detections | 15+ consecutive missed frames |
| `LOST` | Not detected for extended period | 10+ consecutive missed frames |
#### Confidence Calculation
Cluster tracking confidence is calculated using:
```python
# Base confidence from detection ratio
base_confidence = frames_detected / total_frames
# Penalty for recent misses
miss_penalty = min(frames_missed * 0.1, 0.5)
# Bonus for long-term tracking
longevity_bonus = min(frames_detected / 100, 0.2)
# Final confidence (clamped 0-1)
confidence = clamp(base_confidence - miss_penalty + longevity_bonus, 0.0, 1.0)
```
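
A runnable version of the formula above might look like the sketch below. The constants come from the hardcoded parameter tables; the function name `tracking_confidence` and the handling of `total_frames <= 0` are assumptions made for illustration.

```python
# Values taken from the "Confidence Parameters (Hardcoded)" table.
CONFIDENCE_MISS_PENALTY = 0.1
CONFIDENCE_MAX_MISS_PENALTY = 0.5
CONFIDENCE_LONGEVITY_BONUS_MAX = 0.2
CONFIDENCE_LONGEVITY_FRAMES = 100


def clamp(value, low, high):
    """Limit value to the closed interval [low, high]."""
    return max(low, min(value, high))


def tracking_confidence(frames_detected, frames_missed, total_frames):
    """Hypothetical helper that mirrors the confidence formula above."""
    if total_frames <= 0:
        return 0.0  # assumption: no observations means no confidence
    base_confidence = frames_detected / total_frames
    miss_penalty = min(frames_missed * CONFIDENCE_MISS_PENALTY,
                       CONFIDENCE_MAX_MISS_PENALTY)
    longevity_bonus = min(frames_detected / CONFIDENCE_LONGEVITY_FRAMES,
                          CONFIDENCE_LONGEVITY_BONUS_MAX)
    return clamp(base_confidence - miss_penalty + longevity_bonus, 0.0, 1.0)


# 15 of 20 frames detected, no recent misses: 0.75 - 0.0 + 0.15
print(round(tracking_confidence(15, 0, 20), 3))
# 30 of 40 frames detected, 3 recent misses: 0.75 - 0.3 + 0.2
print(round(tracking_confidence(30, 3, 40), 3))
```

With the formula as written, the longevity bonus reaches its 0.2 cap once 20 frames have been detected.
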
## **WebUI Features and Real-time Visualization**
The integrated WebUI provides a comprehensive interface for cluster analysis monitoring and configuration:
### **Interactive Visualization**
- **Real-time Canvas**: Live updating visualization of objects and clusters
- **Pan and Zoom**: Navigate through scene data with mouse controls
- **Object Display**: Individual objects colored by cluster assignment
- **Cluster Shapes**: Visual representation of detected cluster geometries
- **Movement Vectors**: Optional display of cluster movement with adjustable scaling
- **Auto-fit**: Automatic view adjustment to focus on current scene data
### **Dynamic Parameter Configuration**
- **Per-Category Controls**: Independent parameter adjustment for each object category
- **Real-time Updates**: Changes apply immediately with automatic re-clustering
- **Scene-Specific Settings**: Each scene maintains its own parameter configuration
- **Reset to Defaults**: Quick restoration of default parameters per category
- **Visual Feedback**: Immediate visualization of parameter change effects
### **Scene Management**
- **Multi-Scene Support**: Switch between available scenes dynamically
- **Auto-Discovery**: Scenes are automatically discovered from MQTT traffic
- **Current Data Focus**: Always displays current state without historic accumulation
- **Object Count Display**: Real-time object and cluster statistics
### **Advanced Controls**
- **Refresh Rate**: Configurable from real-time to custom intervals
- **Movement Vector Scaling**: Adjustable visualization scale for velocity vectors
- **Connection Status**: Live MQTT connection monitoring
- **Parameter Validation**: Intelligent validation based on actual scene data
### **Insufficient Points Handling**
- **Individual Object Coloring**: Objects are colored by category when clusters cannot be formed
- **Clear Messaging**: Visual indication when clustering is not possible
- **Dynamic Thresholds**: Uses user-configured min_samples rather than global defaults
## MQTT Topics and Data Flow
### Input Topics
- **Topic**: `scenescape/regulated/scene/{scene_id}`
- **Purpose**: Receives object detection data from Intel® SceneScape scenes
- **Format**: JSON with objects array and scene metadata
- **Contains**: Scene name, timestamp, object detections with world coordinates
### Output Topics
- **Topic**: `scenescape/analytics/clusters/{scene_id}`
- **Purpose**: Publishes cluster analysis results
- **QoS**: 1 (at least once delivery)
- **Optimized Structure**: Contains only cluster data without redundant scene metadata
### Topic Structure Changes
**Recent Optimization**: Scene identification is now derived from topic structure rather than payload content:
- **Scene ID**: Extracted from topic path (`{scene_id}` component)
- **Scene Name**: Retrieved from the DATA_REGULATED topic
- **Cluster Data**: The payload published to the ANALYTICS_CLUSTERS topic contains only analysis results
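
A subscriber can recover the scene ID from the topic path with a few lines of string handling. The sketch below assumes the output topic layout listed above; the helper name `scene_id_from_topic` is hypothetical.

```python
ANALYTICS_PREFIX = "scenescape/analytics/clusters/"


def scene_id_from_topic(topic):
    """Return the {scene_id} component of a cluster topic, or None if it does not match."""
    if not topic.startswith(ANALYTICS_PREFIX):
        return None
    scene_id = topic[len(ANALYTICS_PREFIX):]
    # Reject an empty ID or additional path levels.
    if not scene_id or "/" in scene_id:
        return None
    return scene_id


topic = "scenescape/analytics/clusters/3bc091c7-e449-46a0-9540-29c499bca18c"
print(scene_id_from_topic(topic))
```
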
## Output Data Structure
The Cluster Analytics service publishes optimized cluster metadata in batch format. **Note**: Scene identification is extracted from topic structure, not payload content.
### Cluster Batch Format
```json
{
"scene_id": "3bc091c7-e449-46a0-9540-29c499bca18c",
"scene_name": "Retail",
"timestamp": "2025-10-21T09:16:41.377Z",
"total_clusters": 2,
"clusters": [
{
"cluster_id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
"category": "person",
"objects_in_cluster": 8,
"cluster_center": {
"x": 4.291512867202579,
"y": 4.934464049998539
},
"shape_analysis": {
"shape": "circle",
"size": {
"radius": 0.38788961696255303,
"diameter": 0.7757792339251061,
"area": 0.4726788625738194,
"circumference": 2.437182342106631
}
},
"velocity_analysis": {
"movement_type": "chaotic",
"average_velocity": [-0.19217192568910546, -0.0763952946379476, 0.0],
"velocity_magnitude": 0.20680012104899237,
"movement_direction_degrees": -158.32038869788497,
"velocity_coherence": 0.0
},
"object_ids": [
"69de7c1c-21da-45bc-ae45-2f1d3d16d5b2",
"5baec5fa-c961-4dc0-a254-f1f614292619",
"bf1923d8-ac12-4042-9e76-9b57b351efcb",
"e6333708-3793-4e44-9b29-e1b7e0e7977c",
"d9b6d6a9-d390-47a4-a9b8-95af121103ca",
"9be324af-c0a5-4495-bae6-33d251e88366",
"166ba387-9b4e-406d-b236-a30bb274a800",
"71a1b1f6-8e14-4a22-a656-011fa4405c43"
],
"dbscan_params": {
"eps": 0.5,
"min_samples": 3,
"category": "person"
},
"tracking": {
"tracking_id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
"state": "active",
"confidence": 0.875,
"stability_score": 0.623,
"frames_detected": 15,
"frames_missed": 0,
"age_seconds": 2.5,
"time_since_last_seen": 0.03,
"first_seen": 1729501599.234,
"last_seen": 1729501601.734,
"predicted_position": {
"x": 4.32,
"y": 4.91
}
}
}
],
"summary": {
"categories": ["person"],
"total_objects_in_clusters": 8
},
"tracking_statistics": {
"active_clusters": 2,
"archived_clusters": 5,
"clusters_by_state": {
"new": 0,
"active": 1,
"stable": 1,
"fading": 0,
"lost": 0
},
"tracked_scenes": 2,
"tracked_categories": 1
}
}
```
## Field Descriptions
### Batch-Level Fields
| Field | Type | Description |
| ----------------------------------- | ------- | ---------------------------------------------- |
| `scene_id` | String | Unique scene identifier (UUID) |
| `scene_name` | String | Human-readable scene name |
| `timestamp` | String | ISO 8601 timestamp when clusters were detected |
| `total_clusters` | Integer | Total number of clusters in this batch |
| `clusters` | Array | Array of individual cluster objects |
| `summary.categories` | Array | List of object categories that formed clusters |
| `summary.total_objects_in_clusters` | Integer | Total objects across all clusters |
| `tracking_statistics` | Object | Global tracking system statistics |
### Individual Cluster Fields
| Field | Type | Description |
| -------------------- | ------- | ------------------------------------------------- |
| `cluster_id` | String | Unique persistent cluster UUID |
| `category` | String | Object detection category (person, vehicle, etc.) |
| `objects_in_cluster` | Integer | Number of objects forming the cluster |
| `object_ids` | Array | List of object UUIDs that form this cluster |
| `dbscan_params` | Object | User-configured DBSCAN parameters used |
| `tracking` | Object | Temporal tracking metadata (see below) |
### Spatial Information
| Field | Type | Description |
| ------------------ | ----- | ---------------------------------------------------- |
| `cluster_center.x` | Float | X coordinate of cluster centroid (world coordinates) |
| `cluster_center.y` | Float | Y coordinate of cluster centroid (world coordinates) |
### Shape Analysis
| Field | Type | Description |
| ---------------------- | ------ | --------------------------------------------------------------- |
| `shape_analysis.shape` | String | Detected shape type: `circle`, `rectangle`, `line`, `irregular` |
| `shape_analysis.size` | Object | Shape-specific measurements (varies by shape type) |
#### Shape-Specific Size Fields
**Circle:**
- `radius` - Circle radius in meters
- `diameter` - Circle diameter in meters
- `area` - Circle area in square meters
- `circumference` - Circle circumference in meters
**Rectangle:**
- `width` - Rectangle width in meters
- `height` - Rectangle height in meters
- `area` - Rectangle area in square meters
- `perimeter` - Rectangle perimeter in meters
- `corner_points` - Array of [x,y] corner coordinates
**Line:**
- `length` - Line length in meters
- `endpoints` - Array of two [x,y] endpoint coordinates
- `width_spread` - Standard deviation of perpendicular distances
**Irregular:**
- `bounding_width` - Bounding box width in meters
- `bounding_height` - Bounding box height in meters
- `bounding_area` - Bounding box area in square meters
- `point_spread` - Standard deviation of distances from centroid
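
For circles, the four size fields are related by standard geometry, so any one of them determines the rest. The sketch below derives them from a radius and compares the result with the circle in the sample payload; the helper name `circle_size` is hypothetical.

```python
import math


def circle_size(radius):
    """Derive the circle size fields from a radius given in meters."""
    return {
        "radius": radius,
        "diameter": 2 * radius,
        "area": math.pi * radius ** 2,
        "circumference": 2 * math.pi * radius,
    }


# Radius taken from the sample payload in "Cluster Batch Format".
size = circle_size(0.38788961696255303)
for name, value in size.items():
    print(f"{name}: {value:.4f}")
```

The derived diameter, area, and circumference agree with the values published in the sample payload (about 0.7758 m, 0.4727 m², and 2.4372 m).
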
### Velocity Analysis
| Field | Type | Description |
| ---------------------------- | ------------ | ------------------------------------------- |
| `movement_type` | String | Classified movement pattern |
| `average_velocity` | Array[Float] | [vx, vy, vz] average velocity vector in m/s |
| `velocity_magnitude` | Float | Average speed magnitude in m/s |
| `movement_direction_degrees` | Float | Movement direction in degrees (-180 to 180) |
| `velocity_coherence` | Float | Movement synchronization measure (0-1) |
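
The relationship between `average_velocity`, `velocity_magnitude`, and `movement_direction_degrees` can be sketched as below. This assumes both values are derived from the x and y components of the average velocity vector; the sample payload is consistent with that, but the source does not state it, and the helper name `velocity_summary` is hypothetical.

```python
import math


def velocity_summary(average_velocity):
    """Return (speed in m/s, direction in degrees) from a [vx, vy, vz] vector.

    Assumption: only the planar components vx and vy are used.
    """
    vx, vy = average_velocity[0], average_velocity[1]
    magnitude = math.hypot(vx, vy)
    # atan2 yields an angle in the range -180 to 180 degrees.
    direction = math.degrees(math.atan2(vy, vx))
    return magnitude, direction


# average_velocity taken from the sample payload.
magnitude, direction = velocity_summary(
    [-0.19217192568910546, -0.0763952946379476, 0.0]
)
print(f"{magnitude:.4f} m/s at {direction:.2f} degrees")
```
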
### Tracking Metadata
| Field | Type | Description |
| ------------------------------- | ------- | ------------------------------------------------------- |
| `tracking.tracking_id` | String | Persistent cluster UUID (same as cluster_id) |
| `tracking.state` | String | Current lifecycle state (new/active/stable/fading/lost) |
| `tracking.confidence` | Float | Tracking confidence score (0-1) |
| `tracking.stability_score` | Float | Cluster stability metric (0-1) |
| `tracking.frames_detected` | Integer | Total frames where cluster was detected |
| `tracking.frames_missed` | Integer | Consecutive frames where cluster was not detected |
| `tracking.age_seconds` | Float | Time since first detection (seconds) |
| `tracking.time_since_last_seen` | Float | Time since last detection (seconds) |
| `tracking.first_seen` | Float | Unix timestamp of first detection |
| `tracking.last_seen` | Float | Unix timestamp of last detection |
| `tracking.predicted_position.x` | Float | Predicted X coordinate for next frame |
| `tracking.predicted_position.y` | Float | Predicted Y coordinate for next frame |
### Tracking Statistics
| Field | Type | Description |
| ---------------------------------------- | ------- | ----------------------------------------- |
| `tracking_statistics.active_clusters` | Integer | Total active clusters across all scenes |
| `tracking_statistics.archived_clusters` | Integer | Total archived (lost) clusters |
| `tracking_statistics.clusters_by_state` | Object | Count of clusters in each lifecycle state |
| `tracking_statistics.tracked_scenes` | Integer | Number of scenes with active clusters |
| `tracking_statistics.tracked_categories` | Integer | Number of object categories being tracked |
### Movement Pattern Classifications
| Pattern | Description | Criteria |
| ---------------------- | ----------------------- | -------------------------------------------- |
| `stationary` | Minimal movement | Average speed < 0.1 m/s |
| `coordinated_parallel` | Synchronized movement | Velocity coherence > 0.3 |
| `converging` | Moving toward center | >60% objects moving toward cluster center |
| `diverging` | Moving away from center | >60% objects moving away from cluster center |
| `loosely_coordinated` | Some coordination | Velocity coherence 0.2-0.3 |
| `chaotic` | Random movement | Low velocity coherence, mixed directions |
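
The criteria in this table can be read as a sequence of checks. The sketch below is one possible ordering, not the service's implementation: the function name, the order in which the rules are evaluated, and the treatment of the 0.2 and 0.3 boundaries are assumptions, and the coherence and direction fractions are taken as precomputed inputs.

```python
STATIONARY_SPEED = 0.1    # m/s
PARALLEL_COHERENCE = 0.3
LOOSE_COHERENCE = 0.2
DIRECTION_FRACTION = 0.6  # share of objects moving toward or away from center


def classify_movement(avg_speed, coherence, toward_fraction, away_fraction):
    """Map precomputed velocity statistics to a movement pattern name."""
    if avg_speed < STATIONARY_SPEED:
        return "stationary"
    if coherence > PARALLEL_COHERENCE:
        return "coordinated_parallel"
    if toward_fraction > DIRECTION_FRACTION:
        return "converging"
    if away_fraction > DIRECTION_FRACTION:
        return "diverging"
    if coherence >= LOOSE_COHERENCE:
        return "loosely_coordinated"
    return "chaotic"


# Speed and coherence from the sample payload; the fractions are made up.
print(classify_movement(0.2068, 0.0, 0.4, 0.4))
```

With the speed and coherence from the sample payload and no dominant direction, this ordering yields `chaotic`, which is the `movement_type` reported there.
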
### Administrative Fields
| Field | Type | Description |
| --------------------------- | ------------- | ------------------------------------------------------- |
| `object_ids` | Array[String] | List of individual object IDs in the cluster |
| `dbscan_params.eps` | Float | DBSCAN epsilon parameter used for this category |
| `dbscan_params.min_samples` | Integer | DBSCAN minimum samples parameter used for this category |
| `dbscan_params.category` | String | Object category for which parameters were optimized |
## Production Data Analysis
### Real Deployment Performance
Based on an actual production deployment on `broker.scenescape.intel.com`:
- **Active Scenes**: "Queuing" (`302cf49a-97ec-402d-a324-c5077b280b7b`), "Retail" (`3bc091c7-e449-46a0-9540-29c499bca18c`)
- **Object Volume**: 62 person objects per frame in busy queuing scenarios
- **Cluster Formation**: Typically 2 clusters formed (42-43 objects in main cluster, 4 objects in secondary cluster)
- **Noise Points**: 15-17 unclustered objects (24-27% noise ratio)
- **Shape Patterns**: 100% circle formations observed in production
- **Movement Types**: Mix of "chaotic" (main clusters) and "stationary" (small clusters)
### Performance Characteristics
- **Processing Speed**: Real-time analysis of 60+ objects per frame
- **Network Connectivity**: Reliable MQTT connectivity to production broker
- **Shape Detection**: Consistent circle detection with radius measurements 0.16-0.87 meters
- **Velocity Analysis**: Accurate movement classification with coherence measurements
## Usage Examples
### Real-time Monitoring
Subscribe to the ANALYTICS_CLUSTERS topic to receive live cluster updates:
```bash
mosquitto_sub -h broker.scenescape.intel.com -t "scenescape/analytics/clusters/+" -v
```
### Processing Cluster Data
Example Python code to process cluster metadata with tracking information:
```python
import json
import traceback

import paho.mqtt.client as mqtt


def on_message(client, userdata, message):
    try:
        cluster_batch = json.loads(message.payload.decode())
        scene_name = cluster_batch['scene_name']
        scene_id = cluster_batch['scene_id']
        total_clusters = cluster_batch['total_clusters']
        print(f"\n=== Scene: {scene_name} ({scene_id}) ===")
        print(f"Total Clusters: {total_clusters}")

        # Process tracking statistics
        stats = cluster_batch.get('tracking_statistics', {})
        print("\nTracking Statistics:")
        print(f"  Active Clusters: {stats.get('active_clusters', 0)}")
        print(f"  Archived Clusters: {stats.get('archived_clusters', 0)}")
        state_counts = stats.get('clusters_by_state', {})
        print(f"  States: {state_counts}")

        # Process individual clusters
        for cluster in cluster_batch['clusters']:
            cluster_id = cluster['cluster_id']
            category = cluster['category']
            object_count = cluster['objects_in_cluster']

            # Tracking information
            tracking = cluster['tracking']
            state = tracking['state']
            confidence = tracking['confidence']
            stability = tracking['stability_score']
            age_seconds = tracking['age_seconds']
            print(f"\n--- Cluster {cluster_id[:8]}... ---")
            print(f"  Category: {category}")
            print(f"  Objects: {object_count}")
            print(f"  State: {state}")
            print(f"  Confidence: {confidence:.3f}")
            print(f"  Stability: {stability:.3f}")
            print(f"  Age: {age_seconds:.1f}s")
            print(f"  Frames Detected: {tracking['frames_detected']}")
            print(f"  Frames Missed: {tracking['frames_missed']}")

            # Movement and shape analysis
            movement_type = cluster['velocity_analysis']['movement_type']
            shape = cluster['shape_analysis']['shape']
            print(f"  Movement: {movement_type}")
            print(f"  Shape: {shape}")

            # Shape-specific measurements
            if shape == "circle":
                radius = cluster['shape_analysis']['size']['radius']
                print(f"  Circle radius: {radius:.2f}m")
            elif shape == "rectangle":
                width = cluster['shape_analysis']['size']['width']
                height = cluster['shape_analysis']['size']['height']
                print(f"  Rectangle: {width:.2f}m x {height:.2f}m")

            # Predicted position for next frame
            pred_pos = tracking['predicted_position']
            if pred_pos['x'] is not None:
                print(f"  Predicted Position: ({pred_pos['x']:.2f}, {pred_pos['y']:.2f})")
    except Exception as e:
        print(f"Error processing cluster data: {e}")
        traceback.print_exc()


# With paho-mqtt 2.x, pass a callback API version as the first argument,
# for example mqtt.Client(mqtt.CallbackAPIVersion.VERSION2).
client = mqtt.Client()
client.on_message = on_message
client.connect("broker.scenescape.intel.com", 1883, 60)
client.subscribe("scenescape/analytics/clusters/+")
client.loop_forever()
```
## **Cluster Tracking Algorithm**
### Overview
The Cluster Analytics service implements a cluster tracking system to maintain cluster identities across video frames. This enables long-term analysis of cluster behavior, movement patterns, and lifecycle dynamics.
### Tracking Pipeline
```mermaid
graph TD
A[New Frame Detection] --> B[Group by Category]
B --> C[Get Existing Clusters]
C --> D[Hungarian Matching]
D --> E{Match Found?}
E -->|Yes| F[Update Cluster]
E -->|No| G[Create New Cluster]
F --> H[Update Confidence]
G --> I[Initialize with NEW state]
H --> J[Update State Machine]
I --> J
J --> K[Update History]
K --> L[Predict Next Position]
L --> M{Check Unmatched Clusters}
M --> N[Mark as Missed]
N --> O[Reduce Confidence]
O --> P[Update State]
P --> Q[Archive if LOST]
```
### Hungarian Matching Algorithm
The system uses the Hungarian algorithm with a multi-feature cost matrix to optimally match new detections to existing tracked clusters:
**Cost Calculation:**
```python
# Hard constraint: must be same category
if tracked.category != detection.category:
    return INFINITE_COST
# Multi-feature cost matrix (weighted)
position_cost = distance(predicted_position, detection_position) * 0.4
velocity_cost = distance(tracked_velocity, detection_velocity) * 0.3
size_cost = abs(tracked_size - detection_size) * 0.2
shape_cost = (1.0 if shapes_match else 2.0) * 0.1
total_cost = position_cost + velocity_cost + size_cost + shape_cost
```
**Matching Process:**
1. Build cost matrix for all (cluster, detection) pairs
2. Apply Hungarian algorithm for optimal assignment
3. Filter matches by maximum distance threshold (default: 5.0 meters)
4. Return valid matches with similarity scores
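
The matching steps above can be sketched with the standard library alone. Two simplifications are swapped in and should not be read as the service's method: a brute-force search over permutations stands in for the Hungarian algorithm (it finds the same minimum-cost assignment but is only practical for a handful of clusters), and the cost uses only the position term. All function names are hypothetical; a real deployment would more likely use an optimized solver such as SciPy's `linear_sum_assignment`.

```python
import math
from itertools import permutations

MAX_MATCH_DISTANCE = 5.0  # meters, the default threshold listed above


def best_assignment(cost):
    """Return (row, col) pairs with minimum total cost, found by brute force."""
    n_rows = len(cost)
    n_cols = len(cost[0]) if cost else 0
    if n_rows == 0 or n_cols == 0:
        return []
    if n_rows > n_cols:
        # More clusters than detections: solve the transposed problem.
        transposed = [list(col) for col in zip(*cost)]
        return sorted((r, c) for c, r in best_assignment(transposed))
    best_cols = min(
        permutations(range(n_cols), n_rows),
        key=lambda cols: sum(cost[r][c] for r, c in enumerate(cols)),
    )
    return list(enumerate(best_cols))


def match_clusters(predicted, detections, max_distance=MAX_MATCH_DISTANCE):
    """Match predicted cluster positions to detections, dropping distant pairs."""
    cost = [[math.dist(p, d) for d in detections] for p in predicted]
    return [(r, c) for r, c in best_assignment(cost) if cost[r][c] <= max_distance]


predicted = [(0.0, 0.0), (10.0, 0.0)]
detections = [(10.5, 0.0), (0.2, 0.0), (50.0, 50.0)]
print(match_clusters(predicted, detections))  # pairs of (cluster index, detection index)
```

Detections left unmatched after the distance filter would become new clusters, and unmatched tracked clusters would be marked as missed, as in the tracking pipeline diagram.
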
### State Machine Transitions
```mermaid
stateDiagram-v2
[*] --> NEW: Detection
NEW --> ACTIVE: 3+ frames detected, confidence > 0.6
ACTIVE --> STABLE: 20+ frames detected, stability > 0.7
ACTIVE --> FADING: 15+ frames missed
STABLE --> FADING: 15+ frames missed
FADING --> ACTIVE: Redetected
FADING --> LOST: 10+ frames missed
LOST --> [*]: Archive after 5s
```
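
The transitions in the diagram can be sketched as a single function. This is an illustrative sketch under stated assumptions: the function signature is hypothetical, `frames_missed` is assumed to count consecutive misses since the cluster entered its current state (the source does not say how the FADING and LOST counters relate), and archival after 5 seconds is left out.

```python
from enum import Enum


class ClusterState(Enum):
    NEW = "new"
    ACTIVE = "active"
    STABLE = "stable"
    FADING = "fading"
    LOST = "lost"


# Thresholds from the hardcoded parameter tables.
FRAMES_TO_ACTIVATE = 3
FRAMES_TO_STABLE = 20
FRAMES_TO_FADE = 15
FRAMES_TO_LOST = 10
ACTIVATION_THRESHOLD = 0.6
STABILITY_THRESHOLD = 0.7


def next_state(state, *, detected, frames_detected, frames_missed,
               confidence, stability):
    """Return the state after one frame (hypothetical helper, see assumptions)."""
    if state is ClusterState.NEW:
        if frames_detected >= FRAMES_TO_ACTIVATE and confidence > ACTIVATION_THRESHOLD:
            return ClusterState.ACTIVE
    elif state in (ClusterState.ACTIVE, ClusterState.STABLE):
        if frames_missed >= FRAMES_TO_FADE:
            return ClusterState.FADING
        if (state is ClusterState.ACTIVE
                and frames_detected >= FRAMES_TO_STABLE
                and stability > STABILITY_THRESHOLD):
            return ClusterState.STABLE
    elif state is ClusterState.FADING:
        if detected:
            return ClusterState.ACTIVE  # redetected
        if frames_missed >= FRAMES_TO_LOST:
            return ClusterState.LOST
    return state


state = next_state(ClusterState.NEW, detected=True, frames_detected=3,
                   frames_missed=0, confidence=0.7, stability=0.0)
print(state.value)
```
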
### Confidence Metrics
**Detection Consistency:**
- Base confidence = frames_detected / total_frames
- Represents overall detection reliability
**Miss Penalty:**
- Penalty = min(frames_missed \* 0.1, 0.5)
- Reduces confidence for recent detection failures
**Longevity Bonus:**
- Bonus = min(frames_detected / 100, 0.2)
- Rewards long-term stable tracking
**Final Confidence:**
```python
confidence = clamp(base_confidence - miss_penalty + longevity_bonus, 0.0, 1.0)
```
### Stability Score
Measures cluster consistency based on recent history (last 10 observations):
**Position Stability:**
- Low position variance indicates stable location
- `position_stability = 1.0 / (1.0 + position_variance)`
**Size Stability:**
- Consistent cluster size over time
- `size_stability = 1.0 / (1.0 + size_variance)`
**Shape Consistency:**
- Frequency of most common shape
- `shape_consistency = most_common_count / total_observations`
**Combined Score:**
```python
stability_score = (
0.4 * position_stability +
0.3 * size_stability +
0.3 * shape_consistency
)
```
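
Putting the three components together, a self-contained sketch might look like this. It assumes that position variance is the sum of the x and y population variances and that an empty history scores 0.0; neither detail is stated in the source, and the function name is hypothetical.

```python
from collections import Counter
from statistics import pvariance

HISTORY_WINDOW = 10  # "last 10 observations"


def stability_score(positions, sizes, shapes):
    """Combine position, size, and shape stability over the recent history.

    positions: list of (x, y); sizes: object counts; shapes: shape names.
    """
    positions = positions[-HISTORY_WINDOW:]
    sizes = sizes[-HISTORY_WINDOW:]
    shapes = shapes[-HISTORY_WINDOW:]
    if not positions or not sizes or not shapes:
        return 0.0  # assumption: no history means no stability

    # Assumed definition: variance along x plus variance along y.
    position_variance = (pvariance([x for x, _ in positions])
                         + pvariance([y for _, y in positions]))
    position_stability = 1.0 / (1.0 + position_variance)
    size_stability = 1.0 / (1.0 + pvariance(sizes))

    most_common_count = Counter(shapes).most_common(1)[0][1]
    shape_consistency = most_common_count / len(shapes)

    return (0.4 * position_stability
            + 0.3 * size_stability
            + 0.3 * shape_consistency)


# A cluster that stays put with a constant size but alternates between two shapes.
score = stability_score([(1.0, 1.0)] * 4, [8] * 4,
                        ["circle", "line", "circle", "line"])
print(round(score, 2))
```
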
### History Management
Each tracked cluster maintains historical observations:
**Stored Data:**
- Position history: (x, y, timestamp)
- Velocity history: (vx, vy, timestamp)
- Size history: object counts
- Shape history: detected shapes
- Timestamps: frame timestamps
**Limits:**
- Maximum history size: 100 observations
- Automatic truncation when limit exceeded
- Maintains most recent observations
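
One common way to get this truncation behavior in Python is a bounded `collections.deque`, which discards the oldest entry once the limit is reached. The sketch below is an assumption about how such a history could be kept, not a description of the service's data structures; the class and method names are hypothetical.

```python
from collections import deque

MAX_HISTORY = 100  # maximum history size listed above


class ClusterHistory:
    """Bounded per-cluster history (illustrative only)."""

    def __init__(self, max_size=MAX_HISTORY):
        self.positions = deque(maxlen=max_size)  # (x, y, timestamp)
        self.sizes = deque(maxlen=max_size)      # object counts

    def record(self, x, y, timestamp, object_count):
        # Appending to a full deque silently drops the oldest observation.
        self.positions.append((x, y, timestamp))
        self.sizes.append(object_count)


history = ClusterHistory()
for frame in range(150):
    history.record(float(frame), 0.0, frame * 0.1, 8)

print(len(history.positions), history.positions[0][0])
```

After 150 recorded frames, only the most recent 100 observations remain, starting with frame 50.
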
### Prediction System
Clusters use linear extrapolation for position prediction:
```python
# Calculate average velocity from recent history (last 5 observations)
avg_velocity = mean(recent_velocities)
# Predict next position (assuming ~1 frame time delta)
predicted_position = current_position + avg_velocity
```
**Benefits:**
- Improves matching accuracy for moving clusters
- Handles temporary occlusions
- Reduces false negatives in tracking
### Archival System
**Archival Criteria:**
- Cluster state = LOST
- Time since last seen > 5.0 seconds (`ARCHIVE_TIME_THRESHOLD`, currently hardcoded)
**Archive Management:**
- Maximum 50 archived clusters (global limit)
- Oldest archived clusters removed when limit exceeded
- Preserves full history for analysis
**Statistics Tracking:**
- Active clusters count
- Archived clusters count
- Clusters by state distribution
- Tracked scenes and categories
## DBSCAN Noise Point Explanation
In the DBSCAN clustering algorithm, **noise points** are objects that do not belong to any cluster. Understanding noise points is important for interpreting analytics results in the Cluster Analytics microservice.
### DBSCAN Algorithm Overview
DBSCAN (Density-Based Spatial Clustering of Applications with Noise) classifies each data point as one of:
- **Core points**: Have at least `min_samples` neighbors within `eps` distance.
- **Border points**: Are within `eps` distance of a core point but do not have enough neighbors to be core points themselves.
- **Noise points**: Are neither core nor border points—these are isolated from other points.
### Noise Points in Cluster Analytics
In this service, noise points are objects that:
- Are farther than the configured `eps` distance (e.g., 1.5 meters) from any other object of the same category.
- Do not have enough nearby neighbors to form a cluster (fewer than `min_samples`).
**Example Scenarios:**
- **Queuing Scene**:
  - 5 people detected.
  - 3 people stand close together (within 1.5m): form 1 cluster.
  - 2 people stand alone, each more than 1.5m from others: these are noise points.
- **Retail Scene**:
  - 4 people detected.
  - 2 people are near each other: form 1 cluster.
  - 2 people are isolated: noise points.
### Code Representation
In DBSCAN output, objects labeled with `-1` are noise points. These represent people or objects that are spatially isolated and do not form meaningful groups with others of the same category.
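
Given a list of DBSCAN labels, separating clusters from noise takes only a few lines. The sketch below uses a made-up label list sized to resemble the production figures reported earlier (62 objects, two clusters, 15 noise points); the helper name `summarize_labels` is hypothetical.

```python
from collections import Counter

NOISE_LABEL = -1  # label used for noise points in DBSCAN output


def summarize_labels(labels):
    """Count objects per cluster and compute the share of noise points."""
    counts = Counter(labels)
    noise = counts.pop(NOISE_LABEL, 0)
    total = len(labels)
    return {
        "cluster_sizes": dict(counts),
        "noise_points": noise,
        "noise_ratio": noise / total if total else 0.0,
    }


# Hypothetical labels: 43 objects in cluster 0, 4 in cluster 1, 15 noise.
labels = [0] * 43 + [1] * 4 + [NOISE_LABEL] * 15
summary = summarize_labels(labels)
print(summary["cluster_sizes"], summary["noise_points"],
      f"{summary['noise_ratio']:.1%}")
```
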
### Why Noise Points Matter
Identifying noise points helps distinguish between:
- **Clustered behavior**: People or objects grouping together.
- **Individual behavior**: People or objects standing alone or isolated.
This distinction is valuable for analytics, enabling insights into both group dynamics and solitary activity within a scene.
### Logging Benefits
- **Reduced Log Volume**: Eliminates verbose JSON serialization in production
- **Performance**: Avoids expensive string formatting when not needed
- **Operational**: Clear cluster summaries for monitoring and alerting
- **Debugging**: Full metadata available when debug logging is enabled
## Contributing
When contributing to the Cluster Analytics service:
1. **Algorithm Improvements**: Enhance clustering accuracy or add new shape detection patterns
2. **Performance Optimization**: Optimize processing speed for high-volume scenarios
3. **New Movement Patterns**: Add additional velocity analysis classifications
4. **Testing**: Include unit tests for clustering and shape detection algorithms
## License
This project is licensed under the Apache 2.0 License. See the LICENSE file for details.
:::{toctree}
:hidden:
get-started.md
:::