ROS2 KPI Monitoring Overview#
Monitor, analyze, and visualize Key Performance Indicators in ROS2 systems — node latencies, CPU/memory usage, message flow, and thread-level resource distribution.
Features#
Real-time ROS2 graph monitoring: nodes, topics, message rates, processing delays
Automatic per-node input→output processing delay for every node in the graph (no --node flag required)
CPU, memory, and I/O monitoring via pidstat (thread-level or PID-only)
Cross-machine monitoring via --remote-ip (DDS peer discovery + SSH)
Interactive visualizations: heatmaps, timelines, core utilization, scatter plots
ROS bag analysis with latency tracking and CPU-cycle estimation
Organized session output with auto-generated visualizations
Prerequisites#
| Requirement | Details |
|---|---|
| ROS2 Humble / Jazzy | See Getting Started |
| Python 3.8+ | Included with Ubuntu 22.04+ |
|  |  |
|  | Installed via |
Architecture#
The monitoring stack uses a two-layer design:
┌──────────────────────────────────┐
│  ROS2 System (Local/Remote)      │
│  Node A   Node B   Node C  ...   │
└──────────┬──────────┬────────────┘
           │ DDS      │ SSH
┌──────────▼──────────▼────────────┐
│  Monitoring Stack                │
│  monitor_stack.py (Orchestrator) │
│  ├── ros2_graph_monitor.py       │
│  │     → graph_timing.csv        │
│  └── monitor_resources.py        │
│        → resource_usage.log      │
│  Auto-Visualization on exit      │
└──────────────────────────────────┘
monitor_stack.py orchestrates both monitors and saves all output to a
dated session folder, then auto-generates visualizations on exit.
ros2_graph_monitor.py subscribes to all ROS2 topics, measures message
rates and per-node input→output processing delays for every node in the graph,
and logs timing data to CSV.
monitor_resources.py detects ROS2 processes and uses pidstat to sample
CPU, memory, and I/O statistics at thread or process level.
Scripts Overview#
monitor_stack.py — Unified Entry Point#
uv run python src/monitor_stack.py [OPTIONS]
| Option | Description |
|---|---|
|  | Monitor a specific node (e.g. |
|  | Session label (default: timestamp) |
|  | Auto-stop after N seconds |
|  | Update interval (default: 5) |
|  | Where to save results |
|  | Skip resource monitoring |
|  | Skip graph monitoring |
|  | Process-level only, no thread details |
|  | Skip auto-visualization on exit |
|  | Monitor a remote machine |
|  | SSH user for remote machine (default: ubuntu) |
|  | List previous sessions and exit |
ros2_graph_monitor.py — Graph and Latency Monitor#
Measures message rates and per-node input→output processing delays. Processing
delay is computed for each node automatically — no --node filter needed.
| Option | Description |
|---|---|
|  | Narrow graph discovery to one node |
|  | Update interval (default: 5) |
|  | Save timing data to CSV |
|  | Show per-node delay summary table |
|  | Show topic statistics table |
|  | Configure DDS peer discovery for a remote host |
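The page does not say how ros2_graph_monitor.py derives the per-node input→output delay. As a rough illustration only, one plausible approach pairs each output message with the most recent input message that arrived at or before it. The function name `processing_delays` and the millisecond timestamps below are hypothetical and are not taken from the tool.

```python
from bisect import bisect_right


def processing_delays(input_stamps_ms, output_stamps_ms):
    """Estimate input->output delays for one node.

    Assumption (not confirmed by the tool): each output is attributed to the
    latest input received at or before the output's timestamp.
    """
    inputs = sorted(input_stamps_ms)
    delays = []
    for out in sorted(output_stamps_ms):
        # Index of the first input strictly later than this output.
        idx = bisect_right(inputs, out)
        if idx == 0:
            continue  # no input seen yet, so no delay can be attributed
        delays.append(out - inputs[idx - 1])
    return delays


if __name__ == "__main__":
    # Inputs arrive every 100 ms; outputs follow 30, 40, and 60 ms later.
    print(processing_delays([0, 100, 200], [30, 140, 260]))  # [30, 40, 60]
```

Matching on the latest preceding input keeps the sketch simple, but it would misattribute outputs for nodes that buffer or batch several inputs before publishing.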
monitor_resources.py — CPU / Memory / I/O Monitor#
| Option | Description |
|---|---|
|  | List detected ROS2 processes and exit |
|  | Sampling interval, integer ≥ 1 (default: 1) |
|  | Include memory statistics |
|  | Include I/O statistics |
|  | Per-thread statistics |
|  | Append output to log file |
|  | Run |
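To make the relationship between these options and pidstat concrete, the sketch below builds a pidstat argument list from similar settings. The pidstat flags themselves are real (-u CPU, -r memory, -d I/O, -t threads, -p PID list, followed by the interval in seconds), but `build_pidstat_cmd` and its parameters are hypothetical; how monitor_resources.py actually invokes pidstat is not documented here.

```python
def build_pidstat_cmd(pids, interval=1, memory=False, io=False, threads=False):
    """Return an argv list for pidstat (hypothetical helper, not the tool's code)."""
    if not isinstance(interval, int) or interval < 1:
        raise ValueError("interval must be an integer >= 1")
    if not pids:
        raise ValueError("at least one PID is required")

    cmd = ["pidstat", "-u"]  # -u: CPU utilization
    if memory:
        cmd.append("-r")     # -r: page faults and memory utilization
    if io:
        cmd.append("-d")     # -d: I/O statistics
    if threads:
        cmd.append("-t")     # -t: include per-thread statistics
    # -p takes a comma-separated PID list; the trailing number is the interval.
    cmd += ["-p", ",".join(str(pid) for pid in pids), str(interval)]
    return cmd


if __name__ == "__main__":
    print(build_pidstat_cmd([101, 202], interval=2, memory=True, threads=True))
```

The resulting list can be passed to `subprocess.Popen` without a shell, which avoids quoting issues when PIDs are discovered dynamically.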
Visualization Scripts#
| Script | Purpose |
|---|---|
|  | CPU/memory plots, heatmaps, thread-core mapping |
|  | Message timestamps, frequencies, and delay plots |
|  | Interactive ROS2 computation graph topology diagram |
|  | Aggregate statistics across multiple sessions |
visualize_graph.py — Interactive Pipeline Graph#
Renders the full ROS2 computation graph as a directed topology diagram. Nodes are color-coded by category; topics are shown as labelled edges.
./src/visualize_graph.py SESSION_DIR [OPTIONS]
Run with --show to enable an interactive window where you can:
Hover over nodes and topics for tooltips
Click a node to see a detail popup with published/subscribed topics, message count, frequency (Hz), and latency mean ± std
See color-coded health indicators (green / yellow / orange / red)
Session Data Layout#
All output is saved in timestamped session folders:
monitoring_sessions/
└── 20260306_154140/
    ├── session_info.txt      # Test configuration
    ├── graph_timing.csv      # Topic timing data
    ├── resource_usage.log    # CPU/memory usage
    └── visualizations/       # Auto-generated PNG plots
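Because every session follows the same layout, a short script can check which sessions are complete. The sketch below relies only on the file and folder names shown above; `summarize_sessions` is a hypothetical helper and not part of the monitoring stack.

```python
from pathlib import Path

# Entries expected in each session folder, per the layout above.
EXPECTED = (
    "session_info.txt",
    "graph_timing.csv",
    "resource_usage.log",
    "visualizations",
)


def summarize_sessions(root):
    """Map each session folder name to {expected entry: exists?}."""
    root = Path(root)
    summary = {}
    if not root.is_dir():
        return summary  # no sessions recorded yet
    for session in sorted(p for p in root.iterdir() if p.is_dir()):
        summary[session.name] = {
            name: (session / name).exists() for name in EXPECTED
        }
    return summary


if __name__ == "__main__":
    for name, present in summarize_sessions("monitoring_sessions").items():
        missing = [entry for entry, ok in present.items() if not ok]
        status = "complete" if not missing else "missing: " + ", ".join(missing)
        print(f"{name}: {status}")
```

A session started with graph or resource monitoring skipped would be reported with the corresponding file missing, which is expected rather than an error.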