# Command Reference

## Monitoring Modes
| Mode | Tracks | Overhead | Use when |
|---|---|---|---|
| Thread (default) | Individual threads (TIDs) | ~5–10% | Debugging, optimization |
| PID (`--pid-only`) | Processes only | ~2–3% | Production, long-term runs |
## Quick Reference

| Task | Command | Duration |
|---|---|---|
| Quick check | | 30 s |
| Full monitor | `make monitor` | 60 s |
| Full monitor (PID mode) | | 60 s |
| Monitor specific node | `make monitor NODE=/slam_toolbox` | 60 s |
| Extended session | `make monitor DURATION=300` | 5 min |
| Graph only | | 60 s |
| Resources only (threads) | | 60 s |
| Resources only (PIDs) | | 60 s |
| Remote system | `make monitor-remote` | 60 s |
| Remote system (PID mode) | | 60 s |
| Pipeline graph (PNG) | `make pipeline-graph` | — |
| Pipeline graph (session) | `make pipeline-graph SESSION=<name>` | — |
| List sessions | | — |
| Re-visualize last session | | — |
| Clean all data | | — |
All make targets accept optional variables: `NODE=`, `DURATION=`, `INTERVAL=`, `SESSION=`, `REMOTE_IP=`, and `REMOTE_USER=`.

```bash
make monitor NODE=/slam_toolbox DURATION=120 INTERVAL=2
make monitor-remote REMOTE_IP=192.168.1.100 NODE=/slam_toolbox REMOTE_USER=ros
```
## monitor_stack.py

```bash
uv run python src/monitor_stack.py [OPTIONS]
```
| Option | Description |
|---|---|
| `--node` | Narrow graph discovery to one node (processing delay is still measured for all nodes) |
| `--session` | Name for this session (default: timestamp) |
| `--duration` | Auto-stop after N seconds |
| `--interval` | Update interval in seconds (default: 5) |
| | Where to save results |
| | Skip resource monitoring |
| `--resources-only` | Skip graph monitoring |
| `--pid-only` | Process-level only, no thread details |
| | Skip auto-visualization on exit |
| `--remote-ip` | Monitor a remote machine |
| | SSH user for remote machine (default: ubuntu) |
| | List previous sessions and exit |

```bash
uv run python src/monitor_stack.py --node /slam_toolbox --session my_test --duration 120
uv run python src/monitor_stack.py --remote-ip 192.168.1.100 --node /slam_toolbox
uv run python src/monitor_stack.py --resources-only --pid-only --duration 60
```
## ros2_graph_monitor.py

```bash
uv run python src/ros2_graph_monitor.py                              # All nodes
uv run python src/ros2_graph_monitor.py --node /slam_toolbox         # Scope to one node
uv run python src/ros2_graph_monitor.py --node /ctrl --log t.csv     # With CSV logging
uv run python src/ros2_graph_monitor.py --interval 2                 # Custom interval
uv run python src/ros2_graph_monitor.py --remote-ip 192.168.1.100    # Remote system
```
## monitor_resources.py

```bash
uv run python src/monitor_resources.py                               # CPU only
uv run python src/monitor_resources.py --memory --threads            # CPU + memory + threads
uv run python src/monitor_resources.py --memory --log out.log        # With logging
uv run python src/monitor_resources.py --list                        # List ROS2 processes
uv run python src/monitor_resources.py --remote-ip 192.168.1.100 --memory
```
## visualize_timing.py

```bash
uv run python src/visualize_timing.py timing.csv --delays --frequencies --output-dir ./plots/
```

| Option | Description |
|---|---|
| | Message arrival scatter plot |
| `--frequencies` | Topic message rates over time |
| `--delays` | Processing delay over time |
| | Inter-message timing / jitter |
| `--output-dir` | Save plots as PNG (omit to display interactively) |
| | Print statistics only, no plots |
## visualize_resources.py

```bash
uv run python src/visualize_resources.py resource.log --cores --heatmap --top 10 --output-dir ./plots/
uv run python src/visualize_resources.py resource.log --summary
```

| Option | Description |
|---|---|
| `--cores` | CPU utilization per core over time |
| | CPU utilization per PID/thread (top N) |
| `--heatmap` | Core utilization heatmap |
| | Thread-to-core scatter plot |
| `--top` | Number of top threads to show (default: 10) |
| `--output-dir` | Save plots as PNG |
| `--summary` | Print statistics only, no plots |

Note: `pidstat` reports CPU% where 100% = 1 full core, so on a 20-core system the maximum is 2000%. Use the Avg Cores column in `--summary` output for a human-readable figure.
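The Avg Cores figure is simply the cumulative CPU% divided by 100. A minimal sketch of the arithmetic (the 563% reading is a hypothetical example):

```bash
# pidstat-style cumulative CPU% -> average cores in use: cores = CPU% / 100.
# A 563% reading therefore means about 5.6 cores busy.
cpu_pct=563
awk -v p="$cpu_pct" 'BEGIN { printf "%.1f cores\n", p / 100 }'
# prints "5.6 cores"
```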
## visualize_graph.py

Renders the ROS2 computation graph as a directed topology diagram.

```bash
# Headless PNG
uv run python src/visualize_graph.py monitoring_sessions/<name> --no-show --output graph.png

# Interactive (click nodes to see topic detail popups)
uv run python src/visualize_graph.py monitoring_sessions/<name> --show
```

Or via make:

```bash
make pipeline-graph
make pipeline-graph SESSION=20260306_154140
```
## Grafana Dashboard Commands

| Command | Description |
|---|---|
| | Start Grafana + Prometheus (Docker) |
| | Stop the stack |
| | Check services (shows the URL http://localhost:3000) |
| | Export session metrics to Prometheus |
| | Continuously export live monitoring data |
| | Open dashboard in browser |

Metrics are exposed on port 9092 (Prometheus occupies 9090 in host-network mode). Prometheus is pre-configured to scrape `localhost:9092`.
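The matching scrape job might look like the sketch below (the job name and scrape interval are assumptions; only the `localhost:9092` target comes from the note above):

```yaml
# prometheus.yml (sketch) -- scrape the metrics exporter on port 9092
scrape_configs:
  - job_name: ros2_monitoring      # hypothetical job name
    scrape_interval: 5s            # assumed; align with your update interval
    static_configs:
      - targets: ["localhost:9092"]
```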
## Remote Monitoring

| Component | How it works |
|---|---|
| Graph monitor | DDS peer discovery via a matching `ROS_DOMAIN_ID` |
| Resource monitor | Runs `pidstat` on the remote host over SSH |

Results are stored and visualized locally on the monitoring machine.
```bash
make monitor-remote REMOTE_IP=192.168.1.100
make monitor-remote REMOTE_IP=192.168.1.100 REMOTE_USER=ros NODE=/slam_toolbox
uv run python src/monitor_stack.py --remote-ip 192.168.1.100 --pid-only --duration 120
```
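Before a remote session it can save time to verify that both machines sit in the same DDS domain; a minimal pre-flight sketch, assuming discovery needs matching `ROS_DOMAIN_ID` values (the helper name and the host/user in the comment are hypothetical):

```bash
# Compare two ROS_DOMAIN_ID values; unset/empty defaults to 0, as in ROS2.
check_domain_match() {
  local local_id="${1:-0}" remote_id="${2:-0}"
  if [ "$local_id" = "$remote_id" ]; then
    echo "domain ok ($local_id)"
  else
    echo "domain mismatch: local=$local_id remote=$remote_id"
  fi
}

# Fetch the remote value over SSH, then compare (hypothetical host/user):
#   remote_id=$(ssh ubuntu@192.168.1.100 'echo "${ROS_DOMAIN_ID:-0}"')
#   check_domain_match "${ROS_DOMAIN_ID:-0}" "$remote_id"
check_domain_match 0 0   # prints "domain ok (0)"
```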
## Troubleshooting

| Problem | Fix |
|---|---|
| No ROS2 processes found | Run your ROS2 stack first |
| Monitor exits immediately | Source your ROS2 environment first |
| Visualizations not generated | Run the visualization scripts manually on the saved session files |
| Permission denied | Run with elevated permissions (e.g. `sudo`) |
| Remote: no data | Check SSH auth and matching `ROS_DOMAIN_ID` |
| CPU shows e.g. “563%” | Normal: 100% = 1 core; check the Avg Cores column |
| Graph click does nothing | Use `--show` (interactive mode) |