# Building from Source

This guide explains how to build Metrics Manager from source code for development, customization, or air-gapped deployments.

## Prerequisites

Before building, ensure you have:

- **System Requirements**: See [System Requirements](./system-requirements.md)
- **Source code**: A local clone of the repository (`git clone https://github.com/open-edge-platform/edge-ai-libraries.git -b main`)
- **Platform familiarity**: Basic understanding of Docker and Docker Compose

## Building in a Docker Container (Recommended)

The recommended way to build is inside Docker, which handles all dependencies (Rust toolchain, Telegraf, Python) automatically.

### Step 1: Navigate to the Project

```bash
cd edge-ai-libraries/metrics-manager
```

### Step 2: Copy Environment Configuration

```bash
cp .env.example .env
```

Edit `.env` if needed; the defaults work for most setups.

### Step 3: Build the Docker Image

```bash
docker compose build
```

This runs a multi-stage build:

1. **Stage 1**: Compiles qmassa (Intel GPU reader) from Rust source
2. **Stage 2**: Installs Python dependencies (production only, no test packages)
3. **Stage 3**: Creates a test image with test dependencies (optional)
4. **Stage 4**: Production image based on `python:3.12-slim` with Telegraf, qmassa, and supervisord

**First build duration**: 3–10 minutes (depends on download speeds and CPU)

**Subsequent builds**: <1 minute (cached layers)

### Step 4: Start the Service

```bash
docker compose up
```

Or in the background:

```bash
docker compose up -d
```

### Step 5: Verify

```bash
curl http://localhost:9090/health
```
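
When the stack is started in the background, the service may need a few seconds before `/health` responds. The sketch below polls the endpoint until it answers. It assumes a healthy service returns HTTP 200, which this guide does not state, and the helper names `fetch_status` and `wait_for_health` are illustrative rather than part of Metrics Manager.

```python
import time
import urllib.error
import urllib.request


def fetch_status(url, timeout=2.0):
    """Return the HTTP status code for a GET request to `url`."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.status


def wait_for_health(url, attempts=30, delay=1.0, fetch=fetch_status, sleep=time.sleep):
    """Poll `url` until it answers with HTTP 200 or the attempts run out."""
    for attempt in range(attempts):
        try:
            # Assumption: HTTP 200 means the service is healthy.
            if fetch(url) == 200:
                return True
        except (urllib.error.URLError, OSError):
            # The service is not accepting connections yet; retry below.
            pass
        if attempt < attempts - 1:
            sleep(delay)
    return False
```

Calling `wait_for_health("http://localhost:9090/health")` returns `True` once the endpoint answers and `False` after the attempts are exhausted. The `fetch` and `sleep` parameters exist only so the loop can be exercised without a running service.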

---

## Building Locally (Without Docker)

To run the service directly on the host without Docker, complete the steps in this section.

### Prerequisites

- **Python 3.10+** (`python3 --version`)
- **uv** (fast Python package manager): `pip install uv`, or see the [uv documentation](https://docs.astral.sh/uv/)
- **Telegraf** installed on the system (for system metrics)
- **qmassa binary** (for GPU metrics): compile it from the [qmassa GitHub repository](https://github.com/ulissesf/qmassa), or skip it if you do not need GPU metrics
- **Git**

### Step 1: Clone and Navigate

```bash
git clone https://github.com/open-edge-platform/edge-ai-libraries.git -b main
cd edge-ai-libraries/metrics-manager
```

### Step 2: Create a Virtual Environment

```bash
python3 -m venv .venv
source .venv/bin/activate  # On Windows: .venv\Scripts\activate
```

Or with uv (faster):

```bash
uv venv
source .venv/bin/activate
```

### Step 3: Install Dependencies

```bash
uv sync --group test
```

Or with pip:

```bash
pip install -e ".[test]"
```

### Step 4: Configure the Environment

```bash
cp .env.example .env
```

Edit `.env` and set:

```bash
# Since Telegraf runs on the host (not in a container), use localhost
TELEGRAF_PORT=9273
TELEGRAF_HTTP_ENDPOINT=http://localhost:8186/write
PROMETHEUS_TELEGRAF_ENDPOINT=http://localhost:9273
```
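
A quick check that the three variables are present can catch a typo before the application starts. The following is a minimal sketch, not part of Metrics Manager: `parse_env` and `missing_keys` are hypothetical helpers, and the parser only handles plain `KEY=VALUE` lines, not the full syntax that Docker Compose accepts.

```python
# Variable names taken from the settings listed in this step.
REQUIRED_KEYS = (
    "TELEGRAF_PORT",
    "TELEGRAF_HTTP_ENDPOINT",
    "PROMETHEUS_TELEGRAF_ENDPOINT",
)


def parse_env(text):
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        # Split on the first "=" only, so URLs containing "=" stay intact.
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(values, required=REQUIRED_KEYS):
    """Return the required keys that are absent or empty."""
    return [key for key in required if not values.get(key)]
```

For example, `missing_keys(parse_env(open(".env").read()))` returns an empty list when all three settings are defined.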

### Step 5: Start Telegraf

On your host machine, start Telegraf with the bundled config:

```bash
telegraf --config telegraf.conf
```

This exposes metrics on `http://localhost:9273/metrics` and listens for writes on `http://localhost:8186/write`.

### Step 6: Run the Application

```bash
uvicorn app.main:app --reload --port 9090
```

The service will start on `http://localhost:9090`.

### Step 7: Verify

```bash
curl http://localhost:9090/health
curl -s http://localhost:9273/metrics | head
```
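
To see which metrics Telegraf exposes without reading the raw output, the Prometheus text format can be reduced to a set of metric names. This is a rough sketch: `metric_names` is a hypothetical helper, it ignores escaping inside label values, and the metric names in the usage note are made up for illustration.

```python
def metric_names(exposition):
    """Collect metric names from Prometheus text-format output."""
    names = set()
    for line in exposition.splitlines():
        line = line.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        # A sample line looks like: name{label="value"} 1.0
        # The name ends at the first "{" or, without labels, the first space.
        names.add(line.split("{", 1)[0].split(" ", 1)[0])
    return names
```

Given the two sample lines `cpu_usage_idle{cpu="cpu0"} 98.5` and `mem_used 1024`, the function returns `{"cpu_usage_idle", "mem_used"}`.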

---

## Running Tests

### Via Docker (Recommended)

```bash
# Run all tests
docker compose --profile test run --rm metrics-manager-test

# Run with coverage
docker compose --profile test run --rm metrics-manager-test \
  python -m pytest tests/ -v --cov=app --cov-report=term-missing
```

### Locally

If you installed dependencies locally:

```bash
# Run all tests
pytest

# Run with verbose output
pytest -v

# Run with coverage
pytest --cov=app --cov-report=html
```

### Expected Output

```text
========================= test session starts ==========================
tests/test_settings.py::...              PASSED
tests/test_logging_config.py::...        PASSED
tests/test_main.py::...                  PASSED
tests/test_models.py::...                PASSED
tests/test_store.py::...                 PASSED
tests/test_routes.py::...                PASSED
tests/test_metrics.py::...               PASSED
tests/test_rate_limit.py::...            PASSED
tests/test_sse.py::...                   PASSED
tests/test_npu_monitor_tool.py::...      PASSED
tests/test_npu_reader.py::...            PASSED
tests/test_telegraf_integration.py::...  SKIPPED (requires Docker)
========================= 179 passed in 1.70s ==========================
```

---

## Development Setup

For local development with hot-reload and debugging:

```bash
# Install dev dependencies
uv sync --group test --group dev

# Run with auto-reload
uvicorn app.main:app --reload --port 9090

# Run linting and formatting
black app/
ruff check app/
```

## Building the Helm Chart

If you are deploying to Kubernetes:

```bash
# Lint the Helm chart
make helm-lint

# Generate the chart package
make helm-package

# Push to an OCI registry (requires authentication)
make helm-push
```

See [Helm Deployment](./deploy-with-helm.md) for full Kubernetes instructions.

## Customization

### Modifying Telegraf Configuration

The `telegraf.conf` file controls system metric collection. Customize it in any of these ways:

- Edit `telegraf.conf` in the metrics-manager root directory
- Mount a custom config: `TELEGRAF_CONFIG=./my-telegraf.conf docker compose up`
- Drop additional `.conf` files in the `telegraf.d/` directory

See [Environment Variables](./environment-variables.md) for Telegraf customization examples.

### Extending with Custom Inputs

Add Python or shell scripts to `/app/custom-metrics/`:

```bash
# Example: create a fan speed monitor
cat > /app/custom-metrics/fan_speed.sh << 'EOF'
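#!/bin/sh
# Read the fan speed; exit quietly if the sensor file is missing.
rpm=$(cat /sys/class/hwmon/hwmon0/fan1_input 2>/dev/null) || exit 0
# Emit one InfluxDB Line Protocol point with an integer field.
echo "fan_speed,sensor=cpu_fan rpm=${rpm}i"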
EOF
chmod +x /app/custom-metrics/fan_speed.sh
```

Each script runs every 10 seconds and must print its metrics to standard output in InfluxDB Line Protocol.
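
A Python script can produce the same output as the shell example. The sketch below is illustrative only: `escape_tag`, `format_field`, and `format_point` are hypothetical helpers, the sensor path is copied from the shell example and varies between machines, and it assumes a Python interpreter is available wherever the custom scripts run.

```python
from pathlib import Path

# Sensor path taken from the shell example; it varies between machines.
FAN_INPUT = Path("/sys/class/hwmon/hwmon0/fan1_input")


def escape_tag(value):
    """Escape the characters that are special in tag keys and values."""
    for char in (",", "=", " "):
        value = value.replace(char, "\\" + char)
    return value


def format_field(value):
    """Render a field value: integers get an `i` suffix, strings are quoted."""
    if isinstance(value, bool):  # bool is a subclass of int, so check it first
        return "true" if value else "false"
    if isinstance(value, int):
        return f"{value}i"
    if isinstance(value, float):
        return repr(value)
    escaped = str(value).replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def format_point(measurement, tags, fields):
    """Build one point as `measurement,tag=value field=value`."""
    tag_part = "".join(
        f",{escape_tag(key)}={escape_tag(str(val))}" for key, val in sorted(tags.items())
    )
    field_part = ",".join(
        f"{escape_tag(key)}={format_field(val)}" for key, val in fields.items()
    )
    return f"{measurement}{tag_part} {field_part}"


def main():
    try:
        rpm = int(FAN_INPUT.read_text().strip())
    except (OSError, ValueError):
        return  # no readable sensor: print nothing rather than a malformed point
    print(format_point("fan_speed", {"sensor": "cpu_fan"}, {"rpm": rpm}))


if __name__ == "__main__":
    main()
```

Building the line through a helper keeps the escaping and the integer `i` suffix in one place, which matters once tag values contain spaces or commas.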

## Troubleshooting Build Issues

| Error                       | Solution                                                                                       |
| --------------------------- | ---------------------------------------------------------------------------------------------- |
| `Rust toolchain not found`  | First Docker build compiles Rust. Try `docker compose build --no-cache`.                       |
| `Module not found: app`     | Ensure you're in the `metrics-manager/` directory and have run `uv sync` or `pip install -e .` |
| `Port 9090 already in use`  | Change port in `.env`: `HOST_METRICS_PORT=19090`                                               |
| `Telegraf not found`        | Install Telegraf: `apt-get install telegraf` (Debian/Ubuntu) or use Docker                     |
| `Permission denied on /sys` | Ensure you run with `--privileged` (Docker) or as root (local)                                 |
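
For the port conflict case, it can help to confirm that something is really listening before changing the configuration. This is a minimal sketch using only the standard library; `port_in_use` is a hypothetical helper, and it only probes the loopback address by default.

```python
import socket


def port_in_use(port, host="127.0.0.1"):
    """Return True if something is already listening on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        probe.settimeout(0.5)
        # connect_ex returns 0 when the connection succeeds, i.e. a listener exists.
        return probe.connect_ex((host, port)) == 0
```

If `port_in_use(9090)` returns `True`, either stop the other process or pick a different port as described in the table.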

## Supporting Resources

- [Get Started Guide](../get-started.md)
- [System Requirements](./system-requirements.md)
- [Testing Guide](./testing.md)
- [Environment Variables](./environment-variables.md)
- [Helm Deployment](./deploy-with-helm.md)
- [Troubleshooting](../troubleshooting.md)

## License

Copyright (C) 2025-2026 Intel Corporation

SPDX-License-Identifier: Apache-2.0