Mission Control

A monitoring dashboard for OpenClaw AI agents — built with Python/FastAPI and Next.js.

Why This Exists

When you run OpenClaw seriously (multiple agents, dozens of cron jobs, sub-agents spawning sub-agents, several Telegram groups plus WhatsApp, Slack, and Discord channels, 10+ models), information gets scattered fast.

The problem: there was no single place to answer the obvious questions:

  • Is my gateway actually running right now?
  • How much have I spent today, and which model is burning the most?
  • Which cron jobs ran, which failed, and when does the next one fire?
  • What sessions are active and how much context are they consuming?
  • Are my sub-agents doing useful work or spinning in circles?
  • What's the cost trend over the last 7 days — am I accelerating?

The solution: a single dashboard that collects everything in one place — gateway health, costs, cron status, active sessions, sub-agent runs, model usage — refreshed automatically, org-scoped, no login required for local development. Open a browser tab, get the full picture in seconds.

It's not trying to replace the OpenClaw CLI or Telegram interface. It's the at-a-glance overview layer that tells you whether everything is healthy and where your money and compute are going — so you can make decisions without hunting for data.

Features

6 Core Monitoring Panels

  1. 💰 Cost Cards & Breakdown — Today's cost, all-time cost, projected monthly, per-model cost breakdown with 7d/30d/all-time tabs. Know exactly which model is burning your budget.
  2. 💚 System Health — Gateway status (online/offline), PID, uptime, memory, compaction mode, CPU/RAM/swap/disk gauges. See at a glance whether your gateway is healthy.
  3. Cron Jobs — All scheduled jobs with status, schedule, last/next run, duration, model. Spot failures instantly and see when each job fires next.
  4. 📡 Active Sessions — Recent sessions with model, type badges (DM/group/cron/subagent), context % bars, token counts. See who's consuming what.
  5. 🤖 Sub-Agent Activity — Sub-agent runs with cost, duration, status + token breakdown (7d/30d tabs). Know whether sub-agents are productive or spinning.
  6. 📈 Cost Trends — Cost trend line over 7d/30d, model cost breakdown bars, acceleration indicators. Catch spending spikes before they hurt.
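
The cost figures in panels 1 and 6 can be derived from per-day totals. Here is a minimal sketch, assuming daily costs are available as a date-to-USD mapping. The function names, the 7-day-average projection, and the definition of "accelerating" (last 7 days versus the 7 days before) are illustrative assumptions, not necessarily what Mission Control computes.

```python
from datetime import date, timedelta


def _window_total(daily: dict[date, float], today: date, start: int, stop: int) -> float:
    """Sum costs for the days `start`..`stop - 1` days before `today`."""
    return sum(daily.get(today - timedelta(days=i), 0.0) for i in range(start, stop))


def summarize_costs(daily: dict[date, float], today: date) -> dict[str, float]:
    """Return today's cost, all-time cost, and a naive monthly projection."""
    avg_last_7 = _window_total(daily, today, 0, 7) / 7
    return {
        "today": daily.get(today, 0.0),
        "all_time": sum(daily.values()),
        # Assumption: project a month as 30 days at the recent daily average.
        "projected_monthly": avg_last_7 * 30,
    }


def is_accelerating(daily: dict[date, float], today: date) -> bool:
    """Assumption: spending accelerates when the last 7 days beat the prior 7."""
    return _window_total(daily, today, 0, 7) > _window_total(daily, today, 7, 14)
```

Missing days count as zero spend, so a gap in collection lowers the projection rather than raising an error.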

Architecture

  • Backend: Python/FastAPI + PostgreSQL + Redis
  • Frontend: Next.js 16 + React 19 + Tailwind CSS + shadcn/ui
  • Data collection: Background gateway collector polling OpenClaw RPC endpoints
  • Real-time: WebSocket endpoint for live agent events
  • Data processing: Pure Python functions ported from Go dashboard logic (model name normalization, daily chart aggregation, alert computation, token formatting)
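
To give a feel for what "pure Python functions" means here, the sketch below shows two small transforms of the kind listed above. The exact rules (stripping a provider prefix and a trailing date stamp, the `k`/`M` suffixes) are assumptions made for illustration; the real logic lives in data_processing.py and may differ.

```python
import re


def normalize_model_name(raw: str) -> str:
    """Lowercase a model id, dropping a provider prefix and trailing date stamp."""
    name = raw.strip().lower()
    name = name.rsplit("/", 1)[-1]       # "provider/model-x" -> "model-x"
    name = re.sub(r"-\d{8}$", "", name)  # drop a trailing -YYYYMMDD stamp
    return name


def format_tokens(count: int) -> str:
    """Render a token count compactly, e.g. for a dashboard badge."""
    if count >= 1_000_000:
        return f"{count / 1_000_000:.1f}M"
    if count >= 1_000:
        return f"{count / 1_000:.1f}k"
    return str(count)
```

Because neither function touches the database or the network, both can be unit-tested in isolation, which is the main benefit of keeping the processing layer pure.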

API Endpoints

Monitoring Summary Endpoints

| Endpoint | Method | Description |
| --- | --- | --- |
| /api/v1/monitoring/cost-summary | GET | Today's cost, all-time cost, projected monthly |
| /api/v1/monitoring/cost-breakdown | GET | Per-model cost breakdown (7d/30d/all) |
| /api/v1/monitoring/health-summary | GET | Gateway status, system metrics, health gauges |
| /api/v1/monitoring/cron-summary | GET | Cron job statuses, schedules, run history |
| /api/v1/monitoring/sessions-summary | GET | Active sessions with model, context %, tokens |
| /api/v1/monitoring/sub-agents-summary | GET | Sub-agent runs with cost, duration, status |
| /api/v1/monitoring/trends | GET | Cost trends, model breakdown (7d/30d) |

Monitoring CRUD Endpoints

| Endpoint | Method | Description |
| --- | --- | --- |
| /api/v1/monitoring/cost-snapshots | GET | Paginated cost snapshot records |
| /api/v1/monitoring/cron-jobs | GET | Paginated cron job status records |
| /api/v1/monitoring/sessions | GET | Paginated session event records |
| /api/v1/monitoring/health | GET | Paginated system health metrics |
| /api/v1/monitoring/sub-agents | GET | Paginated sub-agent run records |
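
All monitoring endpoints are plain GET requests, so they are easy to script against. The sketch below builds endpoint URLs and fetches JSON using only the standard library. The base URL assumes a local backend on port 8080, and the `limit`/`offset` query parameters are hypothetical names used for illustration; check the generated API schema for the real pagination parameters.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

BASE_URL = "http://localhost:8080"  # assumption: local backend


def monitoring_url(resource: str, **params: object) -> str:
    """Build a URL under /api/v1/monitoring/, with optional query parameters."""
    query = f"?{urlencode(params)}" if params else ""
    return f"{BASE_URL}/api/v1/monitoring/{resource}{query}"


def fetch_json(url: str, timeout: float = 5.0) -> object:
    """GET a URL and decode the JSON body (no auth header is sent here)."""
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)
```

With the stack running, `fetch_json(monitoring_url("cost-summary"))` would return the cost summary, and `monitoring_url("cron-jobs", limit=50, offset=100)` shows how a paginated request might be addressed.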

WebSocket

| Endpoint | Description |
| --- | --- |
| /ws/agents | Real-time agent events (initial snapshot + polling) |
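
The "initial snapshot + polling" pattern can be sketched independently of any WebSocket library: send the full state once, then poll and emit only when something changed. The generator below is an illustration of that pattern, not the project's handler; the message shape (`type`, `agents`) and the `fetch` callable are assumptions.

```python
import asyncio
from collections.abc import AsyncIterator, Awaitable, Callable
from typing import Any, Optional


async def agent_event_stream(
    fetch: Callable[[], Awaitable[Any]],
    interval: float,
    max_polls: Optional[int] = None,
) -> AsyncIterator[dict]:
    """Yield one snapshot message, then an update whenever polled state changes."""
    previous = await fetch()
    yield {"type": "snapshot", "agents": previous}

    polls = 0
    while max_polls is None or polls < max_polls:
        await asyncio.sleep(interval)
        current = await fetch()
        if current != previous:  # stay quiet when nothing changed
            yield {"type": "update", "agents": current}
            previous = current
        polls += 1
```

A WebSocket handler would iterate this generator and send each message to the client; `max_polls` exists only so the loop can be bounded in tests.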

Gateway RPC Integration

The collector service polls these OpenClaw gateway RPC endpoints:

  • usage.cost + usage.status → Cost snapshots
  • cron.list → Cron job status
  • sessions.list + sessions.preview → Session events
  • health + status → System health metrics
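
The mapping above translates naturally into a lookup table that a collector can iterate over. This is a sketch under assumptions: `RPC_SOURCES`, `collect_once`, and the `call_rpc` callable are hypothetical names, and the real gateway_collector.py may be organized differently.

```python
from typing import Any, Callable

# RPC methods polled for each kind of monitoring record (from the list above).
RPC_SOURCES: dict[str, tuple[str, ...]] = {
    "cost": ("usage.cost", "usage.status"),
    "cron": ("cron.list",),
    "session": ("sessions.list", "sessions.preview"),
    "health": ("health", "status"),
}


def collect_once(kind: str, call_rpc: Callable[[str], Any]) -> dict[str, Any]:
    """Call every RPC method for one kind and return the results keyed by method."""
    return {method: call_rpc(method) for method in RPC_SOURCES[kind]}
```

Passing the transport in as `call_rpc` keeps the collection step testable with a fake gateway.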

Quick Start

git clone https://forgejo/null/Mission-Control.git
cd Mission-Control
cp .env.example .env
docker compose up -d

The backend runs on port 8080 and the frontend on port 3037.

Environment Variables

| Variable | Default | Description |
| --- | --- | --- |
| COLLECTION_INTERVAL_COST | 300 | Seconds between cost collection |
| COLLECTION_INTERVAL_CRON | 60 | Seconds between cron collection |
| COLLECTION_INTERVAL_SESSION | 30 | Seconds between session collection |
| COLLECTION_INTERVAL_HEALTH | 60 | Seconds between health collection |
| LOCAL_AUTH_TOKEN | | Token for local dev auth |
| POSTGRES_HOST | db | PostgreSQL host |
| POSTGRES_PORT | 5432 | PostgreSQL port |
| POSTGRES_DB | mission_control | Database name |
| REDIS_URL | redis://redis:6379/0 | Redis connection URL |
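
As an illustration of how the collection intervals could be read, the sketch below loads the four `COLLECTION_INTERVAL_*` variables and falls back to the defaults in the table. The helper name `load_intervals` is hypothetical; the backend may use a settings class instead.

```python
import os
from collections.abc import Mapping

# Defaults in seconds, copied from the table above.
DEFAULT_INTERVALS = {"COST": 300, "CRON": 60, "SESSION": 30, "HEALTH": 60}


def load_intervals(env: Mapping[str, str] = os.environ) -> dict[str, int]:
    """Return collection intervals in seconds, using defaults for unset values."""
    intervals: dict[str, int] = {}
    for kind, default in DEFAULT_INTERVALS.items():
        raw = env.get(f"COLLECTION_INTERVAL_{kind}", "")
        intervals[kind.lower()] = int(raw) if raw.strip() else default
    return intervals
```

For example, setting `COLLECTION_INTERVAL_SESSION=10` in .env would make sessions refresh every 10 seconds while the other collectors keep their defaults.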

Project Structure

Mission-Control/
├── src/
│   ├── backend/
│   │   ├── app/
│   │   │   ├── api/           # API routes (monitoring, ws, gateways, etc.)
│   │   │   ├── models/        # SQLModel database models
│   │   │   ├── schemas/       # Pydantic request/response schemas
│   │   │   ├── services/
│   │   │   │   └── monitoring/
│   │   │   │       ├── gateway_collector.py  # Background RPC collector
│   │   │   │       ├── data_processing.py    # Dashboard data transforms
│   │   │   │       ├── event_parser.py       # Session event parser
│   │   │   │       └── models.py            # Pydantic RPC response models
│   │   │   └── main.py        # FastAPI app + lifespan
│   │   ├── migrations/        # Alembic migrations
│   │   └── tests/
│   └── frontend/              # Next.js app
│       └── src/
│           ├── app/            # Next.js App Router pages
│           ├── components/     # React components
│           ├── api/            # Generated API clients
│           └── lib/            # Utilities
├── sources/                    # Reference repos (Go, Node)
├── docker-compose.yml
├── Dockerfile
└── PROJECT.md                  # Full 4-phase implementation plan

Data Collection Flow

OpenClaw Gateway
       │
       │ RPC (usage.cost, cron.list, sessions.list, health, status)
       ▼
GatewayCollectorService (background asyncio task)
       │
       │ Upsert into PostgreSQL
       ▼
Monitoring Models (CostSnapshot, CronJobStatus, SessionEvent, SubAgentRun, SystemHealthMetric)
       │
       │ API endpoints + data_processing transforms
       ▼
Dashboard Frontend (Next.js)
       │
       │ WebSocket for real-time events
       ▼
Live Agent Activity Panel
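
The "Upsert into PostgreSQL" step means each poll inserts a row or updates the existing one, so repeated collection does not create duplicates. The sketch below demonstrates the idea with an in-memory SQLite database as a stand-in, since SQLite accepts the same `INSERT ... ON CONFLICT ... DO UPDATE` form that PostgreSQL does. The table and column names are hypothetical, and the real backend goes through SQLModel rather than raw SQL.

```python
import sqlite3

UPSERT_SQL = """
INSERT INTO cron_job_status (job_id, status, last_run)
VALUES (:job_id, :status, :last_run)
ON CONFLICT (job_id) DO UPDATE SET
    status = excluded.status,
    last_run = excluded.last_run
"""


def upsert_cron_job(conn: sqlite3.Connection, row: dict[str, str]) -> None:
    """Insert a cron job status row, or overwrite the row with the same job_id."""
    with conn:  # commits on success, rolls back on error
        conn.execute(UPSERT_SQL, row)


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE cron_job_status (job_id TEXT PRIMARY KEY, status TEXT, last_run TEXT)"
)

# Two polls of the same job: the second replaces the first instead of duplicating it.
upsert_cron_job(conn, {"job_id": "nightly-report", "status": "ok", "last_run": "2024-01-01T00:00:00Z"})
upsert_cron_job(conn, {"job_id": "nightly-report", "status": "failed", "last_run": "2024-01-02T00:00:00Z"})

rows = conn.execute("SELECT job_id, status, last_run FROM cron_job_status").fetchall()
```

After both calls the table holds a single row for `nightly-report`, carrying the status and timestamp from the most recent poll.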

Source Repos

Mission Control ports functionality from two OpenClaw dashboard projects, one written in Go and one in Node. Reference copies of both live under sources/.

Key decision: No new backend languages. Go and Node functionality is ported to Python/FastAPI within Mission Control's backend. We reuse the gateway RPC transport and data model shapes, but port all processing/aggregation logic as pure Python functions.

Development

# Backend
cd src/backend
pip install -r requirements.txt
uvicorn app.main:app --reload --port 8080

# Frontend
cd src/frontend
npm install
npm run dev

License

MIT