diff --git a/README.md b/README.md
new file mode 100644
index 0000000..5aacbf4
--- /dev/null
+++ b/README.md
@@ -0,0 +1,184 @@
+# Mission Control
+
+A monitoring dashboard for [OpenClaw](https://github.com/openclaw/openclaw) AI agents — built with Python/FastAPI and Next.js.
+
+## Why This Exists
+
+When you run OpenClaw seriously — multiple agents, dozens of cron jobs, sub-agents spawning sub-agents, several Telegram groups and WhatsApp, Slack, and Discord channels, 10+ models — information gets scattered fast.
+
+**The problem:** there was no single place to answer the obvious questions:
+
+- **Is my gateway actually running right now?**
+- **How much have I spent today, and which model is burning the most?**
+- **Which cron jobs ran, which failed, and when does the next one fire?**
+- **What sessions are active and how much context are they consuming?**
+- **Are my sub-agents doing useful work or spinning in circles?**
+- **What's the cost trend over the last 7 days — am I accelerating?**
+
+**The solution:** a single dashboard that collects everything in one place — gateway health, costs, cron status, active sessions, sub-agent runs, model usage — refreshed automatically, org-scoped, no login required for local development. Open a browser tab, get the full picture in seconds.
+
+It's not trying to replace the OpenClaw CLI or Telegram interface. It's the at-a-glance overview layer that tells you whether everything is healthy and where your money and compute are going — so you can make decisions without hunting for data.
+
+## Features
+
+### 6 Core Monitoring Panels
+
+1. **💰 Cost Cards & Breakdown** — Today's cost, all-time cost, projected monthly, per-model cost breakdown with 7d/30d/all-time tabs. Know exactly which model is burning your budget.
+2. **💚 System Health** — Gateway status (online/offline), PID, uptime, memory, compaction mode, CPU/RAM/swap/disk gauges. See at a glance whether your gateway is healthy.
+3. **⏰ Cron Jobs** — All scheduled jobs with status, schedule, last/next run, duration, model. Spot failures instantly and see when the next fire is.
+4. **📡 Active Sessions** — Recent sessions with model, type badges (DM/group/cron/subagent), context % bars, token counts. See who's consuming what.
+5. **🤖 Sub-Agent Activity** — Sub-agent runs with cost, duration, status + token breakdown (7d/30d tabs). Know whether sub-agents are productive or spinning.
+6. **📈 Cost Trends** — Cost trend line over 7d/30d, model cost breakdown bars, acceleration indicators. Catch spending spikes before they hurt.
+
+### Architecture
+
+- **Backend:** Python/FastAPI + PostgreSQL + Redis
+- **Frontend:** Next.js 16 + React 19 + Tailwind CSS + shadcn/ui
+- **Data collection:** Background gateway collector polling OpenClaw RPC endpoints
+- **Real-time:** WebSocket endpoint for live agent events
+- **Data processing:** Pure Python functions ported from Go dashboard logic (model name normalization, daily chart aggregation, alert computation, token formatting)
+
+### API Endpoints
+
+#### Monitoring Summary Endpoints
+
+| Endpoint | Method | Description |
+|----------|--------|-------------|
+| `/api/v1/monitoring/cost-summary` | GET | Today's cost, all-time cost, projected monthly |
+| `/api/v1/monitoring/cost-breakdown` | GET | Per-model cost breakdown (7d/30d/all) |
+| `/api/v1/monitoring/health-summary` | GET | Gateway status, system metrics, health gauges |
+| `/api/v1/monitoring/cron-summary` | GET | Cron job statuses, schedules, run history |
+| `/api/v1/monitoring/sessions-summary` | GET | Active sessions with model, context %, tokens |
+| `/api/v1/monitoring/sub-agents-summary` | GET | Sub-agent runs with cost, duration, status |
+| `/api/v1/monitoring/trends` | GET | Cost trends, model breakdown (7d/30d) |
+
+#### Monitoring CRUD Endpoints
+
+| Endpoint | Method | Description |
+|----------|--------|-------------|
+| `/api/v1/monitoring/cost-snapshots` | GET | Paginated cost snapshot records |
+| `/api/v1/monitoring/cron-jobs` | GET | Paginated cron job status records |
+| `/api/v1/monitoring/sessions` | GET | Paginated session event records |
+| `/api/v1/monitoring/health` | GET | Paginated system health metrics |
+| `/api/v1/monitoring/sub-agents` | GET | Paginated sub-agent run records |
+
+#### WebSocket
+
+| Endpoint | Description |
+|----------|-------------|
+| `/ws/agents` | Real-time agent events (initial snapshot + polling) |
+
+#### Gateway RPC Integration
+
+The collector service polls these OpenClaw gateway RPC endpoints:
+- `usage.cost` + `usage.status` → Cost snapshots
+- `cron.list` → Cron job status
+- `sessions.list` + `sessions.preview` → Session events
+- `health` + `status` → System health metrics
+
+## Quick Start
+
+### Docker Compose (Recommended)
+
+```bash
+git clone https://forgejo/null/Mission-Control.git
+cd Mission-Control
+cp .env.example .env
+docker compose up -d
+```
+
+The backend runs on port 8080, frontend on port 3037.
+
+### Environment Variables
+
+| Variable | Default | Description |
+|----------|---------|-------------|
+| `COLLECTION_INTERVAL_COST` | 300 | Seconds between cost collection |
+| `COLLECTION_INTERVAL_CRON` | 60 | Seconds between cron collection |
+| `COLLECTION_INTERVAL_SESSION` | 30 | Seconds between session collection |
+| `COLLECTION_INTERVAL_HEALTH` | 60 | Seconds between health collection |
+| `LOCAL_AUTH_TOKEN` | — | Token for local dev auth |
+| `POSTGRES_HOST` | db | PostgreSQL host |
+| `POSTGRES_PORT` | 5432 | PostgreSQL port |
+| `POSTGRES_DB` | mission_control | Database name |
+| `REDIS_URL` | redis://redis:6379/0 | Redis connection URL |
+
+## Project Structure
+
+```
+Mission-Control/
+├── src/
+│   ├── backend/
+│   │   ├── app/
+│   │   │   ├── api/                  # API routes (monitoring, ws, gateways, etc.)
+│   │   │   ├── models/               # SQLModel database models
+│   │   │   ├── schemas/              # Pydantic request/response schemas
+│   │   │   ├── services/
+│   │   │   │   └── monitoring/
+│   │   │   │       ├── gateway_collector.py  # Background RPC collector
+│   │   │   │       ├── data_processing.py    # Dashboard data transforms
+│   │   │   │       ├── event_parser.py       # Session event parser
+│   │   │   │       └── models.py             # Pydantic RPC response models
+│   │   │   └── main.py               # FastAPI app + lifespan
+│   │   ├── migrations/               # Alembic migrations
+│   │   └── tests/
+│   └── frontend/                     # Next.js app
+│       └── src/
+│           ├── app/                  # Next.js App Router pages
+│           ├── components/           # React components
+│           ├── api/                  # Generated API clients
+│           └── lib/                  # Utilities
+├── sources/                          # Reference repos (Go, Node)
+├── docker-compose.yml
+├── Dockerfile
+└── PROJECT.md                        # Full 4-phase implementation plan
+```
+
+## Data Collection Flow
+
+```
+OpenClaw Gateway
+    │
+    │  RPC (usage.cost, cron.list, sessions.list, health, status)
+    ▼
+GatewayCollectorService (background asyncio task)
+    │
+    │  Upsert into PostgreSQL
+    ▼
+Monitoring Models (CostSnapshot, CronJobStatus, SessionEvent, SubAgentRun, SystemHealthMetric)
+    │
+    │  API endpoints + data_processing transforms
+    ▼
+Dashboard Frontend (Next.js)
+    │
+    │  WebSocket for real-time events
+    ▼
+Live Agent Activity Panel
+```
+
+## Source Repos
+
+Mission Control ports functionality from two OpenClaw dashboard projects:
+
+- **[openclaw-dashboard](https://github.com/mudrii/openclaw-dashboard)** (Go) — Dashboard panels, data processing logic, alert computation
+- **[openclaw-pixel-agents-dashboard](https://github.com/jaffer1979/openclaw-pixel-agents-dashboard)** (Node/Express) — Pixel agent visualization, session watching, event parsing
+
+**Key decision:** No new backend languages. Go and Node functionality ports to Python/FastAPI within Mission Control's backend. We reuse the gateway RPC transport and data model shapes, but port all processing/aggregation logic as pure Python functions.
+
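To make the "pure Python functions" idea in the key decision above concrete, here is a minimal sketch of two such helpers: token formatting and daily cost aggregation. It is an illustration only and is not part of the patch. The function names, the `k`/`M` suffix format, and the `(timestamp, cost)` record shape are all assumptions and are not taken from the Mission Control source.

```python
from collections import defaultdict
from datetime import datetime, timezone
from typing import Dict, Iterable, Tuple


def format_tokens(count: int) -> str:
    """Render a token count compactly (hypothetical format: 950, 1.5k, 2.3M)."""
    if count < 0:
        raise ValueError("token count must be non-negative")
    if count < 1_000:
        return str(count)
    value = count / 1_000
    for suffix in ("k", "M"):
        # Keep this suffix only if the rounded value stays below 1000;
        # otherwise move up a unit so 999_999 becomes "1.0M", not "1000.0k".
        if suffix == "M" or round(value, 1) < 1_000:
            return f"{value:.1f}{suffix}"
        value /= 1_000
    raise AssertionError("unreachable: the 'M' branch always returns")


def aggregate_daily_cost(
    records: Iterable[Tuple[datetime, float]],
) -> Dict[str, float]:
    """Sum (timestamp, cost) pairs into per-day totals keyed by UTC ISO date.

    Assumes timezone-aware timestamps; the record shape is hypothetical.
    """
    totals = defaultdict(float)
    for timestamp, cost in records:
        day = timestamp.astimezone(timezone.utc).date().isoformat()
        totals[day] += cost
    # Sort by date so a chart can consume the result in order.
    return dict(sorted(totals.items()))
```

Because neither function touches the database or the gateway, both can be unit-tested with plain values. For instance, `format_tokens(1500)` returns `'1.5k'`.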
+## Development
+
+```bash
+# Backend
+cd src/backend
+pip install -r requirements.txt
+uvicorn app.main:app --reload --port 8080
+
+# Frontend
+cd src/frontend
+npm install
+npm run dev
+```
+
+## License
+
+MIT
\ No newline at end of file