🎯 Mission Control — Project Plan
Merge three OpenClaw dashboards into a single, unified Mission Control platform.
Source Repos
| Repo | Purpose | Stack | Key Assets |
|---|---|---|---|
| abhi1693/openclaw-mission-control | Base platform — work orchestration, governance, gateway management | Python/FastAPI + PostgreSQL + Redis + Next.js (React 19) + Clerk auth + Docker Compose | Organizations, boards, tasks, tags, approvals, agents, gateways, webhooks, activity feed, skills marketplace |
| mudrii/openclaw-dashboard | Tracking layer — real-time metrics, costs, crons, sessions, system health | Go binary (zero deps) + embedded HTML/JS + SVG charts | Cost cards, cron status, session tracking, sub-agent activity, AI chat, system metrics (CPU/RAM/disk), 6 themes, alerts, token usage |
| jaffer1979/openclaw-pixel-agents-dashboard | Agent visualization — pixel-art agent sprites, real-time activity | Node/Express + Vite + React 19 + Canvas/WebSocket + JSONL parsing | Agent sprites with activity bubbles, conversation heat, spawn sub-agents, hardware monitor, service controls, day/night cycle |
Architecture Decision: What to Merge Into What
Base: openclaw-mission-control — this becomes the foundation because:
- It has the richest data model (organizations, boards, tasks, approvals, agents, gateways, webhooks)
- It has proper auth (Clerk or local bearer token)
- It has a full API layer (FastAPI with SQLModel/SQLAlchemy)
- It has multi-tenancy built in
- It has the most mature frontend (Next.js 16 + React 19 + TanStack Query + Recharts)
Merge FROM dashboard — extract the tracking/monitoring features:
- Cost tracking, token usage, model breakdown
- Cron job status, scheduling, last/next run
- Session tracking, sub-agent activity
- System health (CPU, RAM, disk, gateway status)
- AI chat panel (ask questions about your data)
- Alert system (high cost, failed crons, context usage)
- 6 themes + glass morphism UI
Merge FROM pixel-agents — extract the agent visualization:
- Pixel-art agent sprites in a shared office scene
- Real-time activity bubbles, conversation heat
- Sub-agent spawning from the UI
- Hardware monitor (CPU/GPU/RAM/disk/network)
- Service controls (start/stop/restart gateway)
- Day/night cycle ambient lighting
Technical Analysis
Base Platform (openclaw-mission-control)
Backend:
- Python 3.12+, FastAPI, SQLModel/SQLAlchemy, PostgreSQL, Redis
- Alembic migrations, RQ worker for webhooks
- Full OpenClaw gateway integration via WebSocket RPC (device pairing, control UI)
- Gateway methods: 60+ RPC calls for sessions, agents, cron, config, exec approvals, etc.
- Auth: Clerk JWT or local bearer token (≥50 chars)
Frontend:
- Next.js 16.1.7, React 19.2, TanStack Query v5, TanStack Table v8
- Radix UI primitives, Tailwind CSS, Recharts, React Markdown
- 40+ page routes (dashboard, boards, agents, approvals, gateways, skills, tags, etc.)
- Cypress E2E tests
Data Model (27 tables):
- Organizations, users, boards, board_groups, tasks, tags, approvals
- Agents, gateways, activity_events, board_webhooks, skills
- Custom fields, task dependencies, task fingerprints
- Board memory, board group memory, onboarding
What it LACKS that the others have:
- No real-time cost/token tracking
- No system health monitoring (CPU/RAM/disk)
- No cron job visualization
- No session/sub-agent activity monitoring
- No AI chat for asking about your deployment
- No pixel-art agent visualization
- No hardware monitoring
- No service controls (start/stop/restart gateway)
Dashboard (openclaw-dashboard) — What We Pull
Data Collection (Go):
- refresh.go: main collector, reads OpenClaw filesystem + gateway API
- refresh_sessions.go: session listing, model resolution
- refresh_tokens.go: token usage tracking
- cron_state: cron job parsing and status
- system.go: CPU, RAM, swap, disk, gateway runtime probes
API Endpoints:
- /api/refresh: stale-while-revalidate data.json
- /api/chat: AI chat via OpenClaw gateway
- /api/system: live host metrics
- /api/logs: merged log tail
- /api/errors: aggregated error feed
Frontend:
- Pure HTML/CSS/JS (single index.html); we'll rewrite it as React components
- State management: 7 plain objects (State, DataLayer, DirtyChecker, Renderer, Theme, Chat, App)
- SVG chart rendering (cost trends, model breakdown, sub-agent activity)
- 6 themes with 19 CSS color variables each
Integration Approach:
- Port the Go data collection to Python services that hit the OpenClaw gateway API
- Replace the embedded HTML frontend with React components in the Next.js app
- Use the existing gateway RPC connection in Mission Control's backend
- Add PostgreSQL models for tracking data (cost snapshots, cron states, session events)
Pixel Agents (openclaw-pixel-agents-dashboard) — What We Pull
Backend (Node/Express):
- sessionWatcher.ts: tails JSONL session files, parses events
- spawner.ts: spawns sub-agents via gateway API
- services.ts: gateway service controls (start/stop/restart)
- hardware.ts: hardware stats collection
- openclawParser.ts: JSONL event parsing
- WebSocket broadcasting to frontend
Frontend (React/Vite):
- Pixel-art canvas renderer (OfficeCanvas.tsx, game loop, character sprites)
- Activity bubbles, conversation heat overlays
- Spawn chat panel, session info panel
- Server rack (hardware monitor), breaker panel (service controls)
- Ham radio (update checker), fire alarm (gateway restart)
Integration Approach:
- Port JSONL session watcher to Python (watch OpenClaw session directory)
- Move sub-agent spawning to use Mission Control's existing gateway RPC
- Rebuild the pixel-art canvas as a React component within Next.js
- Add WebSocket support to FastAPI for real-time agent events
- Hardware stats collected via the gateway's health and status methods
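The real-time leg of this approach needs some way to fan parsed events out to every connected browser. One possible shape, sketched with plain asyncio queues so it runs without FastAPI, is shown below. `EventHub`, its methods, and the drop-oldest policy for slow clients are all assumptions for illustration; in the real app each WebSocket handler would presumably own one subscribed queue and forward its items to the socket.

```python
import asyncio
from typing import Any, List, Set, Tuple


class EventHub:
    """Fan out each published event to every subscribed client queue (hypothetical helper)."""

    def __init__(self, max_queue: int = 100) -> None:
        self._max_queue = max_queue
        self._clients: Set[asyncio.Queue] = set()

    def subscribe(self) -> asyncio.Queue:
        """Register a new client and return the queue it should read from."""
        queue: asyncio.Queue = asyncio.Queue(maxsize=self._max_queue)
        self._clients.add(queue)
        return queue

    def unsubscribe(self, queue: asyncio.Queue) -> None:
        """Forget a client, e.g. when its WebSocket disconnects."""
        self._clients.discard(queue)

    def publish(self, event: Any) -> None:
        """Deliver an event to all clients without ever blocking the producer."""
        for queue in list(self._clients):
            if queue.full():
                queue.get_nowait()  # drop the oldest event for a slow client
            queue.put_nowait(event)


async def demo() -> Tuple[List[Any], List[Any]]:
    hub = EventHub(max_queue=2)
    fast, slow = hub.subscribe(), hub.subscribe()
    hub.publish({"n": 1})
    received = [await fast.get()]          # the fast client keeps up
    hub.publish({"n": 2})
    hub.publish({"n": 3})
    received += [fast.get_nowait(), fast.get_nowait()]
    backlog = [slow.get_nowait(), slow.get_nowait()]  # the slow client lost event 1
    return received, backlog


received, backlog = asyncio.run(demo())
```

Dropping the oldest event keeps a stalled browser tab from growing memory on the server; for animation-style updates, losing an old frame is usually less harmful than delaying new ones.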
Implementation Plan
Phase 1: Foundation Setup (Week 1)
1.1 — Fork and Stand Up Base
- Fork abhi1693/openclaw-mission-control to our org
- Stand up local dev environment (Docker Compose: Postgres + Redis + backend + frontend)
- Verify all existing features work: auth, boards, tasks, agents, gateways, approvals
- Document the data model and API surface
1.2 — Add Tracking Models (Backend)
- Create new PostgreSQL models:
  - CostSnapshot: daily cost tracking per model/gateway
  - CronJobStatus: cron schedule, last/next run, duration, status
  - SessionEvent: session start/stop, model, tokens, context %
  - SubAgentRun: sub-agent spawn, cost, duration, status
  - SystemHealthMetric: CPU, RAM, disk, swap, gateway uptime
  - AlertRule: configurable alert thresholds
- Create Alembic migration
- Add CRUD API endpoints under /api/monitoring/
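As a rough sketch of what two of these records might hold, the dataclasses below stand in for the eventual SQLModel tables so that the example runs without dependencies. Every field name here is an assumption for illustration; the plan only fixes the model names and the kind of data each one tracks.

```python
from dataclasses import dataclass, field
from datetime import date, datetime, timezone
from decimal import Decimal
from typing import Optional


@dataclass
class CostSnapshot:
    """One day's cost for a single model on a single gateway (hypothetical fields)."""
    gateway_id: str
    model: str
    day: date
    cost_usd: Decimal          # Decimal avoids float rounding when summing money
    input_tokens: int = 0
    output_tokens: int = 0


@dataclass
class CronJobStatus:
    """Latest known state of one cron job (hypothetical fields)."""
    gateway_id: str
    job_id: str
    schedule: str                              # cron expression as reported by the gateway
    status: str = "unknown"                    # e.g. "ok", "failed", "running"
    last_run_at: Optional[datetime] = None
    next_run_at: Optional[datetime] = None
    last_duration_ms: Optional[int] = None
    collected_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)  # always timezone-aware
    )


# Illustrative values only; identifiers and amounts are made up.
snapshot = CostSnapshot("gw-1", "example-model", date(2025, 1, 1), Decimal("1.25"))
job = CronJobStatus("gw-1", "nightly-report", "0 3 * * *")
```

Keying both records by a gateway identifier reflects the plan's intent to scope monitoring data to organization + gateway.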
1.3 — Gateway Data Collection Service
- Create app/services/monitoring/gateway_collector.py
- Reuse existing gateway_rpc.py to poll:
  - usage.cost: cost data
  - usage.status: token counts
  - cron.list / cron.status: cron jobs
  - sessions.list / sessions.preview: sessions
  - agents.list: agents
  - health: gateway health
  - status: gateway runtime status
- Run as background task (asyncio) with configurable intervals
- Store collected data in the new models
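The collector described above could be structured as one asyncio task per RPC method, each with its own interval. The sketch below assumes two injected callables, `call` (invoke a gateway method) and `store` (persist the payload); both signatures are hypothetical and do not reflect the actual gateway_rpc.py API.

```python
import asyncio
from typing import Any, Awaitable, Callable, Dict, List, Tuple

# Hypothetical signatures, not the real gateway_rpc.py interface.
RpcCall = Callable[[str], Awaitable[Any]]
Store = Callable[[str, Any], Awaitable[None]]


async def poll_method(call: RpcCall, store: Store, method: str,
                      interval_s: float, stop: asyncio.Event) -> None:
    """Poll one RPC method on a fixed interval until `stop` is set."""
    while not stop.is_set():
        try:
            await store(method, await call(method))
        except Exception as exc:  # one failed poll must not kill the loop
            print(f"poll {method} failed: {exc!r}")
        try:
            # Sleep for the interval, but wake immediately on shutdown.
            await asyncio.wait_for(stop.wait(), timeout=interval_s)
        except asyncio.TimeoutError:
            pass


async def run_collector(call: RpcCall, store: Store,
                        intervals: Dict[str, float], stop: asyncio.Event) -> None:
    """Run one polling task per method, each with its own interval."""
    await asyncio.gather(
        *(poll_method(call, store, method, secs, stop) for method, secs in intervals.items())
    )


async def demo() -> List[Tuple[str, Any]]:
    collected: List[Tuple[str, Any]] = []

    async def fake_call(method: str) -> Any:
        return {"method": method}          # stands in for a gateway response

    async def fake_store(method: str, payload: Any) -> None:
        collected.append((method, payload))

    stop = asyncio.Event()
    task = asyncio.create_task(
        run_collector(fake_call, fake_store, {"usage.cost": 0.01, "health": 0.01}, stop)
    )
    await asyncio.sleep(0.05)
    stop.set()                             # request a clean shutdown
    await task
    return collected


collected = asyncio.run(demo())
```

Waiting on the stop event instead of calling `asyncio.sleep` directly means shutdown does not have to wait out a long polling interval.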
Phase 2: Tracking Dashboard (Week 2)
2.1 — Monitoring Pages (Frontend)
- New Next.js routes:
  - /monitoring: main dashboard (cost cards, system health, alerts)
  - /monitoring/costs: detailed cost breakdown with charts
  - /monitoring/sessions: active sessions, sub-agent activity
  - /monitoring/crons: cron job management
  - /monitoring/system: CPU/RAM/disk/gateway health
2.2 — Cost Tracking UI
- Port dashboard's cost cards and donut chart to React/Recharts
- Today's cost, all-time cost, projected monthly
- Per-model cost breakdown (7d/30d/all-time tabs)
- Cost trend line chart (SVG → Recharts)
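The plan does not say how "projected monthly" is computed. One simple option, shown here purely as an assumption, is a linear projection: average daily spend so far this month multiplied by the number of days in the month. The function name and input shape are hypothetical.

```python
import calendar
from datetime import date
from decimal import Decimal
from typing import Mapping


def project_monthly_cost(daily_costs: Mapping[date, Decimal], today: date) -> Decimal:
    """Linear projection: average daily spend so far this month times days in the month."""
    month_costs = [
        cost for day, cost in daily_costs.items()
        if day.year == today.year and day.month == today.month and day <= today
    ]
    if not month_costs:
        return Decimal("0")
    days_in_month = calendar.monthrange(today.year, today.month)[1]
    spent = sum(month_costs, Decimal("0"))
    # Divide by elapsed days, not by days with data, so gaps count as zero spend.
    return (spent / today.day * days_in_month).quantize(Decimal("0.01"))


# Illustrative figures only: 2.00 + 4.00 over two days of a 31-day month.
costs = {date(2025, 1, 1): Decimal("2.00"), date(2025, 1, 2): Decimal("4.00")}
projection = project_monthly_cost(costs, today=date(2025, 1, 2))  # Decimal("93.00")
```

A linear projection swings widely in the first few days of a month, so the UI may want to label it as an estimate.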
2.3 — Session & Sub-Agent UI
- Active sessions with model, type badges (DM/group/cron/subagent)
- Context % bars, token counts
- Sub-agent activity grid with cost/duration/status
- Session detail panel with conversation preview
2.4 — Cron Job Management
- Cron job list with schedule, status, last/next run
- Run history with duration and status badges
- Trigger manual run from UI
- Add/edit/delete cron jobs (using existing gateway RPC)
2.5 — System Health
- Gateway status card (uptime, PID, memory, compaction)
- CPU/RAM/swap/disk gauge cards (configurable thresholds)
- Alert banner for high cost, failed crons, gateway offline
- Auto-refresh with countdown timer
2.6 — AI Chat Panel
- Port dashboard's AI chat to React component
- Uses OpenClaw gateway's /v1/chat/completions endpoint
- Context-aware: feed live monitoring data into system prompt
- Persistent chat history per user
Phase 3: Agent Visualization (Week 3)
3.1 — Pixel Agent Canvas
- Port the pixel-art office scene to React (Canvas component)
- Agent sprites with activity state (working, idle, talking)
- Activity bubbles showing current task/conversation
- Conversation heat glow based on recent activity
- Day/night ambient cycle
- Pan/zoom controls (touch + mouse)
3.2 — Real-Time Agent Events
- Add FastAPI WebSocket endpoint (/ws/agents)
- Port JSONL session watcher to Python:
  - Watch ~/.openclaw/agents/*/sessions/*.jsonl
  - Parse events (tool calls, responses, status changes)
  - Broadcast to connected WebSocket clients
- Activity ticker component (recent agent actions scrolling by)
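The core of the ported watcher is reading only what was appended since the last pass while tolerating a line that is still being written. A minimal sketch of that step follows, assuming each session file is append-only JSONL; the function name and the `type` field in the demo are hypothetical, and file rotation or truncation is not handled.

```python
import json
import os
import tempfile
from typing import Any, Dict, List, Tuple


def read_new_events(path: str, offset: int) -> Tuple[List[Dict[str, Any]], int]:
    """Parse complete JSONL lines appended after `offset`; return events and the new offset."""
    events: List[Dict[str, Any]] = []
    with open(path, "rb") as fh:
        fh.seek(offset)
        while True:
            line = fh.readline()
            if not line.endswith(b"\n"):
                break  # EOF or a half-written line: retry it on the next pass
            offset += len(line)
            try:
                record = json.loads(line)
            except json.JSONDecodeError:
                continue  # skip corrupt lines rather than stall the watcher
            if isinstance(record, dict):
                events.append(record)
    return events, offset


# Demo: the second event arrives in two writes and is only emitted once complete.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "session.jsonl")
    with open(path, "wb") as fh:
        fh.write(b'{"type": "tool_call"}\n{"type": "resp')
    first, pos = read_new_events(path, 0)
    with open(path, "ab") as fh:
        fh.write(b'onse"}\n')
    second, pos = read_new_events(path, pos)
```

Tracking a byte offset per file works the same whether the pass is triggered by a filesystem notification or by fallback polling.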
3.3 — Sub-Agent Spawner
- Spawn panel integrated into the canvas view
- Click agent → "Spawn sub-agent" button
- Mini-chat for tasking the sub-agent
- Session info panel for active sub-agents
- Uses existing agents.create gateway RPC
3.4 — Hardware Monitor & Service Controls
- Server rack component (CPU/GPU/RAM/disk/network gauges)
- Breaker panel for gateway start/stop/restart
- Ham radio component for OpenClaw update checking
- All using existing gateway RPC methods (health, status, update.run)
Phase 4: Integration & Polish (Week 4)
4.1 — Navigation Integration
- Add "Monitoring" and "Agents" sections to Mission Control sidebar
- Dashboard home page shows summary cards (cost, health, agent count)
- Deep links from monitoring → agents → pixel view
4.2 — Theme System
- Port the 6 dashboard themes into Mission Control's Tailwind config
- Theme picker in header (persists via localStorage)
- Glass morphism effects where appropriate
4.3 — Alert System
- Configurable alert rules (cost threshold, cron failure, context %, memory)
- Alert banner on every page when active
- Alert history in activity feed
- Notification delivery via webhooks or in-app
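Rule evaluation for the alert system can stay very small if each rule is a metric name, a comparator, and a threshold. The sketch below is one possible shape, not the planned AlertRule schema; the field names, metric keys, and threshold values are all illustrative assumptions.

```python
import operator
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List, Mapping

_COMPARATORS: Dict[str, Callable[[float, float], bool]] = {
    ">": operator.gt,
    ">=": operator.ge,
    "<": operator.lt,
    "<=": operator.le,
}


@dataclass(frozen=True)
class AlertRule:
    """A single threshold check (hypothetical fields)."""
    name: str
    metric: str        # key into the latest metrics mapping
    comparator: str    # one of the keys in _COMPARATORS
    threshold: float


def evaluate_rules(rules: Iterable[AlertRule], metrics: Mapping[str, float]) -> List[str]:
    """Return the names of all rules whose condition currently holds."""
    fired: List[str] = []
    for rule in rules:
        value = metrics.get(rule.metric)
        if value is None:
            continue  # metric not collected yet, so the rule cannot fire
        if _COMPARATORS[rule.comparator](value, rule.threshold):
            fired.append(rule.name)
    return fired


rules = [
    AlertRule("High daily cost", "cost_today_usd", ">", 50.0),
    AlertRule("Failed crons", "failed_crons", ">=", 1),
    AlertRule("Context nearly full", "context_pct", ">=", 90.0),
]
fired = evaluate_rules(rules, {"cost_today_usd": 62.5, "failed_crons": 0})
```

Here only "High daily cost" fires: the cron count is below its threshold, and the context metric is missing from the sample so its rule is skipped.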
4.4 — Data Sync Strategy
- Primary: Gateway RPC polling (configurable intervals)
- Secondary: JSONL file watching for real-time agent events
- Tertiary: REST API for manual refresh
- WebSocket push for live updates to connected browsers
- Stale-while-revalidate caching pattern
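For the stale-while-revalidate item, the idea is that a request never waits on the gateway once a value has been cached: it gets the cached copy immediately, and a stale copy triggers a background refresh. A thread-based sketch is shown below; the class name, the `clock` parameter (injected so the demo is deterministic), and `wait_for_refresh` are assumptions for illustration.

```python
import threading
import time
from typing import Any, Callable, List, Optional


class StaleWhileRevalidate:
    """Return the cached value at once; refresh it in the background when stale."""

    def __init__(self, fetch: Callable[[], Any], max_age_s: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._fetch = fetch
        self._max_age_s = max_age_s
        self._clock = clock
        self._lock = threading.Lock()
        self._value: Any = None
        self._fetched_at: Optional[float] = None
        self._worker: Optional[threading.Thread] = None

    def _refresh(self) -> None:
        value = self._fetch()  # slow call happens outside the lock
        with self._lock:
            self._value, self._fetched_at = value, self._clock()

    def get(self) -> Any:
        with self._lock:
            if self._fetched_at is not None:
                stale = self._clock() - self._fetched_at > self._max_age_s
                busy = self._worker is not None and self._worker.is_alive()
                if stale and not busy:
                    # Serve the old value now; replace it when the fetch finishes.
                    self._worker = threading.Thread(target=self._refresh, daemon=True)
                    self._worker.start()
                return self._value
        self._refresh()  # nothing cached yet, so the first caller has to wait
        with self._lock:
            return self._value

    def wait_for_refresh(self) -> None:
        """Block until any in-flight background refresh completes (handy in tests)."""
        worker = self._worker
        if worker is not None:
            worker.join()


now = [0.0]                      # fake clock, advanced by hand
calls: List[float] = []


def fetch_summary() -> dict:
    calls.append(now[0])
    return {"version": len(calls)}


cache = StaleWhileRevalidate(fetch_summary, max_age_s=30.0, clock=lambda: now[0])
first = cache.get()              # blocking first load
now[0] = 31.0
stale = cache.get()              # old value served, refresh started
cache.wait_for_refresh()
fresh = cache.get()              # refreshed value
```

If the background fetch raises, the worker thread exits and the stale value stays in place, so the next `get` simply tries again.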
File Structure (Additions to Mission Control)
backend/
├── app/
│   ├── models/
│   │   ├── monitoring.py              # CostSnapshot, CronJobStatus, SessionEvent, etc.
│   │   └── alert_rules.py             # AlertRule model
│   ├── api/
│   │   ├── monitoring.py              # Cost, session, cron endpoints
│   │   ├── monitoring_system.py       # System health endpoints
│   │   └── agent_events.py            # WebSocket endpoint for agent events
│   └── services/
│       ├── monitoring/
│       │   ├── gateway_collector.py   # Polls OpenClaw gateway for data
│       │   ├── jsonl_watcher.py       # Watches session JSONL files
│       │   ├── cost_tracker.py        # Cost aggregation and projection
│       │   └── alert_engine.py        # Alert rule evaluation
│       └── openclaw/
│           └── (existing, no changes needed)
├── migrations/
│   └── versions/
│       └── xxx_add_monitoring_models.py
frontend/
├── src/
│   ├── app/
│   │   ├── monitoring/
│   │   │   ├── page.tsx               # Main monitoring dashboard
│   │   │   ├── costs/page.tsx         # Cost detail page
│   │   │   ├── sessions/page.tsx      # Session detail page
│   │   │   ├── crons/page.tsx         # Cron management page
│   │   │   └── system/page.tsx        # System health page
│   │   └── agents/
│   │       └── pixel/page.tsx         # Pixel agent canvas page
│   ├── components/
│   │   ├── monitoring/
│   │   │   ├── CostCards.tsx
│   │   │   ├── CostTrendChart.tsx
│   │   │   ├── ModelBreakdownChart.tsx
│   │   │   ├── SessionTable.tsx
│   │   │   ├── SubAgentActivity.tsx
│   │   │   ├── CronJobList.tsx
│   │   │   ├── SystemHealthCards.tsx
│   │   │   ├── AlertBanner.tsx
│   │   │   └── AiChatPanel.tsx
│   │   ├── agents/
│   │   │   ├── PixelCanvas.tsx
│   │   │   ├── AgentSprite.tsx
│   │   │   ├── ActivityBubble.tsx
│   │   │   ├── ConversationHeat.tsx
│   │   │   ├── SpawnPanel.tsx
│   │   │   ├── ServerRack.tsx
│   │   │   └── BreakerPanel.tsx
│   │   └── (existing Mission Control components)
│   └── lib/
│       ├── monitoring-api.ts          # API client for monitoring endpoints
│       └── agent-events.ts            # WebSocket client for agent events
Key Integration Points
Gateway Communication
All three projects talk to the OpenClaw gateway. Mission Control already has the richest integration (gateway_rpc.py with 60+ methods). We reuse this for everything:
| Feature | Gateway Methods Used |
|---|---|
| Cost tracking | usage.cost, usage.status |
| Session monitoring | sessions.list, sessions.preview |
| Cron management | cron.list, cron.status, cron.add, cron.update, cron.remove, cron.run |
| Agent management | agents.list, agents.create, agents.update, agents.delete |
| System health | health, status, logs.tail |
| Sub-agent spawning | agents.create, sessions.patch |
| Service controls | config.get, config.set, update.run |
Real-Time Updates
- Dashboard uses polling (60s auto-refresh)
- Pixel agents uses WebSocket (real-time JSONL events)
- Mission Control uses TanStack Query (polling + cache invalidation)
Our approach: WebSocket for agent events (real-time pixel animation), TanStack Query with 30s polling for monitoring data, SSE for alerts.
Auth
- Mission Control supports Clerk JWT and local bearer token
- Dashboard is auth-free (localhost only)
- Pixel agents uses gateway token
Our approach: Inherit Mission Control's auth system. Local mode for self-hosted, Clerk for multi-tenant. Monitoring and agent data scoped to organization + gateway.
Dependency Summary
| Layer | Technology | Source |
|---|---|---|
| Backend framework | FastAPI + SQLModel | Mission Control |
| Database | PostgreSQL + Alembic | Mission Control |
| Job queue | Redis + RQ | Mission Control |
| Frontend framework | Next.js 16 + React 19 | Mission Control |
| UI primitives | Radix UI + Tailwind | Mission Control |
| Charts | Recharts (existing) | Mission Control |
| Pixel canvas | HTML5 Canvas (new) | Pixel Agents → React port |
| WebSocket | FastAPI WebSocket (new) | Pixel Agents → Python port |
| Auth | Clerk / local bearer token | Mission Control |
| Gateway RPC | websockets Python (existing) | Mission Control |
No new backend languages. Go and Node/Express are NOT added — their functionality ports to Python services within the existing FastAPI app.
Risk Assessment
| Risk | Impact | Mitigation |
|---|---|---|
| Canvas rendering performance in React | Medium | Use useRef + requestAnimationFrame, not React state for animation |
| Go dashboard data collection rewritten in Python | Medium | Port logic faithfully; test against same OpenClaw data |
| JSONL file watching reliability | Medium | Use watchdog library + fallback polling |
| Theme system merge (6 themes × 2 systems) | Low | Map dashboard's 19 CSS vars to Tailwind config |
| Pixel assets licensing | Low | MIT licensed, attribution in ASSET-LICENSE.md |
| Gateway RPC version compatibility | Low | Already handled by protocol version negotiation in gateway_rpc.py |
Success Metrics
- All monitoring features from dashboard available in Mission Control UI
- Pixel agent visualization showing real-time agent activity
- Single Docker Compose brings up the entire system
- Single auth system — no separate logins
- Single gateway connection — reused across all features
- No Go or Node backend — everything in Python/FastAPI
- All existing Mission Control features still work (boards, tasks, approvals, etc.)