⚡ Full Stack Overview

Everything. In Your Building.

The complete Sovereign stack — enterprise rack, inference cluster, agent workstations, and every access point your team needs. One system. Unlimited AI. Zero cloud.

The Full Sovereign Build

A sealed, locked, branded rack housing the complete inference and agent stack. Designed for office environments — quiet, compact, enterprise-grade.

Sovereign ATX Enterprise Rack — Mac Studio cluster, Mac Minis, Synology NAS

Sovereign Enterprise Rack · Mac Studio cluster · agent workstations · 27U locked enclosure

⚡ Sovereign Enterprise Rack · 27U
Patch Panel + Cable Mgmt
1U · organized entry/exit
10GbE Managed Switch
1U · Ubiquiti UniFi Pro 24
Inference Cluster (5U)
Mac Studio M5 Ultra · Node 1
512GB unified memory
Mac Studio M5 Ultra · Node 2
512GB unified memory
Mac Studio M5 Ultra · Node 3
512GB unified memory
Mac Studio M5 Ultra · Node 4
512GB unified memory
Agent Workstations (5U)
Mac Mini M4 Pro · ×3
1U tray · 24GB each
Mac Mini M4 Pro · ×3
1U tray · 24GB each
Mac Mini M4 Pro · ×3
1U tray · 24GB each
Mac Mini M4 Pro · ×3
1U tray · 24GB each
Mac Mini M4 Pro · ×3
1U tray · 24GB each
Mac Mini · Control Node
Cluster mgmt + monitoring
Synology NAS · 64TB
2U · models + data + logs
APC Smart-UPS 2200VA
2U · 20-min runtime
Fan Banks × 5
1U each · forced air cooling
Layer 1 — Inference Cluster

The brain. Four Mac Studio M5 Ultra nodes connected via Thunderbolt 5 full mesh, pooling 2TB of unified memory into a single inference fabric. Runs trillion-parameter models that no single-node machine can handle. All inference requests from every access point route here.

4× Mac Studio M5 Ultra 2TB pooled memory TB5 full mesh · 120 Gbps ~3µs inter-node latency 1T+ parameter models 6–8 parallel inference slots ~300W total draw
💻 Layer 2 — Agent Workstations (15× Mac Mini)

15 Mac Mini M4 Pro nodes — one per team member, department, or use case. Each runs its own OpenClaw instance with dedicated agents, memory, and integrations configured for that role. Heavy inference gets routed to the cluster automatically; lightweight tasks run locally. No contention, no waiting.

15× Mac Mini M4 Pro 24GB unified memory each OpenClaw per node Custom agents per role Routes to cluster for heavy tasks
🌐 Layer 3 — Network Infrastructure

Ubiquiti 10GbE managed switch for low-latency inter-node and LAN traffic. Cluster lives on an isolated VLAN — air-gapped from external networks. NAS provides persistent storage for model weights, client data, agent memory, and audit logs. UPS ensures clean shutdown during power events, never corrupting model state.

10GbE Ubiquiti switch Isolated VLAN 64TB Synology NAS APC 2200VA UPS
🔒 Layer 4 — Physical Security

Sealed, lockable rack enclosure. Two keyholders only — designated by the client. Active thermal management with 5-layer fan banks keeps hardware cool at continuous load. Sovereign does not hold a key. No physical access, no remote access after setup is complete. The hardware runs fully autonomously.

Sealed + keyed enclosure 2 keyholders max 5-layer active cooling No Sovereign key access Fully autonomous operation

Request to Response — Zero Cloud

Every query, regardless of access point, takes the same path — all inside your building.

Student / User
Any device, any access point
Mac Mini
Agent Node
Routes request
Inference
Cluster
4× M5 Ultra · 2TB
Response
Back to user · same path

Data never leaves this path. No cloud calls. No external APIs. No third-party servers.

Use it from anywhere you already work

No new app to install. No new workflow to learn. Sovereign meets your team wherever they are.

🌐
Web UI

Open WebUI on your local network. Any browser, any device on campus. Looks and feels like ChatGPT — zero learning curve.

sovereign.local / your IP
🔑
OpenAI-Compatible API

Drop-in replacement for any OpenAI integration. Same API format — swap the endpoint URL and your existing tools work instantly.

http://cluster/v1 + API key
💬
Discord

Bot in your Discord server. Students and staff message the AI directly in channels they already use. Custom commands per channel.

@SovereignBot
🤝
Slack

Socket Mode bot in your Slack workspace. Works in any channel or DM. Ideal for admin and faculty who live in Slack.

/ai in any channel
📱
Mobile App (PWA)

Open WebUI installs as a Progressive Web App from your browser. Feels like a native iOS/Android app, runs off the cluster.

Install from browser · no App Store
💻
VS Code / Cursor

API key in any Copilot-compatible extension. Developers and students get AI code completion and chat in their IDE — running locally.

OpenAI-compat endpoint
📝
Notion / Obsidian

API key in AI writing plugins. Research, summarization, and drafting happen inside your note-taking app, powered by the cluster.

API key in plugin settings
📞
Voice (Phone)

Call a dedicated number, speak naturally, get a response. Accessible without a device — useful for hands-free workflows.

Twilio → cluster
🎓
LMS Integration (Canvas / Google Classroom)

API key in Canvas external tools or Google Workspace add-ons. Students access AI without leaving their coursework environment.

LTI / API integration

What runs under the hood

All open-source. All auditable. Nothing phoning home.

OpenClaw
Agent Orchestration

Runs on every Mac Mini. Manages agents, memory, tool use, integrations, and all channel routing. The intelligence layer that makes models do real work.

llama-server
Primary Inference

llama.cpp-based inference server on the cluster. Handles all heavy model runs — the 397B and larger models route here via OpenAI-compatible API.

Ollama
Model Management

Manages secondary models on-demand. Handles smaller, faster models for quick tasks. Pulls and updates models automatically.

Open WebUI
Web Interface

ChatGPT-like web interface for anyone on the network. No install required. Supports multiple users, conversation history, and file uploads.

exo
Distributed Inference

Pools all 4 Mac Studios into a single inference fabric over Thunderbolt. Makes trillion-parameter models possible on consumer hardware.

Tailscale
Secure Tunnel (Setup Only)

Used once during initial configuration. Tunnel closes after setup. No ongoing remote access. Hardware operates fully autonomous after day one.

Ready to build

One conversation.
One afternoon to deploy.

We scope the build, deliver the hardware, configure everything, and hand you the keys. Your team is running private AI the same day.

Book a Call See Cluster Specs →