The complete Sovereign stack — enterprise rack, inference cluster, agent workstations, and every access point your team needs. One system. Unlimited AI. Zero cloud.
A sealed, locked, branded rack housing the complete inference and agent stack. Designed for office environments — quiet, compact, enterprise-grade.
Sovereign Enterprise Rack · Mac Studio cluster · agent workstations · 27U locked enclosure
The brain. Four Mac Studio M5 Ultra nodes connected via Thunderbolt 5 full mesh, pooling 2TB of unified memory into a single inference fabric. Runs trillion-parameter models that no single-node machine can handle. All inference requests from every access point route here.
15 Mac Mini M4 Pro nodes — one per team member, department, or use case. Each runs its own OpenClaw instance with dedicated agents, memory, and integrations configured for that role. Heavy inference gets routed to the cluster automatically; lightweight tasks run locally. No contention, no waiting.
Ubiquiti 10GbE managed switch for low-latency inter-node and LAN traffic. Cluster lives on an isolated VLAN — air-gapped from external networks. NAS provides persistent storage for model weights, client data, agent memory, and audit logs. UPS ensures clean shutdown during power events, never corrupting model state.
Sealed, lockable rack enclosure. Two keyholders only — designated by the client. Active thermal management with 5-layer fan banks keeps hardware cool at continuous load. Sovereign does not hold a key. No physical access, no remote access after setup is complete. The hardware runs fully autonomously.
Every query, regardless of access point, takes the same path — all inside your building.
Data never leaves this path. No cloud calls. No external APIs. No third-party servers.
No new app to install. No new workflow to learn. Sovereign meets your team wherever they are.
Open WebUI on your local network. Any browser, any device on campus. Looks and feels like ChatGPT — zero learning curve.
sovereign.local / your IPDrop-in replacement for any OpenAI integration. Same API format — swap the endpoint URL and your existing tools work instantly.
http://cluster/v1 + API keyBot in your Discord server. Students and staff message the AI directly in channels they already use. Custom commands per channel.
@SovereignBotSocket Mode bot in your Slack workspace. Works in any channel or DM. Ideal for admin and faculty who live in Slack.
/ai in any channelOpen WebUI installs as a Progressive Web App from your browser. Feels like a native iOS/Android app, runs off the cluster.
Install from browser · no App StoreAPI key in any Copilot-compatible extension. Developers and students get AI code completion and chat in their IDE — running locally.
OpenAI-compat endpointAPI key in AI writing plugins. Research, summarization, and drafting happen inside your note-taking app, powered by the cluster.
API key in plugin settingsCall a dedicated number, speak naturally, get a response. Accessible without a device — useful for hands-free workflows.
Twilio → clusterAPI key in Canvas external tools or Google Workspace add-ons. Students access AI without leaving their coursework environment.
LTI / API integrationAll open-source. All auditable. Nothing phoning home.
Runs on every Mac Mini. Manages agents, memory, tool use, integrations, and all channel routing. The intelligence layer that makes models do real work.
llama.cpp-based inference server on the cluster. Handles all heavy model runs — the 397B and larger models route here via OpenAI-compatible API.
Manages secondary models on-demand. Handles smaller, faster models for quick tasks. Pulls and updates models automatically.
ChatGPT-like web interface for anyone on the network. No install required. Supports multiple users, conversation history, and file uploads.
Pools all 4 Mac Studios into a single inference fabric over Thunderbolt. Makes trillion-parameter models possible on consumer hardware.
Used once during initial configuration. Tunnel closes after setup. No ongoing remote access. Hardware operates fully autonomous after day one.
We scope the build, deliver the hardware, configure everything, and hand you the keys. Your team is running private AI the same day.