Platform Overview
Kantai runs entirely within your Kubernetes cluster. No external dependencies, no phone-home, no vendor lock-in.
Core Components
- Gangway — Web portal (Dashboard, Chat, Nava, Learn, Wiki, Governance, Tetraban)
- OpenClaw Runtime — Agent execution engine (workspaces, memory, tool dispatch)
- Agent Pods — One pod per agent (Sencho, Takumi, Bannin, custom)
- PostgreSQL — Conversation history, memory, configuration
- Redis — Session cache, pub/sub for inter-agent messaging
Deployment Model
- Deploys to any Kubernetes cluster (AKS, EKS, GKE, k3s, bare metal)
- Single namespace (kantai) with all resources
- Helm chart for install, upgrade, and configuration
- Flux GitOps compatible — store values in Git, Flux reconciles
- Rolling updates with zero-downtime agent restarts
Security Model
Zero-trust by default. Every component authenticates, every request is authorized, every action is logged.
Namespace Isolation
All Kantai resources live in a dedicated namespace. Network policies restrict ingress/egress to only what’s needed. Agents cannot reach other namespaces or the host network.
Network Policies
Default-deny ingress. Gangway accepts traffic on 443. Agents communicate only via internal services. Egress is allowed to configured LLM API endpoints and messaging providers — nothing else.
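The ingress rules above can be sketched as two NetworkPolicy manifests, built here as plain Python dictionaries and printed as JSON (which kubectl accepts alongside YAML). This is an illustration under assumptions, not the chart's actual output: the `kantai` namespace comes from the deployment model, while the `app: gangway` label and both policy names are hypothetical. Egress rules are omitted because they depend on which LLM and messaging endpoints you configure.

```python
import json

NAMESPACE = "kantai"  # namespace named in the deployment model


def default_deny_ingress(namespace: str) -> dict:
    """NetworkPolicy that selects every pod and allows no ingress."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": "default-deny-ingress", "namespace": namespace},
        # An empty podSelector matches all pods in the namespace; declaring
        # the Ingress policy type with no ingress rules denies inbound traffic.
        "spec": {"podSelector": {}, "policyTypes": ["Ingress"]},
    }


def allow_gangway_https(namespace: str) -> dict:
    """NetworkPolicy allowing TCP 443 to pods labelled app=gangway (assumed label)."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": "allow-gangway-https", "namespace": namespace},
        "spec": {
            "podSelector": {"matchLabels": {"app": "gangway"}},
            "policyTypes": ["Ingress"],
            "ingress": [{"ports": [{"protocol": "TCP", "port": 443}]}],
        },
    }


if __name__ == "__main__":
    manifests = {
        "apiVersion": "v1",
        "kind": "List",
        "items": [default_deny_ingress(NAMESPACE), allow_gangway_https(NAMESPACE)],
    }
    print(json.dumps(manifests, indent=2))
```

Because network policies are additive, the allow rule opens port 443 for Gangway only; every other pod stays covered by the default-deny policy.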
RBAC
Each agent pod runs with a dedicated service account. Permissions are scoped to the minimum required. No cluster-admin, no wildcard rules.
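One way to check the "no wildcard rules" property is to scan a Role manifest for `*` entries. The sketch below assumes a hypothetical Role named `agent-sencho` with read-only access to ConfigMaps; the permissions Kantai actually grants its agents are not specified here.

```python
def find_wildcards(role: dict) -> list[str]:
    """Describe every wildcard found in a Role's rules."""
    problems = []
    for index, rule in enumerate(role.get("rules", [])):
        for field in ("apiGroups", "resources", "verbs"):
            if "*" in rule.get(field, []):
                problems.append(f"rules[{index}].{field} contains '*'")
    return problems


# Hypothetical least-privilege Role for a single agent.
agent_role = {
    "apiVersion": "rbac.authorization.k8s.io/v1",
    "kind": "Role",
    "metadata": {"name": "agent-sencho", "namespace": "kantai"},
    "rules": [
        {"apiGroups": [""], "resources": ["configmaps"], "verbs": ["get", "list"]},
    ],
}

# Counter-example: the kind of rule this section rules out.
too_broad = {"rules": [{"apiGroups": ["*"], "resources": ["*"], "verbs": ["*"]}]}

if __name__ == "__main__":
    print(find_wildcards(agent_role))
    print(find_wildcards(too_broad))
```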
Pod Security
- Non-root containers with read-only root filesystems
- Security contexts enforce runAsNonRoot, allowPrivilegeEscalation: false
- Resource limits prevent runaway processes
- Pod disruption budgets for availability during upgrades
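The container-level settings in this list map onto standard Kubernetes `securityContext` fields. As a minimal sketch, the function below reports which of them a container spec is missing; the container name, image, and limit values are placeholders rather than values from the Kantai chart.

```python
REQUIRED_SECURITY_CONTEXT = {
    "runAsNonRoot": True,
    "allowPrivilegeEscalation": False,
    "readOnlyRootFilesystem": True,
}


def hardening_gaps(container: dict) -> list[str]:
    """List the settings from this section that a container spec lacks."""
    context = container.get("securityContext", {})
    gaps = [
        f"securityContext.{key} should be {expected}"
        for key, expected in REQUIRED_SECURITY_CONTEXT.items()
        if context.get(key) is not expected
    ]
    if not container.get("resources", {}).get("limits"):
        gaps.append("resources.limits is not set")
    return gaps


# Hypothetical agent container that satisfies every check.
hardened = {
    "name": "agent",
    "image": "example.invalid/agent:latest",
    "securityContext": dict(REQUIRED_SECURITY_CONTEXT),
    "resources": {"limits": {"cpu": "500m", "memory": "512Mi"}},
}

if __name__ == "__main__":
    print(hardening_gaps(hardened))
    print(hardening_gaps({"name": "unhardened"}))
```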
Secret Management
Secrets never live in plain text. Kantai integrates with external secret stores for production-grade key management.
Azure Key Vault (Recommended)
- LLM API keys, database credentials, and channel tokens stored in Key Vault
- Mounted into pods via the CSI Secrets Store Driver
- Automatic rotation: with the driver's rotation feature enabled, mounted secret files are refreshed in place, so pods pick up new values without a restart
- Audit logging via Azure Monitor
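For a process to benefit from rotation without a restart, it has to re-read the mounted file rather than cache the value at startup. The sketch below shows that pattern; the mount path `/mnt/secrets-store` and the secret name `llm-api-key` are assumptions, since the real path is whatever the pod spec mounts.

```python
import tempfile
from pathlib import Path

# Assumed mount path; the CSI driver mounts secrets wherever the pod spec says.
DEFAULT_MOUNT_DIR = "/mnt/secrets-store"


def read_secret(name: str, mount_dir: str = DEFAULT_MOUNT_DIR) -> str:
    """Read a secret from its mounted file on every call.

    Reading at use time, rather than once at startup, lets the process
    see a rotated value as soon as the mounted file is replaced.
    """
    return (Path(mount_dir) / name).read_text(encoding="utf-8").strip()


if __name__ == "__main__":
    # Simulate a rotation using a temporary directory in place of the mount.
    with tempfile.TemporaryDirectory() as mount:
        secret_file = Path(mount) / "llm-api-key"  # hypothetical secret name
        secret_file.write_text("old-value\n", encoding="utf-8")
        print(read_secret("llm-api-key", mount))
        secret_file.write_text("new-value\n", encoding="utf-8")
        print(read_secret("llm-api-key", mount))
```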
Kubernetes Secrets (Default)
- Works out of the box, no external dependencies
- Secrets can be encrypted at rest (requires enabling an EncryptionConfiguration on your cluster)
- Suitable for development and small deployments
- Upgrade path to Key Vault when ready
Backup Strategy
Your data is valuable. Kantai includes a backup strategy that covers all persistent state.
What’s Backed Up
- PostgreSQL — pg_dump on schedule, stored in blob storage
- Agent workspaces — PVC snapshots via volume snapshots or rsync to blob
- Configuration — Helm values in Git (GitOps), Governance settings in DB
- Nava knowledge — included in PostgreSQL backups
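The scheduled PostgreSQL dump could look roughly like the command built below. This is a sketch, not the job Kantai ships: the database name, output directory, and file naming scheme are assumptions, and the upload to blob storage is left out. Only the `pg_dump` options `--format=custom` and `--file` are relied on.

```python
from datetime import datetime, timezone


def pg_dump_command(database: str, out_dir: str, now: datetime) -> list[str]:
    """Build a pg_dump argv that writes a timestamped custom-format archive."""
    stamp = now.strftime("%Y%m%dT%H%M%SZ")
    return [
        "pg_dump",
        "--format=custom",  # compressed archive, restorable with pg_restore
        f"--file={out_dir}/{database}-{stamp}.dump",
        database,
    ]


if __name__ == "__main__":
    # "kantai" and "/backups" are placeholder values.
    command = pg_dump_command("kantai", "/backups", datetime.now(timezone.utc))
    print(" ".join(command))
```

Building the argument list separately from running it keeps the naming logic easy to test; a scheduler would pass the list to `subprocess.run` with connection settings supplied through the usual `PG*` environment variables.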
Recovery
- RPO: 1 hour (configurable, down to 5 minutes)
- RTO: 15 minutes for full restore from latest backup
- Recovery runbook included in the Helm chart docs
- Azure Recovery Services Vault (RSV) integration for managed AKS clusters
- Cross-region backup support for disaster recovery
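The RPO figure translates into a simple freshness check: if the newest successful backup is older than the objective, a failure right now would lose more data than intended. A minimal sketch, assuming you can obtain the timestamp of the last backup from your own tooling:

```python
from datetime import datetime, timedelta, timezone

DEFAULT_RPO = timedelta(hours=1)  # default recovery point objective from this section


def rpo_breached(last_backup: datetime, now: datetime,
                 rpo: timedelta = DEFAULT_RPO) -> bool:
    """True when the newest backup is older than the recovery point objective."""
    return now - last_backup > rpo


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    print(rpo_breached(now - timedelta(minutes=30), now))
    print(rpo_breached(now - timedelta(minutes=30), now, rpo=timedelta(minutes=5)))
```

A check like this is one plausible basis for the backup failure alert mentioned under Alerting.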
Identity
Kantai uses modern identity patterns — no shared secrets, no static credentials where avoidable.
Workload Identity
Agent pods authenticate to cloud services using Kubernetes workload identity. No API keys for Azure/AWS/GCP services — the platform handles token exchange.
Managed Identity
For AKS deployments, managed identity provides passwordless access to Key Vault, blob storage, and other Azure resources. Zero credential management.
OIDC
Gangway supports OIDC for user authentication. Connect to your identity provider (Entra ID, Okta, Keycloak, etc.) for SSO and MFA.
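For orientation, the first step of a standard OIDC authorization code flow is a redirect to the provider's authorization endpoint. The sketch below builds that URL with the parameters the OpenID Connect specification defines; the endpoint, client ID, and redirect URI are placeholders, and nothing here describes how Gangway itself is implemented or configured.

```python
import secrets
from urllib.parse import parse_qs, urlencode, urlparse


def build_authorization_url(authorization_endpoint: str, client_id: str,
                            redirect_uri: str) -> tuple[str, str, str]:
    """Return (url, state, nonce) for an OIDC authorization code request."""
    state = secrets.token_urlsafe(16)  # echoed back by the provider; guards against CSRF
    nonce = secrets.token_urlsafe(16)  # bound into the ID token; guards against replay
    query = urlencode({
        "response_type": "code",
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "scope": "openid profile email",
        "state": state,
        "nonce": nonce,
    })
    return f"{authorization_endpoint}?{query}", state, nonce


if __name__ == "__main__":
    # All three arguments are placeholders, not real Kantai or provider settings.
    url, state, nonce = build_authorization_url(
        "https://idp.example.com/authorize",
        "gangway",
        "https://gangway.example.com/callback",
    )
    sent = parse_qs(urlparse(url).query)
    print(sent["scope"][0])
```

The caller keeps `state` and `nonce` in the user's session and compares them against the callback and the ID token respectively.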
Observability
You can’t manage what you can’t see. Kantai exposes metrics, logs, and traces for full-stack visibility.
Metrics
Prometheus-format metrics from every component. Agent token usage, request latency, task completion rates, error counts. Grafana dashboards included in the Helm chart.
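"Prometheus-format" refers to the plain-text exposition format that a scraper reads from a metrics endpoint. As an illustration of its shape, the function below renders one counter; the metric name `kantai_agent_tokens_total` and the `agent` label are invented for the example, since the actual metric names are not listed here.

```python
def render_counter(name: str, help_text: str,
                   samples: list[tuple[dict, float]]) -> str:
    """Render one counter family in the Prometheus text exposition format.

    Label values are not escaped here, so this sketch only suits values
    without quotes, backslashes, or newlines.
    """
    lines = [f"# HELP {name} {help_text}", f"# TYPE {name} counter"]
    for labels, value in samples:
        label_text = ",".join(f'{key}="{val}"' for key, val in sorted(labels.items()))
        lines.append(f"{name}{{{label_text}}} {value}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical metric: tokens consumed, one sample per agent.
    print(render_counter(
        "kantai_agent_tokens_total",
        "Tokens consumed per agent.",
        [({"agent": "sencho"}, 1200), ({"agent": "takumi"}, 340)],
    ), end="")
```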
Logging
Structured JSON logs from all pods. Compatible with any log aggregator (Loki, ELK, Fluentd, Azure Monitor). Log levels configurable per component.
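To show what one-JSON-object-per-line logging with per-component levels looks like in practice, here is a small sketch using Python's standard `logging` module. The field names (`ts`, `level`, `component`, `message`) and the component names are assumptions; the schema Kantai emits may differ.

```python
import io
import json
import logging
from datetime import datetime, timezone


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object per line."""

    def format(self, record: logging.LogRecord) -> str:
        return json.dumps({
            "ts": datetime.fromtimestamp(record.created, tz=timezone.utc).isoformat(),
            "level": record.levelname,
            "component": record.name,
            "message": record.getMessage(),
        })


def make_logger(component: str, level: int, stream) -> logging.Logger:
    """Create a logger for one component with its own level and JSON output."""
    logger = logging.getLogger(component)
    logger.setLevel(level)
    logger.propagate = False  # avoid duplicate output via the root logger
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger.handlers = [handler]
    return logger


if __name__ == "__main__":
    out = io.StringIO()
    make_logger("gangway", logging.DEBUG, out).debug("portal started")
    make_logger("agent.sencho", logging.WARNING, out).info("dropped: below WARNING")
    print(out.getvalue(), end="")
```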
Alerting
Pre-configured alert rules for: agent down, high error rate, token budget exceeded, backup failure, certificate expiry. Sends to PagerDuty, Slack, email, or webhooks.
Multi-Tenancy
Run multiple isolated fleets on the same cluster.
- Namespace per tenant — each fleet gets its own namespace with full isolation
- Network policies — tenants cannot communicate unless explicitly allowed
- Resource quotas — enforce CPU, memory, and storage limits per tenant
- Separate databases — each tenant gets its own PostgreSQL instance or schema
- RBAC — tenant admins can only see and manage their own fleet
Multi-tenancy is ideal for organizations running Kantai for multiple teams, or for service providers offering managed Kantai fleets.
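The namespace-per-tenant and resource quota items can be sketched as a pair of manifests per fleet. The `kantai-<tenant>` naming scheme, the quota name, and the example limits are assumptions for illustration; the quota keys themselves (`limits.cpu`, `limits.memory`, `requests.storage`) are standard Kubernetes ResourceQuota fields.

```python
import json


def tenant_manifests(tenant: str, cpu: str, memory: str, storage: str) -> list[dict]:
    """Build a Namespace and a ResourceQuota for one tenant fleet."""
    namespace = f"kantai-{tenant}"  # assumed naming scheme
    return [
        {"apiVersion": "v1", "kind": "Namespace", "metadata": {"name": namespace}},
        {
            "apiVersion": "v1",
            "kind": "ResourceQuota",
            "metadata": {"name": "fleet-quota", "namespace": namespace},
            "spec": {
                "hard": {
                    "limits.cpu": cpu,
                    "limits.memory": memory,
                    "requests.storage": storage,
                }
            },
        },
    ]


if __name__ == "__main__":
    # Placeholder tenant and limits.
    print(json.dumps(tenant_manifests("team-a", "8", "16Gi", "100Gi"), indent=2))
```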
Want this managed for you? pemos.ca runs Kantai on hardened AKS infrastructure with automated backups, monitoring, and support — so you don’t have to manage the platform yourself.
Master Control — Enterprise Pattern
Master Control is the enterprise deployment pattern built on Kantai OSS. It extends the core platform with dedicated principals — specialized command roles that oversee crews of agents.
How OSS Feeds Master Control
- Gangway portal becomes the bridge UI for each principal
- Agent pods are organized into crews (Security crew, Solutions crew)
- Governance controls power compliance scoring and OCO policies
- Tetraban serves as the enterprise kill switch across all crews
- OIDC & identity enable principal-level access control
Two Principals
- 🎖️ Security Principal (CISO’s Bridge) — Officer, Watchkeeper, Quartermaster
- 🔧 Solutions Principal (CTO’s Bridge) — Captain, Bosun, Helmsman, Botwright
- 📡 Signals — Cross-principal communication and alerting
Each principal commands a dedicated crew with scoped permissions, dashboards, and workflows — all built on the same Kantai components you self-host today.
Master Control managed service → pemos.ca/master-control — enterprise-grade, fully operated by EOSE.