Architecture

Name: Kantai OSS
Brand: EOSE

How Kantai OSS is built — Kubernetes-native, zero-trust, GitOps-driven, and designed for self-hosted privacy.

Platform Overview

Kantai runs entirely within your Kubernetes cluster. There is no phone-home and no vendor lock-in, and the only external services it contacts are the LLM API endpoints and messaging providers you configure.

Core Components

Gangway — Web portal (Dashboard, Chat, Nava, Learn, Wiki, Governance, Tetraban)
OpenClaw Runtime — Agent execution engine (workspaces, memory, tool dispatch)
Agent Pods — One pod per agent (Sencho, Takumi, Bannin, custom)
PostgreSQL — Conversation history, memory, configuration
Redis — Session cache, pub/sub for inter-agent messaging

Deployment Model

Deploys to any Kubernetes cluster (AKS, EKS, GKE, k3s, bare metal)
Single namespace (kantai) holds all resources of a fleet
Helm chart for install, upgrade, and configuration
Flux GitOps compatible — store values in Git, Flux reconciles
Rolling updates with zero-downtime agent restarts
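
The GitOps flow above could be sketched as follows. This is a minimal illustration, assuming a Flux HelmRelease (helm.toolkit.fluxcd.io/v2). The chart name "kantai", the HelmRepository name "kantai-charts", and the values key are hypothetical and are not taken from the Kantai chart. The script emits JSON, which is also valid YAML, so the output could be committed to Git for Flux to reconcile.

```python
import json


def helm_release(name, namespace, chart, repo, values, interval="10m"):
    """Build a Flux HelmRelease manifest as a plain dict."""
    return {
        "apiVersion": "helm.toolkit.fluxcd.io/v2",
        "kind": "HelmRelease",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            # How often Flux re-checks the release against Git.
            "interval": interval,
            "chart": {
                "spec": {
                    "chart": chart,
                    "sourceRef": {"kind": "HelmRepository", "name": repo},
                }
            },
            # Helm values kept in Git instead of being passed on the CLI.
            "values": values,
        },
    }


if __name__ == "__main__":
    release = helm_release(
        name="kantai",
        namespace="kantai",
        chart="kantai",                        # hypothetical chart name
        repo="kantai-charts",                  # hypothetical HelmRepository
        values={"gangway": {"replicas": 2}},   # hypothetical values key
    )
    print(json.dumps(release, indent=2))
```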

Security Model

Zero-trust by default. Every component authenticates, every request is authorized, every action is logged.

Namespace Isolation

All Kantai resources live in a dedicated namespace. Network policies restrict ingress/egress to only what’s needed. Agents cannot reach other namespaces or the host network.

Network Policies

Default-deny ingress. Gangway accepts traffic on 443. Agents communicate only via internal services. Egress is allowed to configured LLM API endpoints and messaging providers — nothing else.
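
As a rough sketch of the ingress rules described above, the two functions below build standard networking.k8s.io/v1 NetworkPolicy manifests as Python dicts. The policy names and the "app: gangway" pod label are assumptions made for illustration; the policies shipped with Kantai may be structured differently.

```python
def default_deny_ingress(namespace):
    """Deny all inbound traffic to every pod in the namespace."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": "default-deny-ingress", "namespace": namespace},
        # An empty podSelector matches all pods. Listing "Ingress" without
        # any ingress rules means no inbound traffic is allowed.
        "spec": {"podSelector": {}, "policyTypes": ["Ingress"]},
    }


def allow_gangway_https(namespace, app_label="gangway"):
    """Allow inbound TCP 443 to pods labelled app=<app_label> (assumed label)."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": "allow-gangway-https", "namespace": namespace},
        "spec": {
            "podSelector": {"matchLabels": {"app": app_label}},
            "policyTypes": ["Ingress"],
            # No "from" clause: any source may connect, but only on TCP 443.
            "ingress": [{"ports": [{"protocol": "TCP", "port": 443}]}],
        },
    }
```

NetworkPolicies are additive, so the allow rule opens port 443 for Gangway while every other pod stays covered by the default-deny policy.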

RBAC

Each agent pod runs with a dedicated service account. Permissions are scoped to the minimum required. No cluster-admin, no wildcard rules.
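
One way to verify the "no wildcard rules" claim in your own cluster is a small lint over Role manifests. The sketch below assumes the manifest has already been loaded into a Python dict (for example from JSON). The function name is hypothetical and is not part of Kantai.

```python
def wildcard_rules(role):
    """Return the rules of a Role or ClusterRole manifest that use '*'."""
    flagged = []
    for rule in role.get("rules", []):
        fields = (
            rule.get("apiGroups", [])
            + rule.get("resources", [])
            + rule.get("verbs", [])
        )
        if "*" in fields:
            flagged.append(rule)
    return flagged


if __name__ == "__main__":
    role = {
        "kind": "Role",
        "rules": [
            {"apiGroups": [""], "resources": ["configmaps"], "verbs": ["get", "list"]},
            {"apiGroups": ["*"], "resources": ["pods"], "verbs": ["get"]},
        ],
    }
    for rule in wildcard_rules(role):
        print("wildcard rule:", rule)
```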

Pod Security

Non-root containers with read-only root filesystems
Security contexts enforce runAsNonRoot, allowPrivilegeEscalation: false
Resource limits prevent runaway processes
Pod disruption budgets for availability during upgrades
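
The container hardening settings listed above can be checked mechanically. The sketch below inspects one container spec (a dict shaped like an entry in a pod's "containers" list) and reports which settings are missing. It is an illustrative check, not Kantai's own validation, and it looks only at container-level fields, so settings inherited from the pod-level securityContext are not taken into account.

```python
def security_violations(container):
    """List hardening problems found in a single container spec."""
    ctx = container.get("securityContext", {})
    problems = []
    if ctx.get("runAsNonRoot") is not True:
        problems.append("runAsNonRoot must be true")
    if ctx.get("allowPrivilegeEscalation") is not False:
        problems.append("allowPrivilegeEscalation must be false")
    if ctx.get("readOnlyRootFilesystem") is not True:
        problems.append("readOnlyRootFilesystem must be true")
    if not container.get("resources", {}).get("limits"):
        problems.append("resource limits must be set")
    return problems


if __name__ == "__main__":
    hardened = {
        "name": "agent",
        "securityContext": {
            "runAsNonRoot": True,
            "allowPrivilegeEscalation": False,
            "readOnlyRootFilesystem": True,
        },
        "resources": {"limits": {"cpu": "500m", "memory": "512Mi"}},
    }
    print(security_violations(hardened))
```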

Secret Management

Secrets never live in plain text. Kantai integrates with external secret stores for production-grade key management.

Azure Key Vault (Recommended)

LLM API keys, database credentials, and channel tokens stored in Key Vault
Mounted into pods via the CSI Secrets Store Driver
Automatic rotation — pods pick up new secrets without restart
Audit logging via Azure Monitor
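
Picking up a rotated secret without a restart generally requires the application to re-read the mounted file instead of caching the value at startup. The sketch below shows that pattern under stated assumptions: the mount path "/mnt/secrets-store" and the file name "llm-api-key" are hypothetical, and how Kantai components actually read their secrets is not described here.

```python
from pathlib import Path

SECRETS_DIR = Path("/mnt/secrets-store")  # hypothetical CSI mount path


def read_secret(name, base_dir=SECRETS_DIR):
    """Read a secret from its mounted file on every call.

    Reading at use time, rather than once at startup, means a value that
    was rotated on disk is returned by the next call.
    """
    return (Path(base_dir) / name).read_text(encoding="utf-8").strip()


if __name__ == "__main__":
    # Hypothetical secret name; fails if the file is not mounted.
    print(len(read_secret("llm-api-key")))
```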

Kubernetes Secrets (Default)

Works out of the box, no external dependencies
Secrets are encrypted at rest only if you enable an EncryptionConfiguration on the cluster (Kubernetes stores them unencrypted in etcd by default)
Suitable for development and small deployments
Upgrade path to Key Vault when ready

Backup Strategy

Your data is valuable. Kantai includes a backup strategy that covers all persistent state.

What’s Backed Up

PostgreSQL — pg_dump on schedule, stored in blob storage
Agent workspaces — PVC volume snapshots, or rsync to blob storage
Configuration — Helm values in Git (GitOps), Governance settings in DB
Nava knowledge — included in PostgreSQL backups
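
A scheduled PostgreSQL backup could be assembled along these lines. This is a sketch under assumptions: the database name, output directory, and file naming scheme are hypothetical, and the upload to blob storage is left out. The pg_dump flags used (--format=custom, --file, --dbname) are standard.

```python
from datetime import datetime, timezone


def pg_dump_command(dbname, out_dir, now=None):
    """Return (argv, output_path) for a timestamped custom-format dump."""
    now = now or datetime.now(timezone.utc)
    stamp = now.strftime("%Y%m%dT%H%M%SZ")
    outfile = f"{out_dir}/{dbname}-{stamp}.dump"
    argv = [
        "pg_dump",
        "--format=custom",       # compressed archive, restorable with pg_restore
        f"--file={outfile}",
        f"--dbname={dbname}",
    ]
    return argv, outfile


if __name__ == "__main__":
    argv, outfile = pg_dump_command("kantai", "/backups")  # hypothetical names
    print(" ".join(argv))
```

The argv list can be passed to subprocess.run from a CronJob container, and the resulting file copied to blob storage in a later step.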

Recovery

RPO: 1 hour (configurable, down to 5 minutes)
RTO: 15 minutes for full restore from latest backup
Recovery runbook included in the Helm chart docs
Azure Recovery Services Vault (RSV) integration for managed AKS clusters
Cross-region backup support for disaster recovery
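
The RPO figure above translates into a simple freshness check: if the newest backup is older than the RPO, a failure right now would lose more data than the objective allows. The function below is an illustrative sketch, not part of Kantai; the default of one hour mirrors the stated RPO.

```python
from datetime import datetime, timedelta, timezone


def rpo_breached(last_backup, now, rpo=timedelta(hours=1)):
    """True when the newest backup is older than the recovery point objective."""
    return now - last_backup > rpo


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    print(rpo_breached(now - timedelta(minutes=90), now))
```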

Identity

Kantai uses modern identity patterns — no shared secrets, no static credentials where avoidable.

Workload Identity

Agent pods authenticate to cloud services using Kubernetes workload identity. No API keys for Azure/AWS/GCP services — the platform handles token exchange.

Managed Identity

For AKS deployments, managed identity provides passwordless access to Key Vault, blob storage, and other Azure resources. Zero credential management.

OIDC

Gangway supports OIDC for user authentication. Connect to your identity provider (Entra ID, Okta, Keycloak, etc.) for SSO and MFA.
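
For orientation, the first steps of a standard OIDC authorization code flow look roughly like this. The sketch uses only the well-known discovery path and the standard authorization request parameters; the client ID, redirect URI, and requested scopes are hypothetical and do not reflect Gangway's actual configuration.

```python
import secrets
from urllib.parse import urlencode


def discovery_url(issuer):
    """Location of the provider metadata document for an OIDC issuer."""
    return issuer.rstrip("/") + "/.well-known/openid-configuration"


def authorization_url(authorization_endpoint, client_id, redirect_uri, state=None):
    """Build the URL a user is redirected to in order to sign in.

    Returns (url, state). The caller stores state and compares it with the
    value returned on the callback to guard against CSRF.
    """
    state = state or secrets.token_urlsafe(16)
    query = urlencode({
        "response_type": "code",
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "scope": "openid profile email",
        "state": state,
    })
    return f"{authorization_endpoint}?{query}", state
```

The authorization endpoint itself is read from the "authorization_endpoint" field of the discovery document, so only the issuer URL needs to be configured by hand.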

Observability

You can’t manage what you can’t see. Kantai exposes metrics, logs, and traces for full-stack visibility.

Metrics

Prometheus-format metrics from every component. Agent token usage, request latency, task completion rates, error counts. Grafana dashboards included in the Helm chart.
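
To make "Prometheus-format" concrete, the sketch below renders one counter in the Prometheus text exposition format. The metric name "kantai_agent_tokens_total" and its "agent" label are hypothetical; the names Kantai actually exports may differ. Label values are not escaped here, so the sketch assumes they contain no quotes, backslashes, or newlines.

```python
def render_counter(name, help_text, samples):
    """Render one counter in the Prometheus text exposition format.

    samples maps a tuple of (label, value) pairs to a numeric sample.
    """
    lines = [f"# HELP {name} {help_text}", f"# TYPE {name} counter"]
    for labels, value in samples.items():
        label_str = ",".join(f'{k}="{v}"' for k, v in labels)
        series = f"{name}{{{label_str}}}" if label_str else name
        lines.append(f"{series} {value}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_counter(
        "kantai_agent_tokens_total",        # hypothetical metric name
        "Tokens consumed per agent.",
        {(("agent", "sencho"),): 1200},
    ), end="")
```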

Logging

Structured JSON logs from all pods. Compatible with any log aggregator (Loki, ELK, Fluentd, Azure Monitor). Log levels configurable per component.
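
Structured JSON logging with a per-component level can be sketched with the Python standard library as shown below. The field names ("ts", "level", "component", "message") are assumptions for illustration and are not Kantai's actual log schema.

```python
import json
import logging


class JsonFormatter(logging.Formatter):
    """Format each log record as a single JSON object on one line."""

    def format(self, record):
        return json.dumps({
            "ts": self.formatTime(record),
            "level": record.levelname,
            "component": record.name,
            "message": record.getMessage(),
        })


def component_logger(component, level="INFO"):
    """Return a logger for one component with its own level."""
    logger = logging.getLogger(component)
    logger.setLevel(level)
    if not logger.handlers:
        handler = logging.StreamHandler()
        handler.setFormatter(JsonFormatter())
        logger.addHandler(handler)
    # Avoid duplicate, unformatted output from the root logger.
    logger.propagate = False
    return logger


if __name__ == "__main__":
    component_logger("gangway", "DEBUG").debug("session opened")
```

One JSON object per line is the shape that aggregators such as Loki or Fluentd can parse without custom patterns.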

Alerting

Pre-configured alert rules for: agent down, high error rate, token budget exceeded, backup failure, certificate expiry. Sends to PagerDuty, Slack, email, or webhooks.
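
An "agent down" alert might be expressed as a Prometheus alerting rule like the one built below. This is a sketch that assumes the alerts are Prometheus rules, which the text above does not state. The alert name, the "kantai-agents" job label, and the five-minute hold time are hypothetical; "up" is the standard Prometheus target health metric.

```python
import json


def alert_rule(name, expr, hold_for, severity, summary):
    """Build one Prometheus alerting rule as a dict."""
    return {
        "alert": name,
        "expr": expr,
        "for": hold_for,  # condition must hold this long before firing
        "labels": {"severity": severity},
        "annotations": {"summary": summary},
    }


def rule_group(group_name, rules):
    """Wrap rules in the top-level structure of a Prometheus rule file."""
    return {"groups": [{"name": group_name, "rules": list(rules)}]}


if __name__ == "__main__":
    agent_down = alert_rule(
        name="KantaiAgentDown",              # hypothetical alert name
        expr='up{job="kantai-agents"} == 0', # hypothetical job label
        hold_for="5m",
        severity="critical",
        summary="Agent {{ $labels.pod }} is down",
    )
    print(json.dumps(rule_group("kantai", [agent_down]), indent=2))
```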

Multi-Tenancy

Run multiple isolated fleets on the same cluster.

Namespace per tenant — each fleet gets its own namespace with full isolation
Network policies — tenants cannot communicate unless explicitly allowed
Resource quotas — enforce CPU, memory, and storage limits per tenant
Separate databases — each tenant gets its own PostgreSQL instance or schema
RBAC — tenant admins can only see and manage their own fleet
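
The namespace-per-tenant and resource quota items above could be provisioned from a pair of manifests per tenant. The sketch below builds them as Python dicts. The "kantai-<tenant>" namespace naming, the "tenant" label, and the default limits are assumptions for illustration; the ResourceQuota keys (limits.cpu, limits.memory, requests.storage) are standard Kubernetes quota names.

```python
def tenant_manifests(tenant, cpu="8", memory="16Gi", storage="100Gi"):
    """Return a Namespace and a ResourceQuota manifest for one tenant."""
    namespace = f"kantai-{tenant}"  # hypothetical naming scheme
    return [
        {
            "apiVersion": "v1",
            "kind": "Namespace",
            "metadata": {"name": namespace, "labels": {"tenant": tenant}},
        },
        {
            "apiVersion": "v1",
            "kind": "ResourceQuota",
            "metadata": {"name": "tenant-quota", "namespace": namespace},
            "spec": {
                "hard": {
                    "limits.cpu": cpu,
                    "limits.memory": memory,
                    "requests.storage": storage,
                }
            },
        },
    ]


if __name__ == "__main__":
    for manifest in tenant_manifests("team-a"):
        print(manifest["kind"], manifest["metadata"]["name"])
```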

Multi-tenancy is ideal for organizations running Kantai for multiple teams, or for service providers offering managed Kantai fleets.

Want this managed for you? pemos.ca runs Kantai on hardened AKS infrastructure with automated backups, monitoring, and support — so you don’t have to manage the platform yourself.

Master Control — Enterprise Pattern

Master Control is the enterprise deployment pattern built on Kantai OSS. It extends the core platform with dedicated principals — specialized command roles that oversee crews of agents.

How OSS Feeds Master Control

Gangway portal becomes the bridge UI for each principal
Agent pods are organized into crews (Security crew, Solutions crew)
Governance controls power compliance scoring and OCO policies
Tetraban serves as the enterprise kill switch across all crews
OIDC & identity enable principal-level access control

Two Principals

🎖️ Security Principal (CISO’s Bridge) — Officer, Watchkeeper, Quartermaster
🔧 Solutions Principal (CTO’s Bridge) — Captain, Bosun, Helmsman, Botwright
📡 Signals — Cross-principal communication and alerting

Each principal commands a dedicated crew with scoped permissions, dashboards, and workflows — all built on the same Kantai components you self-host today.

Master Control managed service → pemos.ca/master-control — enterprise-grade, fully operated by EOSE.