
Introducing kMetal: From Hosted Control Planes to Bare Metal

How we went from building Kamaji, the open-source hosted control plane engine, to kMetal, a full Kubernetes platform for bare metal with embedded isolation and fleet management.

Wednesday, March 11, 2026 Adriano Pezzuto

We started Clastix with a narrow observation: every Kubernetes cluster wastes three machines on its control plane.

An organization running 30 tenant clusters dedicates 90 machines to API servers and etcd. These machines consume power, rack space, and cooling while running zero workloads. Behind that machine count: 30 independent etcd clusters to back up, 30 sets of TLS certificates to rotate, 30 upgrade cycles to plan with maintenance windows. The operational surface grows linearly with the number of tenants, and none of it serves actual applications.

We built Kamaji to fix this. Kamaji is an open-source hosted control plane engine. It runs tenant control planes as pods on a small set of shared management nodes instead of on dedicated machines. We released it under Apache 2.0, and it works. A 30-tenant deployment that traditionally needs 90 control plane machines runs on 3 shared nodes. Each tenant still gets a real, CNCF-conformant Kubernetes cluster with a dedicated etcd instance, full cluster-admin access, and an independent lifecycle. From the tenant's perspective, nothing changes. From the operator's perspective, 87 machines are freed for actual work.

NVIDIA, OVH, Mistral, Rackspace, and others adopted Kamaji for production workloads, running hundreds and in some cases thousands of tenant clusters. But the more we worked with these deployments, the more we understood that hosted control planes, while they solve the biggest single cost, don't solve the whole problem for organizations running Kubernetes.

This post explains what we learned and why we built kMetal.


The three costs of multi-tenant Kubernetes

When we looked at how organizations actually run multi-tenant Kubernetes, we kept seeing the same three costs stack on top of each other.

1. The control plane tax

This is the one Kamaji addresses directly. Every tenant cluster needs a highly available control plane: at least three machines to maintain etcd quorum. At scale, these machines represent a significant portion of the total fleet, and they do nothing but run infrastructure. The operational burden is proportional: each independent control plane has its own backup schedule, its own certificate rotation, its own upgrade cycle, its own failure domain.

Kamaji eliminates this by hosting all tenant control planes as pods on shared infrastructure. Provisioning drops from hours to about 20 seconds. Upgrades happen via blue/green pod replacement in about 16 seconds with zero downtime. The etcd instance is still dedicated per tenant (data isolation is maintained), but the machines are shared.

This part is open source and, thanks to Cluster API, works on any infrastructure: bare metal, VMs, cloud, edge. Kamaji doesn't care what's underneath. The project documentation lists the supported providers.
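To make the model concrete, a hosted control plane in Kamaji is declared as an ordinary Kubernetes resource on the management cluster. A minimal sketch follows; field names are based on Kamaji's v1alpha1 API and may differ across versions, so treat this as illustrative rather than a copy-paste manifest:

```yaml
# A tenant control plane declared as a Kubernetes resource.
# Kamaji reconciles this into API server, controller-manager, and
# scheduler pods on the shared management nodes, with a dedicated
# etcd datastore for the tenant. No dedicated machines involved.
apiVersion: kamaji.clastix.io/v1alpha1
kind: TenantControlPlane
metadata:
  name: tenant-00
  namespace: tenants
spec:
  controlPlane:
    deployment:
      replicas: 2              # HA as pod replicas, not machines
    service:
      serviceType: LoadBalancer
  kubernetes:
    version: "v1.32.0"         # each tenant picks its own version
  addons:
    coreDNS: {}
    kubeProxy: {}
```

Because the control plane is just a resource, provisioning, upgrading, and deleting it are ordinary `kubectl apply`/`kubectl delete` operations on the management cluster.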

2. The hypervisor burden

The control plane tax is expensive, but it's the second cost that really defines the bare metal problem.

Multi-tenant Kubernetes on bare metal needs tenant isolation. Containers sharing a kernel aren't enough. A container escape compromises every tenant on the host. And network isolation through Kubernetes network policies alone isn't sufficient for hard multi-tenancy requirements. Organizations need both compute separation and network segregation.

The standard solution: install a hypervisor. VMware ESXi, Nutanix AHV, OpenStack, or Proxmox. Spin up VMs for each tenant. Install Kubernetes inside those VMs. Then add a network isolation solution on top: VMware NSX, Nutanix Flow, or manual VLAN stitching.

This works. But here's what you end up running:

  • A full compute management layer (ESXi, AHV, OpenStack, Proxmox) with its own lifecycle, its own patching, its own monitoring

  • A separate network isolation layer (NSX, Flow, OVN) with its own management console and its own operational model

  • A Kubernetes layer on top of both

  • Three consoles: vCenter (or Prism, or Horizon, or Proxmox UI) + a network management UI + kubectl

  • Three skill sets: hypervisor administrators, network administrators, and Kubernetes operators

  • A multi-stage provisioning workflow: create the VM, configure the network, then bootstrap Kubernetes

None of these products was the goal. Isolated Kubernetes clusters were. The hypervisor stack is infrastructure overhead, installed, licensed, operated, and maintained solely as an isolation mechanism for Kubernetes tenants.

One more thing: the hypervisor doesn't solve the control plane tax at all. You still need three dedicated VMs per tenant for the control plane. The hypervisor adds isolation but doesn't reduce the machine count.

3. Fleet management chaos

The third cost gets worse as the fleet grows: more tenants mean more clusters, and more clusters mean more operational diversity.

Without unified fleet management, every cluster is a snowflake. Each site has its own provisioning workflow. Upgrades happen cluster-by-cluster. Policies are applied manually and drift over time. There's no single pane to see what's running where, at what version, in what state. Platform teams manage clusters one by one, using different tools in different environments.

This is manageable at 3 clusters. It stops scaling around 5.

Kamaji helps with the control plane dimension. All tenant control planes are pods on the management cluster, so you can manage them uniformly. But fleet-wide lifecycle automation, governance, policy enforcement, drift remediation, self-service provisioning: these are platform capabilities that sit outside the scope of a hosted control plane engine.


Why not just use physical servers?

Before explaining how kMetal addresses isolation, there's a question we get in every architecture conversation: why not run tenant workers directly on physical servers? Two observations kept coming up:

  1. Physical servers are too big. Modern bare metal servers are dense: terabytes of RAM, 64+ cores. But Kubernetes scales horizontally with many smaller nodes, not vertically with a few large ones. A single physical server offers far more capacity than a typical worker node needs. Right-sizing requires slicing the physical server into multiple workers.

  2. Isolated clusters on dedicated hardware don't scale well. If every new tenant cluster requires separate physical servers, cluster provisioning is bound by hardware procurement: purchase orders, rack time, cabling, OS provisioning. Weeks or months, not seconds. That model collapses when tenants need clusters on demand.

Using KVM-isolated virtual machines decouples tenant isolation from physical hardware boundaries: you can provision right-sized, isolated worker nodes in seconds on shared bare metal infrastructure. Virtualization for isolation is the correct architectural choice. A separate, full-stack virtualization product that exists only for that purpose is not.


What kMetal does

Clastix kMetal is the full Kubernetes platform for bare metal. It addresses all three costs in one product.

Hosted control planes. The same Kamaji architecture, extended for fleet-scale operations. All tenant control planes run as pods on shared management nodes. Dedicated etcd per tenant. Near-zero per-cluster overhead.

Embedded isolation. Both compute and network isolation are built into the platform, not as separate products:

Compute isolation uses kernel-native Linux KVM. Each tenant's worker nodes run in their own virtual machines with dedicated kernels, so a compromise in one tenant's workload doesn't affect others: each tenant has its own kernel and its own attack surface. This is the same isolation boundary that traditional hypervisors provide, but it's embedded in the platform and managed through the Kubernetes API.

Network isolation uses OVN, a software-defined networking layer that gives each tenant a fully isolated network domain. Tenant network traffic is segregated at the platform level. No VLAN stitching, no external firewalls, no manual per-tenant network configuration. The network isolation layer is part of the platform, not a separate product with its own management console.

The key difference from the traditional approach: there is no separate hypervisor console. No vCenter. No Prism. No NSX Manager. Operators work with `kubectl`, GitOps, and Kubernetes concepts only. Both the compute isolation and the network isolation are managed through the Kubernetes API. They're invisible in day-to-day operations.

Fleet management. The operational layer that makes this an enterprise platform:

  • Cluster and node profiles define standard configurations: OS image, Kubernetes version, storage, networking, security policies, mandatory add-ons. Profiles are applied across the fleet.

  • Lifecycle automation handles provisioning, upgrades, patching, certificate rotation, and backup across hundreds of clusters.

  • A reconciliation loop drives each cluster back to its defined profile. Mandatory add-ons and guardrails are enforced continuously. Clusters don't drift. In our testing, 100 control planes reconcile in under 150 seconds.

  • Self-service lets tenants provision and manage their own clusters within operator-defined boundaries: allowed Kubernetes versions, resource quotas, network policies, mandatory components.

  • Everything available through the dashboard is also available via API and kubectl for automation and GitOps workflows.
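The mechanics behind profile-driven reconciliation are the standard Kubernetes controller pattern: observe each cluster's actual state, compare it against the desired state in its profile, and correct the difference. The sketch below is a deliberately simplified, self-contained illustration of that loop in plain Python; the `Profile` and `Cluster` fields are hypothetical, and the real platform does this through controllers and custom resources rather than in-memory objects:

```python
from dataclasses import dataclass, field

@dataclass
class Profile:
    """Desired state for a fleet of clusters (hypothetical fields)."""
    kubernetes_version: str
    mandatory_addons: set[str] = field(default_factory=set)

@dataclass
class Cluster:
    """Observed state of one tenant cluster."""
    name: str
    kubernetes_version: str
    addons: set[str] = field(default_factory=set)

def reconcile(cluster: Cluster, profile: Profile) -> list[str]:
    """Drive one cluster back to its profile; return the actions taken.

    A cluster already matching its profile yields no actions, which is
    what makes the loop safe to run continuously: reconciling a
    conformant cluster is a no-op.
    """
    actions: list[str] = []
    if cluster.kubernetes_version != profile.kubernetes_version:
        actions.append(f"upgrade {cluster.name} to {profile.kubernetes_version}")
        cluster.kubernetes_version = profile.kubernetes_version
    for addon in sorted(profile.mandatory_addons - cluster.addons):
        actions.append(f"install {addon} on {cluster.name}")
        cluster.addons.add(addon)
    return actions

if __name__ == "__main__":
    profile = Profile("v1.32.0", {"cni", "monitoring"})
    fleet = [
        Cluster("tenant-00", "v1.31.0", {"cni"}),          # drifted
        Cluster("tenant-01", "v1.32.0", {"cni", "monitoring"}),  # in sync
    ]
    for c in fleet:
        drift = reconcile(c, profile)
        print(c.name, drift or "in sync")
```

Running the loop repeatedly converges every cluster to its profile and then idles; drift introduced later is corrected on the next pass, which is the property behind "clusters don't drift."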

What tenants get

Each tenant gets a real, full Kubernetes cluster. Not a namespace, not a virtual cluster sharing a host kernel, not a subset of the API. A real, upstream, CNCF-conformant Kubernetes cluster with:

  • Full cluster-admin access. Tenants operate their own cluster without affecting others.

  • A dedicated datastore instance. Tenant data is not shared in a multi-tenant datastore.

  • A dedicated kernel, not a shared kernel with namespace boundaries.

  • An isolated network domain. Tenant traffic is segregated at the platform level.

  • An independent lifecycle. Each tenant can run a different Kubernetes version, different add-ons, and different policies.

The control plane is consumed as a service. Tenants never touch control plane infrastructure. This is the same operational model as EKS, GKE, or AKS, but on your own bare metal servers.

The economics

The savings add up because kMetal eliminates costs across multiple dimensions simultaneously. Here is what a reference 30-tenant deployment on 10 physical servers looks like:

Control plane machines: 90 dedicated machines (3 per tenant) become 3 shared management nodes. That's a 97% reduction in control plane hardware. Those 87 machines are freed for revenue-generating workloads.

The separate hypervisor layer: Eliminated. No VMware, no Nutanix licenses. No OpenStack, no Proxmox to install, maintain, patch, or operate alongside Kubernetes. Compute isolation is embedded via kernel-native KVM. Network isolation is embedded via software-defined networking. Both are part of the platform.

Operational consolidation: One management tool (kubectl) instead of three (vCenter + network management console + Kubernetes tooling). One skill set (Kubernetes operators) instead of three (hypervisor admins + network admins + Kubernetes operators). One infrastructure stack to monitor, patch, and upgrade.

Provisioning speed: A new tenant cluster (control plane, workers, network isolation) provisions in approximately 20 seconds through a single Kubernetes-native workflow. No multi-stage process of creating VMs, configuring networks, then bootstrapping Kubernetes.

Our internal analysis on on-premises deployments shows: depreciation savings over 90%, SRE operational savings over 65%, productivity improvements over 60%, and energy and facility savings over 50%. The hypervisor elimination saves on licensing and on the people who manage it. The control plane consolidation saves on hardware and on the operational surface. The unified tooling saves on training, hiring, and context switching.


How it connects to Kamaji

Kamaji is the hosted control plane engine at the core of kMetal. It was already open source before kMetal existed, and it stays open source. Apache 2.0, no relicensing, no feature gating.

The relationship is simple:

  • Kamaji handles hosted control planes on any infrastructure via Cluster API. It's the right tool for organizations that want the HCP architecture on existing infrastructure: VMs, cloud, OpenStack, or bare metal without embedded isolation. It's genuinely capable and production-proven.

  • kMetal is the full product built around Kamaji for bare metal deployments. It adds what bare metal fleet operations at scale require: embedded compute and network isolation, fleet management with profiles and lifecycle automation, governance and policy enforcement, multi-site high availability, self-service, and enterprise support.

If you need hosted control planes on existing infrastructure, Kamaji does the job. If you need a complete bare metal Kubernetes platform with isolation and fleet management, that's kMetal.


What kMetal is not

A few things kMetal deliberately does not do.

kMetal is not a general-purpose virtualization platform. kMetal uses KVM to isolate Kubernetes tenants on bare metal. It won't run your legacy Windows and Linux servers, your database VMs, or arbitrary virtual machine workloads. It's not in the same category as VMware vSphere, SUSE Harvester, or OpenShift Virtualization. If your primary goal is escaping the Broadcom licensing trap for legacy VM workloads, kMetal isn't the answer. You still need a traditional hypervisor for those, or you containerize them.

Not a Kubernetes distribution. There is no Clastix fork of Kubernetes. Kamaji runs upstream, CNCF-conformant Kubernetes. You pick the version. You get the full API. No vendor-specific extensions in the data path.

Not "bare metal without virtualization." This one matters because the name can mislead. The "metal" in kMetal means bare metal is your starting point. You buy servers, install kMetal, and the platform handles everything else. But tenant workloads run inside KVM-isolated virtual machines, not directly on hardware. The hypervisor is there. It's just embedded in the platform and invisible to operators. We didn't remove virtualization. We removed it as a separate product you have to buy, install, and staff a team to operate.


Where this goes

We're focused on bare metal because that's where the cost problem is worst and where the architecture delivers the most value. Cloud providers, telcos, enterprises, and AI infrastructure teams running multi-tenant Kubernetes on physical servers: these are the organizations where eliminating the control plane tax and the hypervisor burden changes the economics fundamentally.

The Kamaji project continues to grow in the open. kMetal is the platform we're building for the organizations that need more than an engine. They need the full operational layer around it.

If you run Kubernetes on bare metal and you're paying the three costs described above (control plane overhead, a separate hypervisor stack, fragmented fleet operations), we'd like to show you what the architecture looks like without them. Reach us at clastix.io.

---

Adriano Pezzuto is the founder of Clastix. He has spent more than 10 years working on Kubernetes and more than 25 years in the infrastructure space.