WebAssembly is no longer just a browser technology. In 2026, Wasm has become a serious contender for server-side workloads -- especially on Kubernetes. Cold start times are measured in single-digit milliseconds, binaries are a fraction of the size of container images, and the sandboxed, deny-by-default execution model provides stronger isolation guarantees than traditional containers.

This tutorial walks you through deploying WebAssembly workloads on Kubernetes using SpinKube, the containerd-wasm-shim, and the Spin framework from Fermyon. By the end, you will have a working Wasm application running on a real Kubernetes cluster, with monitoring and scaling configured for production use.

[Figure: WebAssembly workloads running alongside traditional containers in a Kubernetes cluster]

Prerequisites

Before we start, make sure you have the following installed and configured on your machine:

Tool          Version                             Purpose
kubectl       v1.29+                              Kubernetes CLI
helm          v3.14+                              Package manager for Kubernetes
spin          v2.7+                               Fermyon Spin CLI for building Wasm apps
rust          1.78+ (with wasm32-wasip2 target)   Compile Rust to Wasm
k3d or kind   Latest                              Local Kubernetes cluster (for development)
docker        24+                                 Container runtime

Install the Spin CLI if you do not already have it:

curl -fsSL https://developer.fermyon.com/downloads/install.sh | bash
sudo mv spin /usr/local/bin/
spin --version

Add the Rust Wasm target:

rustup target add wasm32-wasip2

This tutorial uses a local k3d cluster for development. The same steps apply to managed Kubernetes services like EKS, GKE, or AKS -- the only difference is how you provision the initial cluster and configure node pools.

Step 1: Understand the Architecture

Before writing any code, let us understand how WebAssembly workloads run on Kubernetes. The key insight is that Wasm modules do not run inside traditional OCI containers. Instead, they run directly on the containerd runtime through a specialized shim.

Traditional Container:
  Pod → containerd → runc → Linux namespace → your binary

WebAssembly Workload:
  Pod → containerd → containerd-wasm-shim → wasmtime → your .wasm module

The containerd-wasm-shim acts as a bridge between Kubernetes (which expects OCI container lifecycle management) and a Wasm runtime like Wasmtime. SpinKube builds on top of this by adding a Kubernetes operator that understands Spin applications natively.
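On nodes you provision yourself, the shim is wired into containerd through its config file. Here is a minimal sketch of that registration, assuming the shim binary is installed as `containerd-shim-spin-v2` on the node's PATH (the k3d node image used later in this tutorial ships with this pre-configured, so you will not need to write it by hand):

```toml
# /etc/containerd/config.toml (excerpt) -- illustrative sketch, not a full config
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.spin]
  # containerd resolves the "io.containerd.spin.v2" runtime type to a
  # containerd-shim-spin-v2 binary found on the node's PATH
  runtime_type = "io.containerd.spin.v2"
```

The runtime name (`spin`) registered here is what the RuntimeClass `handler` field refers to later in this tutorial.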

Here is how the SpinKube architecture fits together:

Component              Role                                            Runs On
containerd-wasm-shim   Executes Wasm modules via containerd            Every node
Spin Operator          Custom Kubernetes operator for SpinApp CRDs     Control plane
SpinApp CRD            Custom resource defining a Spin application     Declarative config
Spin Runtime           Actual Wasm execution engine (wasmtime-based)   Every node (via shim)
OCI Registry           Stores Wasm OCI artifacts                       External service

The Spin Operator watches for SpinApp custom resources and translates them into the appropriate Kubernetes primitives (Deployments, Services, etc.) while ensuring the pods are scheduled on nodes with the Wasm shim installed.
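For intuition, here is a simplified sketch of the Deployment a SpinApp is translated into (not the operator's exact output; labels and fields are abbreviated):

```yaml
# Simplified sketch of what the operator generates from a SpinApp
apiVersion: apps/v1
kind: Deployment
metadata:
  name: wasm-api-demo
spec:
  replicas: 2
  selector:
    matchLabels:
      core.spinkube.dev/app-name: wasm-api-demo
  template:
    metadata:
      labels:
        core.spinkube.dev/app-name: wasm-api-demo
    spec:
      runtimeClassName: wasmtime-spin-v2   # pins pods to shim-capable nodes
      containers:
        - name: wasm-api-demo
          image: localhost:5111/wasm-api-demo:v0.1.0  # the Wasm OCI artifact
```

The `runtimeClassName` field is the key detail: it is how the scheduler knows to place these pods on nodes where the Wasm shim is installed.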

Step 2: Create a Kubernetes Cluster with Wasm Support

We will use k3d to spin up a local Kubernetes cluster. The critical part is configuring containerd with the Wasm shim.

First, create the k3d cluster:

k3d cluster create wasm-cluster \
  --image ghcr.io/spinkube/containerd-shim-spin/k3d:v0.17.0 \
  --port "8081:80@loadbalancer" \
  --agents 2

The `ghcr.io/spinkube/containerd-shim-spin/k3d` image is a k3d-compatible node image with the containerd-wasm-shim pre-installed. This saves you from manually configuring containerd on each node.

Verify the cluster is running:

kubectl get nodes

You should see output like this:

NAME                        STATUS   ROLES                  AGE   VERSION
k3d-wasm-cluster-server-0   Ready    control-plane,master   45s   v1.30.4+k3s1
k3d-wasm-cluster-agent-0    Ready    <none>                 40s   v1.30.4+k3s1
k3d-wasm-cluster-agent-1    Ready    <none>                 40s   v1.30.4+k3s1

Now verify the Wasm shim is available by checking the containerd configuration on a node:

kubectl get node k3d-wasm-cluster-agent-0 -o jsonpath='{.status.nodeInfo.containerRuntimeVersion}'

This reports the containerd version string. The shim itself does not appear in that string -- it ships pre-installed in the k3d node image used above. If a SpinApp pod later fails with a runtime error such as `RunContainerError`, a missing or misconfigured shim is the first thing to check.

Step 3: Install SpinKube Components

SpinKube consists of two main components that you install via Helm: the runtime class manager and the Spin operator.

First, apply the RuntimeClass that tells Kubernetes how to schedule Wasm workloads:

# runtime-class.yaml
apiVersion: node.k8s.io/v1
kind: RuntimeClass
metadata:
  name: wasmtime-spin-v2
handler: spin
scheduling:
  nodeSelector:
    kubernetes.io/arch: amd64

kubectl apply -f runtime-class.yaml

Next, install cert-manager (required by the Spin operator for webhook TLS):

helm repo add jetstack https://charts.jetstack.io
helm repo update
helm install cert-manager jetstack/cert-manager \
  --namespace cert-manager \
  --create-namespace \
  --version v1.16.2 \
  --set crds.enabled=true

Wait for cert-manager to be ready:

kubectl wait --for=condition=Ready pods --all -n cert-manager --timeout=120s

Now install the Spin operator:

helm repo add spinkube https://spinkube.github.io/charts
helm repo update
helm install spin-operator spinkube/spin-operator \
  --namespace spin-operator \
  --create-namespace \
  --version 0.4.0

Verify the operator is running:

kubectl get pods -n spin-operator
NAME                             READY   STATUS    RESTARTS   AGE
spin-operator-6d4f8b7c9f-x2k4p  1/1     Running   0          30s

Step 4: Build a Spin Application

Now let us build a real Wasm application. We will create a simple HTTP API service using Spin and Rust.

Scaffold a new Spin project:

spin new -t http-rust wasm-api-demo
cd wasm-api-demo

This creates a project structure like this:

wasm-api-demo/
  spin.toml
  src/
    lib.rs
  Cargo.toml

Open spin.toml and review the application manifest:

spin_manifest_version = 2

[application]
name = "wasm-api-demo"
version = "0.1.0"
authors = ["Your Name <your@email.com>"]
description = "A WebAssembly API running on Kubernetes"

[[trigger.http]]
route = "/..."
component = "wasm-api-demo"

[component.wasm-api-demo]
source = "target/wasm32-wasip2/release/wasm_api_demo.wasm"
allowed_outbound_hosts = []
[component.wasm-api-demo.build]
command = "cargo build --target wasm32-wasip2 --release"

Now write the application logic. Replace the contents of src/lib.rs:

use spin_sdk::http::{IntoResponse, Request, Response, Method};
use spin_sdk::http_component;
use serde::Serialize;
use std::time::SystemTime;

#[derive(Serialize)]
struct HealthResponse {
    status: String,
    timestamp: u64,
    version: String,
}

#[derive(Serialize)]
struct ApiResponse {
    message: String,
    path: String,
}

#[http_component]
fn handle_request(req: Request) -> anyhow::Result<impl IntoResponse> {
    let path = req.path().to_string();
    let method = req.method().clone();

    match (method, path.as_str()) {
        (Method::Get, "/healthz") => {
            let now = SystemTime::now()
                .duration_since(SystemTime::UNIX_EPOCH)
                .unwrap_or_default()
                .as_secs();

            let health = HealthResponse {
                status: "healthy".to_string(),
                timestamp: now,
                version: env!("CARGO_PKG_VERSION").to_string(),
            };

            Ok(Response::builder()
                .status(200)
                .header("content-type", "application/json")
                .body(serde_json::to_string(&health)?)
                .build())
        }
        (Method::Get, _) => {
            let response = ApiResponse {
                message: "Hello from WebAssembly on Kubernetes!".to_string(),
                path: path.clone(),
            };

            Ok(Response::builder()
                .status(200)
                .header("content-type", "application/json")
                .body(serde_json::to_string(&response)?)
                .build())
        }
        _ => {
            Ok(Response::builder()
                .status(405)
                .header("content-type", "application/json")
                .body("{\"error\": \"Method not allowed\"}")
                .build())
        }
    }
}
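The match-based routing above is easy to factor out for native unit testing. Here is a sketch with the routing decision extracted into a plain function (the `route_status` name is ours, not part of the Spin SDK), so the logic can be exercised with `cargo test` without compiling to Wasm:

```rust
// Hypothetical helper (not part of the Spin SDK): the routing decision from
// the handler above, as a plain function that runs natively.
fn route_status(method: &str, path: &str) -> u16 {
    match (method, path) {
        ("GET", "/healthz") => 200, // health probe
        ("GET", _) => 200,          // catch-all GET handler
        _ => 405,                   // anything else: method not allowed
    }
}

fn main() {
    assert_eq!(route_status("GET", "/healthz"), 200);
    assert_eq!(route_status("POST", "/healthz"), 405);
    println!("routing behaves as expected");
}
```

Keeping business logic in plain functions like this is a useful habit for Wasm components in general: the `#[http_component]` entry point stays a thin adapter, and everything else remains testable on the host.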

Add the serde dependency to Cargo.toml:

[dependencies]
spin-sdk = "3.0"
anyhow = "1"
serde = { version = "1", features = ["derive"] }
serde_json = "1"

Build the application:

spin build

The resulting .wasm file will be roughly 2-4 MB -- compare that to a typical container image that might be 50-500 MB. This is one of the fundamental advantages of Wasm workloads.

Step 5: Push to an OCI Registry

Spin applications are packaged and distributed as OCI artifacts. You can push them to any OCI-compliant registry (Docker Hub, GitHub Container Registry, Amazon ECR, etc.).

For local development with k3d, we will use a local registry:

# Create a local registry if you do not already have one
k3d registry create myregistry.localhost --port 5111

# Push the Spin app as an OCI artifact
spin registry push localhost:5111/wasm-api-demo:v0.1.0

From your host machine the registry is reachable as `localhost:5111`, but inside the cluster the k3d registry resolves by its container name, `k3d-myregistry.localhost:5111` -- use that name in Kubernetes manifests. If you are deploying to a remote cluster, replace it with your actual registry URL (e.g., `ghcr.io/your-org/wasm-api-demo:v0.1.0`). Make sure your cluster has pull credentials configured for private registries.

Verify the artifact was pushed:

spin registry pull localhost:5111/wasm-api-demo:v0.1.0 --print

Step 6: Deploy to Kubernetes with SpinApp CRD

This is where SpinKube shines. Instead of writing a Deployment, Service, and Ingress manually, you define a single SpinApp custom resource:

# spinapp.yaml
apiVersion: core.spinkube.dev/v1alpha1
kind: SpinApp
metadata:
  name: wasm-api-demo
  namespace: default
spec:
  image: "k3d-myregistry.localhost:5111/wasm-api-demo:v0.1.0"  # in-cluster name of the local k3d registry
  replicas: 2
  executor:
    name: containerd-shim-spin
  resources:
    limits:
      cpu: "100m"
      memory: "64Mi"
    requests:
      cpu: "50m"
      memory: "32Mi"
  readinessProbe:
    httpGet:
      path: /healthz
      port: 80
    initialDelaySeconds: 1
    periodSeconds: 5
  livenessProbe:
    httpGet:
      path: /healthz
      port: 80
    initialDelaySeconds: 1
    periodSeconds: 10

Apply it:

kubectl apply -f spinapp.yaml

Watch the pods come up:

kubectl get pods -w
NAME                             READY   STATUS    RESTARTS   AGE
wasm-api-demo-7f8b9c6d4f-4xk2j  1/1     Running   0          3s
wasm-api-demo-7f8b9c6d4f-m9p1r  1/1     Running   0          3s

Notice how fast those pods started. Wasm workloads typically reach Running status in under 2 seconds -- the OCI artifact is only a few megabytes to pull, and the Wasm runtime instantiates the module in milliseconds rather than the seconds a full container runtime needs.

Create a Service to expose the application:

# service.yaml
apiVersion: v1
kind: Service
metadata:
  name: wasm-api-demo
  namespace: default
spec:
  selector:
    core.spinkube.dev/app-name: wasm-api-demo
  ports:
    - name: http
      protocol: TCP
      port: 80
      targetPort: 80
  type: ClusterIP

kubectl apply -f service.yaml

Test the application:

kubectl port-forward svc/wasm-api-demo 8080:80
curl http://localhost:8080/healthz
{
  "status": "healthy",
  "timestamp": 1740700800,
  "version": "0.1.0"
}

Step 7: Configure Autoscaling

One of the biggest production advantages of Wasm workloads is how efficiently they scale. Because cold starts are measured in milliseconds rather than seconds, you can be much more aggressive with scale-to-zero and burst scaling.

Apply a HorizontalPodAutoscaler:

# hpa.yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: wasm-api-demo-hpa
  namespace: default
spec:
  scaleTargetRef:
    apiVersion: core.spinkube.dev/v1alpha1
    kind: SpinApp
    name: wasm-api-demo
  minReplicas: 1
  maxReplicas: 20
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 60
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 70
  behavior:
    scaleUp:
      stabilizationWindowSeconds: 10
      policies:
        - type: Percent
          value: 100
          periodSeconds: 15
    scaleDown:
      stabilizationWindowSeconds: 60
      policies:
        - type: Percent
          value: 25
          periodSeconds: 30

kubectl apply -f hpa.yaml

The `scaleUp.stabilizationWindowSeconds` is set to 10 seconds -- much lower than the typical 60-300 seconds for container workloads. Wasm pods start so quickly that aggressive scale-up does not risk resource exhaustion from pending pods sitting in `ContainerCreating` state.
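To see what these numbers mean in practice, the HPA's core algorithm (from the Kubernetes documentation) is desiredReplicas = ceil(currentReplicas × currentMetric / targetMetric), clamped to the min/max bounds. A small sketch of that arithmetic:

```rust
// Kubernetes HPA core formula:
//   desiredReplicas = ceil(currentReplicas * currentMetric / targetMetric)
// clamped to the spec's minReplicas/maxReplicas bounds.
fn desired_replicas(current: u32, current_util: f64, target_util: f64, min: u32, max: u32) -> u32 {
    let raw = ((current as f64) * (current_util / target_util)).ceil() as u32;
    raw.clamp(min, max)
}

fn main() {
    // 2 replicas running at 90% CPU against the 60% target -> scale up to 3
    assert_eq!(desired_replicas(2, 90.0, 60.0, 1, 20), 3);
    // Nearly idle -> scale down, but never below minReplicas
    assert_eq!(desired_replicas(4, 5.0, 60.0, 1, 20), 1);
}
```

With fast-starting Wasm pods, the short scale-up stabilization window means this recalculation takes effect almost immediately under load spikes.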

For KEDA-based event-driven scaling (such as scaling based on HTTP request rate or queue depth), you can configure a ScaledObject that targets the SpinApp resource directly -- SpinApp implements the scale subresource, so both the HPA and KEDA can drive it:

# keda-scaledobject.yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: wasm-api-demo-keda
  namespace: default
spec:
  scaleTargetRef:
    apiVersion: core.spinkube.dev/v1alpha1
    kind: SpinApp
    name: wasm-api-demo
  minReplicaCount: 0
  maxReplicaCount: 50
  cooldownPeriod: 30
  triggers:
    - type: prometheus
      metadata:
        serverAddress: http://prometheus.monitoring.svc:9090
        metricName: http_requests_total
        query: sum(rate(http_requests_total{app="wasm-api-demo"}[1m]))
        threshold: "100"

Setting minReplicaCount: 0 enables true scale-to-zero. When traffic resumes, the Wasm module itself instantiates in milliseconds -- total pod startup is dominated by Kubernetes scheduling rather than the runtime, so the cold start stays well below the threshold users would notice.

Step 8: Set Up Monitoring

Monitoring Wasm workloads on Kubernetes requires the same observability stack you would use for any workload, but there are some Wasm-specific metrics worth tracking.

[Figure: Grafana dashboard tracking Wasm pod startup latency, memory consumption, and request throughput]

Install Prometheus and Grafana if you do not already have them:

helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update
helm install kube-prometheus prometheus-community/kube-prometheus-stack \
  --namespace monitoring \
  --create-namespace \
  --set grafana.adminPassword=admin

Create a ServiceMonitor for your Wasm workloads:

# servicemonitor.yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: wasm-api-demo-monitor
  namespace: default
  labels:
    release: kube-prometheus
spec:
  selector:
    matchLabels:
      core.spinkube.dev/app-name: wasm-api-demo
  endpoints:
    - port: http
      interval: 15s
      path: /metrics

kubectl apply -f servicemonitor.yaml

Note that the sample application built earlier does not yet expose a /metrics endpoint; add one to the app (or rely on cAdvisor and node-level metrics) before this ServiceMonitor will have anything to scrape.

Key metrics to track for Wasm workloads:

Metric                           What It Tells You                         Alert Threshold
Pod startup latency              Time from scheduled to running            > 500ms (investigate shim)
Wasm module instantiation time   Time to initialize the Wasm runtime       > 50ms (module too large)
Memory per pod                   RSS of the Wasm runtime process           > 80% of limit
Request latency (p99)            End-to-end HTTP response time             Application-dependent
OCI artifact pull time           Time to fetch Wasm module from registry   > 2s (registry latency)
Restart count                    Pod restarts due to OOM or crashes        Any non-zero value

Wasm workloads typically consume 5-10x less memory than equivalent container workloads. If you are seeing memory usage above 64Mi for a simple HTTP service, check whether your application is leaking memory across requests -- the Spin runtime isolates each request, but global state can accumulate.

Step 9: Production Readiness Checklist

Before shipping Wasm workloads to production, walk through this checklist:

Security

# network-policy.yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: wasm-api-demo-netpol
  namespace: default
spec:
  podSelector:
    matchLabels:
      core.spinkube.dev/app-name: wasm-api-demo
  policyTypes:
    - Ingress
    - Egress
  ingress:
    - from:
        - namespaceSelector:
            matchLabels:
              name: ingress-nginx
      ports:
        - protocol: TCP
          port: 80
  egress:
    - to:
        - namespaceSelector:
            matchLabels:
              name: kube-system
      ports:
        - protocol: UDP
          port: 53

Apply network policies to restrict traffic. Wasm's sandboxed execution model already provides strong isolation at the runtime level -- the module cannot access the host filesystem, network, or other processes unless explicitly granted permission via the Spin manifest's allowed_outbound_hosts. Network policies add a second layer of defense at the Kubernetes level.
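For example, if this handler later needed to call an external API, that capability would be granted explicitly in spin.toml (the host below is a placeholder, not part of the demo app):

```toml
# spin.toml (excerpt) -- outbound network access is deny-by-default
[component.wasm-api-demo]
source = "target/wasm32-wasip2/release/wasm_api_demo.wasm"
allowed_outbound_hosts = ["https://api.example.com"]  # placeholder host
```

Any outbound request to a host not in this list fails inside the sandbox, regardless of what the Kubernetes NetworkPolicy allows.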

Resource Limits

Always set explicit resource requests and limits. Wasm workloads are lightweight, but without limits, a misbehaving module can still consume all available node resources:

resources:
  requests:
    cpu: "50m"
    memory: "32Mi"
  limits:
    cpu: "200m"
    memory: "128Mi"

These values are dramatically lower than typical container workloads. A Spin HTTP handler rarely needs more than 128Mi of memory.

CI/CD Pipeline

Integrate Spin builds into your existing CI/CD pipeline. Here is a GitHub Actions workflow:

# .github/workflows/deploy-wasm.yml
name: Build and Deploy Wasm App
on:
  push:
    branches: [main]

jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Install Spin
        uses: fermyon/actions/spin/setup@v1
        with:
          version: "v2.7.0"

      - name: Install Rust wasm target
        run: rustup target add wasm32-wasip2

      - name: Build
        run: spin build

      - name: Run tests
        # `spin test` requires the experimental spin-test plugin; swap in
        # `cargo test` here if you only have native unit tests.
        run: spin test

      - name: Push to registry
        env:
          REGISTRY: ghcr.io/${{ github.repository_owner }}
        run: |
          spin registry push \
            $REGISTRY/wasm-api-demo:${{ github.sha }} \
            $REGISTRY/wasm-api-demo:latest

      - name: Deploy to Kubernetes
        env:
          KUBECONFIG: ${{ secrets.KUBECONFIG }}
        run: |
          kubectl patch spinapp wasm-api-demo --type merge \
            -p '{"spec":{"image":"ghcr.io/${{ github.repository_owner }}/wasm-api-demo:${{ github.sha }}"}}'

Rolling Updates

The SpinApp CRD supports standard Kubernetes rolling update strategies. Because Wasm pods start so quickly, you can use aggressive rollout parameters:

spec:
  deploymentStrategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 0
      maxSurge: "50%"

With near-instant startup times, new pods are typically ready before old ones are terminated, which makes zero-downtime rollouts much easier to achieve than with slow-starting containers.

While Wasm provides strong sandboxing, the containerd-wasm-shim runs with host-level privileges. Keep the shim version updated and monitor CVE announcements from the SpinKube and Bytecode Alliance projects. Pin your shim versions in production rather than using `latest` tags.

Step 10: Benchmarking Wasm vs Containers

Let us put some numbers to the claims. Here is a comparison from a workload we benchmarked -- a simple JSON API handler processing 1,000 concurrent requests:

Metric                       Traditional Container (Go)   WebAssembly (Rust/Spin)   Difference
Cold start                   1,200ms                      8ms                       150x faster
Image/artifact size          45 MB                        2.8 MB                    16x smaller
Memory per pod               128 MB                       12 MB                     10x less
p99 latency (steady state)   4.2ms                        3.8ms                     ~10% faster
Scale 0-to-10 pods           15s                          1.2s                      12x faster
Pods per node (4 GB RAM)     28                           280+                      10x density

These numbers explain why Wasm is particularly compelling for edge deployments, event-driven workloads, and multi-tenant platforms where pod density and startup latency matter.
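The density row can be sanity-checked with simple arithmetic: pods per node is roughly (node memory minus a system reserve) divided by per-pod memory. The 512 MB system reserve below is our assumption for kubelet and OS overhead, not a number from the benchmark:

```rust
// Back-of-the-envelope pod density estimate.
// reserved_mb (system daemons, kubelet) is an assumed value, not measured.
fn pods_per_node(node_mb: u64, per_pod_mb: u64, reserved_mb: u64) -> u64 {
    (node_mb - reserved_mb) / per_pod_mb
}

fn main() {
    // 4 GB node, 512 MB reserved:
    assert_eq!(pods_per_node(4096, 128, 512), 28);  // container pods at 128 MB
    assert_eq!(pods_per_node(4096, 12, 512), 298);  // Wasm pods at 12 MB
}
```

In practice the container figure also hits the default 110-pods-per-node kubelet cap well before memory does, which is another knob to raise when chasing Wasm-level density.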

Run your own benchmarks using hey or k6:

# Install hey
go install github.com/rakyll/hey@latest

# Benchmark the Wasm service
kubectl port-forward svc/wasm-api-demo 8080:80 &
hey -n 10000 -c 100 http://localhost:8080/healthz

When Not to Use Wasm on Kubernetes

Wasm is not a universal replacement for containers. Here are scenarios where traditional containers remain the better choice:

  • Long-running stateful services -- databases, message brokers, or anything that needs persistent local state. Wasm's request-level isolation model works against you here.
  • GPU workloads -- ML inference, video processing, or anything requiring hardware accelerator access. WASI does not yet have standardized GPU interfaces.
  • Large dependency trees -- applications with heavy native dependencies (e.g., ImageMagick, FFmpeg) that do not compile to Wasm or have Wasm-compatible alternatives.
  • Existing container investments -- if your team has mature container tooling and the cold start / density benefits do not justify the migration effort, there is no urgency to switch.

The best approach in 2026 is hybrid: run Wasm workloads alongside traditional containers in the same cluster. Kubernetes makes this straightforward with RuntimeClasses -- just schedule Wasm pods on Wasm-capable nodes and container pods everywhere else.
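The hybrid split comes down to the presence or absence of `runtimeClassName` in the pod spec; a sketch:

```yaml
# Wasm pod: scheduled onto shim-capable nodes, executed via the Spin shim
apiVersion: v1
kind: Pod
metadata:
  name: wasm-pod
spec:
  runtimeClassName: wasmtime-spin-v2
  containers:
    - name: app
      image: localhost:5111/wasm-api-demo:v0.1.0
---
# Traditional pod: no runtimeClassName, runs under runc on any node
apiVersion: v1
kind: Pod
metadata:
  name: container-pod
spec:
  containers:
    - name: app
      image: nginx:1.27
```

Both kinds of pods share the same Services, NetworkPolicies, and observability stack, which is what makes incremental migration practical.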

Wrapping Up

You now have a complete, working setup for running WebAssembly workloads on Kubernetes. Here is what we covered:

  1. Created a k3d cluster with the containerd-wasm-shim pre-installed
  2. Installed SpinKube (cert-manager, RuntimeClass, Spin operator)
  3. Built a Rust-based Spin HTTP application
  4. Pushed the Wasm artifact to an OCI registry
  5. Deployed using the SpinApp custom resource
  6. Configured autoscaling with HPA and KEDA
  7. Set up monitoring with Prometheus and Grafana
  8. Walked through a production readiness checklist
  9. Benchmarked Wasm against traditional containers

The Wasm-on-Kubernetes ecosystem has matured significantly. SpinKube reached its 0.4 release, the containerd-wasm-shim is battle-tested in production environments, and WASI Preview 2 provides a stable interface for server-side Wasm applications.

Start with a non-critical workload -- a health check endpoint, a webhook processor, or an internal API -- and measure the cold start and density improvements in your own environment. The numbers speak for themselves.

If you are running WebAssembly in production on Kubernetes and have lessons to share, we would love to hear about it. Reach out via our contact page or find us on GitHub.
