😴

kube-snorlax

Scale-to-zero with wake-on-request for Kubernetes. Sleep idle services, then wake them automatically the moment someone visits the URL.

~8 vCPUs
saved on an 80-service homelab cluster
~20 GB
RAM reclaimed
Just 1 pod
No CRDs, no sidecars, no webhooks

💸 The Problem

Most clusters run dozens of services 24/7, but only a few are actively used at any time. The rest burn resources doing nothing.

❌ Without kube-snorlax

  • 📦 grafana 200m / 512Mi 24/7
  • 📦 wiki-js 100m / 256Mi 24/7
  • 📦 pgadmin 100m / 256Mi 24/7
  • 📦 gitea 150m / 384Mi 24/7
  • 📦 code-server 250m / 512Mi 24/7
  • 📦 +25 more always running, rarely used

✅ With kube-snorlax

  • 📦 grafana has alerts, kept always on
  • 😴 wiki-js 0m / 0Mi sleeping
  • 😴 pgadmin 0m / 0Mi sleeping
  • 😴 gitea 0m / 0Mi sleeping
  • 😴 code-server 0m / 0Mi sleeping
  • 😴 +25 more sleeping, wake on visit, 0 resources

How It Works

Two parts: sleeping (scheduled scale-down) and waking (on-demand scale-up). The whole cycle is automatic.

(Animated demo: visiting wiki.example.com while wiki-js sleeps, scaled to 0 replicas by kube-downscaler and saving 100m CPU / 256Mi RAM, shows a waking page with a live elapsed timer, an ETA of ~22s based on history, and a 2-second readiness poll; once ready, it redirects to the app.)
😴

Service goes to sleep

kube-downscaler reads the downscaler/uptime annotation. Outside the configured window, it scales replicas to 0. Pod terminates, resources freed.
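A minimal sketch of the annotation, following kube-downscaler's documented syntax (the window and timezone shown are just an example):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: wiki-js
  annotations:
    # Awake Mon-Fri 08:00-18:00; scaled to 0 replicas outside this window
    downscaler/uptime: "Mon-Fri 08:00-18:00 Europe/Berlin"
```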

🌐

User visits the URL

Hours later, someone opens wiki.example.com in their browser. The request hits the ingress controller.

Ingress gets a 503

No endpoints exist (0 replicas). The custom-http-errors: "503" annotation tells NGINX to forward to kube-snorlax.
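On NGINX Ingress, that wiring is two annotations; a sketch, assuming the backend Service is named kube-snorlax (adjust to your release name):

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: wiki-js
  annotations:
    # Intercept 503s (no endpoints) instead of showing the default error page
    nginx.ingress.kubernetes.io/custom-http-errors: "503"
    # Send intercepted errors to the kube-snorlax Service in this namespace
    nginx.ingress.kubernetes.io/default-backend: kube-snorlax
```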

kube-snorlax wakes the service

Reads the X-Service-Name header, patches deployment replicas to 1 via the K8s API, and sets a downscaler/last-wakeup timestamp.
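A rough sketch of this step in Python (the project's language), assuming the official kubernetes client; the function and variable names here are illustrative, not kube-snorlax's actual internals:

```python
from datetime import datetime, timezone

def build_wake_patch(now=None):
    """Build a JSON merge patch that scales a Deployment to 1 replica
    and records the wake time in a downscaler/last-wakeup annotation."""
    now = now or datetime.now(timezone.utc)
    return {
        "spec": {"replicas": 1},
        "metadata": {
            "annotations": {
                "downscaler/last-wakeup": now.isoformat(timespec="seconds")
            }
        },
    }

# Applying it would look roughly like this (requires the `kubernetes`
# package and in-cluster credentials, so not executed here):
#
#   from kubernetes import client, config
#   config.load_incluster_config()
#   apps = client.AppsV1Api()
#   apps.patch_namespaced_deployment(
#       name=service_name,   # taken from the X-Service-Name header
#       namespace="default",
#       body=build_wake_patch(),
#   )
```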

User sees a waking page with ETA

A clean loading page with a live elapsed timer, estimated wake time based on that service's history, and a progress bar. No more wondering if something is broken — you see exactly how long it'll take.

Service is ready

Pod starts (~15–60s). Next refresh routes through ingress normally — user lands on the real app. Grace period prevents immediate re-sleep.

🧩 Why kube-snorlax

🪶

Lightweight

Single pod. ~25m CPU, ~64Mi RAM. Flask + gunicorn. No operator, no CRDs, no sidecars.

🔒

Minimal RBAC

Only needs get + patch on deployments. No cluster-admin, no secrets access.
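The Role this implies is tiny; a sketch (metadata names are illustrative, and a ClusterRole would be needed to wake services across namespaces):

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: kube-snorlax
rules:
  - apiGroups: ["apps"]
    resources: ["deployments"]
    verbs: ["get", "patch"]
```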

📦

Helm Chart

Install in one command. Published to GitHub Pages and to GHCR as an OCI artifact.

🔌

Works with anything

NGINX, Traefik, MetalLB, cloud LBs. EKS, GKE, AKS, k3s, kind, minikube.

🤝

GitOps friendly

Works with ArgoCD and FluxCD; an ignoreDifferences pattern keeps replica changes from showing up as sync drift.

⏰

Flexible scheduling

Per-service uptime windows with timezone support. Grace period prevents re-sleep after wake.
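For ArgoCD, the ignoreDifferences pattern looks like this excerpt from an Application spec (FluxCD has an equivalent mechanism):

```yaml
# Ignore replica-count drift so sleep/wake cycles don't show as OutOfSync
spec:
  ignoreDifferences:
    - group: apps
      kind: Deployment
      jsonPointers:
        - /spec/replicas
```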

🚀 Quick Install

Three ways to get started. Pick whichever suits your setup.

# Option 1: Helm repo
helm repo add kube-snorlax https://vineethvijay.github.io/kube-snorlax
helm repo update
helm install kube-snorlax kube-snorlax/kube-snorlax

# Option 2: OCI artifact (no repo add needed, pulls directly from GHCR)
helm install kube-snorlax oci://ghcr.io/vineethvijay/charts/kube-snorlax --version 1.0.0

# Option 3: From source
git clone https://github.com/vineethvijay/kube-snorlax.git
helm install kube-snorlax ./kube-snorlax/helm/kube-snorlax

🔗 Works With

🌐

NGINX Ingress

Native support via custom-http-errors

🔀

Traefik

Error pages middleware

⚖️

MetalLB

L2 and BGP modes

☁️

Cloud LBs

AWS ALB/NLB, GCP, Azure

📉

kube-downscaler

Schedule-based scale-to-zero

📊

KEDA

Cron trigger for scaling

🔄

ArgoCD / FluxCD

GitOps with ignoreDifferences

☸️

Any Cluster

EKS, GKE, AKS, k3s, kind