Loble is a production-grade L4 TCP load balancer written in Go, built on architecture patterns used in infrastructure software such as Envoy, HAProxy, and nginx.
┌─────────────────────────────────────────────────────────────────┐
│ Loble │
│ ┌─────────────┐ ┌─────────────┐ ┌─────────────────────┐ │
│ │ Listener │──▶│ Data Plane │───▶│ Backend Selection │ │
│ │ (TCP:9000) │ │ (lock-free)│ │ (RR / Random) │ │
│ └─────────────┘ └──────┬──────┘ └─────────────────────┘ │
│ │ │
│ ┌──────▼──────┐ │
│ │ Snapshot │◀── atomic.Pointer │
│ │ (immutable)│ │
│ └──────▲──────┘ │
│ │ │
│ ┌─────────────────────────┴─────────────────────────────────┐ │
│ │ Control Plane │ │
│ │ ┌──────────────┐ ┌──────────────┐ ┌────────────────┐ │ │
│ │ │Health Tracker│ │ Snapshot │ │ State Machine │ │ │
│ │ │ (passive) │ │ Builder │ │ (H→D→U / U→H) │ │ │
│ │ └──────────────┘ └──────────────┘ └────────────────┘ │ │
│ └───────────────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────────┘
- Lock-free hot path — Data plane reads snapshots atomically, no mutexes on the request path
- Graceful shutdown — Drains existing connections before exit
- Backend health tracking — Passive health checks with state machine (Healthy → Draining → Unhealthy)
- Connection lifecycle tracking — Properly counts in-flight connections per backend
- Multiple balancing algorithms — Round Robin and Random
- Configurable via TOML — Simple configuration format
- Prometheus metrics — Built-in metrics endpoint
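The two balancing algorithms listed above can be sketched as follows. This is a minimal illustration rather than the project's actual code: the `Backend`, `RoundRobin`, and `PickRandom` names are assumptions made for the example. The round-robin counter is an `atomic.Uint64`, so selection stays lock-free like the rest of the hot path.

```go
package main

import (
	"fmt"
	"math/rand"
	"sync/atomic"
)

// Backend is a hypothetical, minimal stand-in for the project's backend type.
type Backend struct {
	Address string
}

// RoundRobin hands out backends in order. The counter is atomic, so
// concurrent callers never take a lock.
type RoundRobin struct {
	next atomic.Uint64
}

// Pick returns the next backend in rotation, or false if the list is empty.
func (r *RoundRobin) Pick(backends []Backend) (Backend, bool) {
	if len(backends) == 0 {
		return Backend{}, false
	}
	n := r.next.Add(1) - 1
	return backends[n%uint64(len(backends))], true
}

// PickRandom returns a uniformly random backend, or false if the list is empty.
func PickRandom(backends []Backend) (Backend, bool) {
	if len(backends) == 0 {
		return Backend{}, false
	}
	return backends[rand.Intn(len(backends))], true
}

func main() {
	backends := []Backend{{Address: "10.0.0.1:8080"}, {Address: "10.0.0.2:8080"}}
	rr := &RoundRobin{}
	for i := 0; i < 4; i++ {
		b, _ := rr.Pick(backends)
		fmt.Println(b.Address) // alternates between the two addresses
	}
}
```

Because the backend list comes from an immutable snapshot, its length can change between calls. The modulo keeps the index in range either way.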
Boot
└─▶ Load config (TOML)
└─▶ Init logging & metrics
└─▶ Start control plane (health tracking + snapshot store)
└─▶ Start data plane + TCP listener
└─▶ Handle traffic
On SIGTERM/SIGINT:
└─▶ Stop accepting new connections
└─▶ Drain existing connections (with timeout)
└─▶ Exit cleanly
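The shutdown sequence above could look roughly like the sketch below. It assumes a hypothetical `ConnTracker` helper built on `sync.WaitGroup`, a hard-coded listen address, and a 10-second drain timeout, none of which come from the project itself.

```go
package main

import (
	"context"
	"log"
	"net"
	"os/signal"
	"sync"
	"syscall"
	"time"
)

// ConnTracker counts in-flight connections (hypothetical helper type).
type ConnTracker struct{ wg sync.WaitGroup }

func (t *ConnTracker) Start()  { t.wg.Add(1) }
func (t *ConnTracker) Finish() { t.wg.Done() }

// Drain waits until every started connection has finished, or until the
// timeout expires. It reports whether draining completed in time.
func (t *ConnTracker) Drain(timeout time.Duration) bool {
	done := make(chan struct{})
	go func() {
		t.wg.Wait()
		close(done)
	}()
	select {
	case <-done:
		return true
	case <-time.After(timeout):
		return false
	}
}

func main() {
	// ctx is cancelled when SIGTERM or SIGINT arrives.
	ctx, stop := signal.NotifyContext(context.Background(), syscall.SIGTERM, syscall.SIGINT)
	defer stop()

	ln, err := net.Listen("tcp", "127.0.0.1:9000")
	if err != nil {
		log.Fatal(err)
	}

	var tracker ConnTracker
	acceptDone := make(chan struct{})
	go func() {
		defer close(acceptDone)
		for {
			conn, err := ln.Accept()
			if err != nil {
				return // listener closed: no new connections
			}
			tracker.Start()
			go func() {
				defer tracker.Finish()
				defer conn.Close()
				// Proxying to a backend would happen here.
			}()
		}
	}()

	<-ctx.Done() // wait for a shutdown signal
	ln.Close()   // stop accepting new connections
	<-acceptDone // ensure no Start() call races with Drain()
	if !tracker.Drain(10 * time.Second) {
		log.Println("drain timed out, exiting with connections still open")
	}
}
```

Waiting for the accept loop to exit before draining avoids calling `WaitGroup.Add` concurrently with `Wait`, which the `sync` package documents as misuse.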
For each incoming TCP connection:
Accept connection
│
▼
Read immutable snapshot (lock-free)
│
▼
Select backend (Round Robin / Random)
│
▼
Track connection: ConnStarted()
│
▼
Bidirectional proxy with deadlines
│
▼
Track connection: ConnFinished()
│
▼
Report success/failure to health tracker
Properties:
- Lock-free hot path via atomic.Pointer
- No goroutine leaks (WaitGroup tracking)
- No file descriptor leaks (deferred Close)
- Bounded timeouts on all I/O
- Correct half-close behavior
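To illustrate the half-close and deadline properties, here is a minimal sketch of a bidirectional TCP relay. The function names (`copyHalf`, `Proxy`, `demo`), the 2-second idle timeout, and the loopback echo server are assumptions for the example, not the project's implementation.

```go
package main

import (
	"fmt"
	"io"
	"net"
	"sync"
	"time"
)

// copyHalf relays src to dst, refreshing an idle deadline before every read
// and write. When src reaches EOF (or any error occurs) it half-closes dst,
// so the peer sees EOF while the opposite direction can keep flowing.
func copyHalf(dst, src *net.TCPConn, idle time.Duration) {
	defer dst.CloseWrite()
	buf := make([]byte, 32*1024)
	for {
		src.SetReadDeadline(time.Now().Add(idle))
		n, rerr := src.Read(buf)
		if n > 0 {
			dst.SetWriteDeadline(time.Now().Add(idle))
			if _, werr := dst.Write(buf[:n]); werr != nil {
				return
			}
		}
		if rerr != nil {
			return
		}
	}
}

// Proxy relays both directions and closes both sockets once each is done.
func Proxy(client, backend *net.TCPConn, idle time.Duration) {
	defer client.Close()
	defer backend.Close()
	var wg sync.WaitGroup
	wg.Add(2)
	go func() { defer wg.Done(); copyHalf(backend, client, idle) }()
	go func() { defer wg.Done(); copyHalf(client, backend, idle) }()
	wg.Wait()
}

// demo sends "ping" through the proxy to a local echo server and returns
// the reply. Both listeners use ephemeral loopback ports.
func demo() (string, error) {
	echoLn, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return "", err
	}
	defer echoLn.Close()
	go func() {
		if c, err := echoLn.Accept(); err == nil {
			io.Copy(c, c) // echo until the proxy half-closes
			c.Close()
		}
	}()

	proxyLn, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return "", err
	}
	defer proxyLn.Close()
	go func() {
		c, err := proxyLn.Accept()
		if err != nil {
			return
		}
		b, err := net.Dial("tcp", echoLn.Addr().String())
		if err != nil {
			c.Close()
			return
		}
		Proxy(c.(*net.TCPConn), b.(*net.TCPConn), 2*time.Second)
	}()

	conn, err := net.Dial("tcp", proxyLn.Addr().String())
	if err != nil {
		return "", err
	}
	defer conn.Close()
	if _, err := conn.Write([]byte("ping")); err != nil {
		return "", err
	}
	conn.(*net.TCPConn).CloseWrite() // client is done sending
	reply, err := io.ReadAll(conn)
	return string(reply), err
}

func main() {
	reply, err := demo()
	fmt.Println(reply, err)
}
```

The client half-closes after sending, yet still receives the echoed reply. That only works because the relay propagates the EOF with `CloseWrite` instead of closing the whole socket.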
Backend Health States:
┌─────────────────────────────────────────────┐
│ │
▼ │
┌───────┐ failures >= threshold ┌──────────┐ │
│Healthy│─────────────────────────▶│ Draining │ │
└───────┘ (has active conns) └────┬─────┘ │
▲ │ │
│ │ conns │
│ successes >= threshold │ == 0 │
│ ▼ │
│ ┌───────────┐ │
└────────────────────────────│ Unhealthy │───┘
└───────────┘
Signals that drive state transitions:
- Passive failures (from real traffic errors)
- Passive successes (successfully proxied connections)
- Connection lifecycle events (start/finish)
Guarantees:
- No flapping — Requires consecutive failures/successes to transition
- No churn — Snapshot rebuilds only on actual state changes
- Safe draining — Backends removed only when active connections reach zero
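The state machine and its guarantees can be expressed compactly. The sketch below is one possible reading of the diagram, with hypothetical type and method names. It assumes a backend with no active connections moves straight through Draining to Unhealthy, and it does not address how success signals reach a backend that is no longer receiving traffic.

```go
package main

import "fmt"

type State int

const (
	Healthy State = iota
	Draining
	Unhealthy
)

func (s State) String() string {
	return [...]string{"Healthy", "Draining", "Unhealthy"}[s]
}

// Thresholds mirrors failure_threshold and success_threshold from the config.
type Thresholds struct{ Failure, Success int }

// BackendHealth is owned by the control plane; no locking is shown here.
type BackendHealth struct {
	State       State
	Failures    int // consecutive failures
	Successes   int // consecutive successes
	ActiveConns int
}

// step applies the transition rules and reports whether the state changed,
// which is the only time a snapshot rebuild would be needed.
func (h *BackendHealth) step(th Thresholds) bool {
	prev := h.State
	switch {
	case h.State == Healthy && h.Failures >= th.Failure:
		h.State = Draining
	case h.State == Unhealthy && h.Successes >= th.Success:
		h.State = Healthy
	}
	if h.State == Draining && h.ActiveConns == 0 {
		h.State = Unhealthy // safe to remove: nothing in flight
		h.Successes = 0
	}
	return h.State != prev
}

func (h *BackendHealth) ReportFailure(th Thresholds) bool {
	h.Failures++
	h.Successes = 0 // a failure breaks any success streak
	return h.step(th)
}

func (h *BackendHealth) ReportSuccess(th Thresholds) bool {
	h.Successes++
	h.Failures = 0 // a success breaks any failure streak
	return h.step(th)
}

func (h *BackendHealth) ConnStarted() { h.ActiveConns++ }

func (h *BackendHealth) ConnFinished(th Thresholds) bool {
	h.ActiveConns--
	return h.step(th)
}

func main() {
	th := Thresholds{Failure: 3, Success: 2}
	h := &BackendHealth{}
	h.ConnStarted()
	for i := 0; i < 3; i++ {
		h.ReportFailure(th)
	}
	fmt.Println(h.State) // Draining: one connection is still active
	h.ConnFinished(th)
	fmt.Println(h.State) // Unhealthy: drained
	h.ReportSuccess(th)
	h.ReportSuccess(th)
	fmt.Println(h.State) // Healthy again
}
```

Resetting the opposite counter on every report is what makes the thresholds count consecutive events, and the boolean return value lets the caller rebuild a snapshot only on a real state change.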
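Immutable snapshots are the key architectural pattern: essentially Read-Copy-Update (RCU) applied to load balancing. The control plane owns the mutable health state and publishes immutable snapshots, which the data plane reads through an atomic pointer: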
// Health state is MUTABLE (control plane only)
type HealthTracker struct {
backends map[string]*BackendHealth // mutated by control plane
}
// Snapshots are IMMUTABLE (shared safely)
type Snapshot struct {
Version uint64
Backends []Backend // never modified after creation
}
// Data plane reads lock-free via atomic pointer
type SnapshotStore struct {
current atomic.Pointer[backend.Snapshot]
}
Flow:
- Control plane detects health state change
- New snapshot is built with only healthy backends
- Atomic pointer swap publishes new snapshot
- Data plane reads the new snapshot on the next connection
- Old snapshot is garbage collected when no longer referenced
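The publish-and-read flow above can be sketched with `atomic.Pointer` from `sync/atomic`. The `Publish` and `Load` methods and the `Healthy` field are assumptions for this example. It also assumes a single control-plane goroutine calls `Publish`, so version numbering needs no extra synchronization.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// Backend carries a hypothetical Healthy flag so the example can filter.
type Backend struct {
	Address string
	Healthy bool
}

// Snapshot is never modified after it has been published.
type Snapshot struct {
	Version  uint64
	Backends []Backend
}

type SnapshotStore struct {
	current atomic.Pointer[Snapshot]
}

// Load returns the current snapshot without locking. It is nil until the
// first Publish.
func (s *SnapshotStore) Load() *Snapshot { return s.current.Load() }

// Publish copies the healthy backends into a fresh snapshot and swaps it in.
// Assumed to be called from one control-plane goroutine only.
func (s *SnapshotStore) Publish(all []Backend) *Snapshot {
	version := uint64(1)
	if old := s.current.Load(); old != nil {
		version = old.Version + 1
	}
	healthy := make([]Backend, 0, len(all))
	for _, b := range all {
		if b.Healthy {
			healthy = append(healthy, b)
		}
	}
	next := &Snapshot{Version: version, Backends: healthy}
	s.current.Store(next)
	return next
}

func main() {
	store := &SnapshotStore{}
	backends := []Backend{{"10.0.0.1:8080", true}, {"10.0.0.2:8080", true}}
	store.Publish(backends)
	inFlight := store.Load() // a reader still holds version 1

	backends[1].Healthy = false // control plane detects a failure
	store.Publish(backends)

	fmt.Println(inFlight.Version, len(inFlight.Backends))         // 1 2
	fmt.Println(store.Load().Version, len(store.Load().Backends)) // 2 1
}
```

A reader that loaded version 1 keeps a consistent view until it finishes. Once nothing references the old snapshot, the garbage collector reclaims it, which takes the place of the grace period in kernel RCU.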
Two independent draining systems:
| Type | Trigger | Behavior |
|---|---|---|
| Backend Draining | Health failures | Stop sending new traffic, wait for existing connections, remove when safe |
| Process Draining | SIGTERM/SIGINT | Stop accepting new connections, wait for in-flight to complete, hard timeout guard |
- Go 1.24+
git clone https://github.com/yourusername/loble.git
cd loble
go build -o loble ./cmd/lb
Create a config.toml:
[listener]
address = "0.0.0.0:9000"
[balancer]
algorithm = "round_robin" # or "random"
[health]
failure_threshold = 3 # consecutive failures to mark unhealthy
success_threshold = 2 # consecutive successes to recover
[metrics]
enabled = true
address = "127.0.0.1:9090"
[logging]
level = "info"
[[backends]]
address = "10.0.0.1:8080"
weight = 10
[[backends]]
address = "10.0.0.2:8080"
weight = 5
./loble -config config.toml
When enabled, Prometheus metrics are available at http://127.0.0.1:9090/metrics.
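As an illustration of how the configuration file could map onto Go types, the sketch below mirrors the `listener`, `balancer`, `health`, and `backends` sections and validates them using only the standard library. The struct names, the `toml` tags, and the `Validate` rules are assumptions. Decoding the file is left to whichever TOML library the project uses and is not shown.

```go
package main

import (
	"errors"
	"fmt"
	"net"
)

type ListenerConfig struct {
	Address string `toml:"address"`
}

type BalancerConfig struct {
	Algorithm string `toml:"algorithm"`
}

type HealthConfig struct {
	FailureThreshold int `toml:"failure_threshold"`
	SuccessThreshold int `toml:"success_threshold"`
}

type BackendConfig struct {
	Address string `toml:"address"`
	Weight  int    `toml:"weight"`
}

// Config mirrors a subset of config.toml (metrics and logging omitted).
type Config struct {
	Listener ListenerConfig  `toml:"listener"`
	Balancer BalancerConfig  `toml:"balancer"`
	Health   HealthConfig    `toml:"health"`
	Backends []BackendConfig `toml:"backends"`
}

// Validate rejects configurations the balancer could not start with.
func (c *Config) Validate() error {
	if _, _, err := net.SplitHostPort(c.Listener.Address); err != nil {
		return fmt.Errorf("listener.address: %w", err)
	}
	switch c.Balancer.Algorithm {
	case "round_robin", "random":
	default:
		return fmt.Errorf("balancer.algorithm: unknown value %q", c.Balancer.Algorithm)
	}
	if c.Health.FailureThreshold < 1 || c.Health.SuccessThreshold < 1 {
		return errors.New("health: thresholds must be at least 1")
	}
	if len(c.Backends) == 0 {
		return errors.New("backends: at least one backend is required")
	}
	for i, b := range c.Backends {
		if _, _, err := net.SplitHostPort(b.Address); err != nil {
			return fmt.Errorf("backends[%d].address: %w", i, err)
		}
	}
	return nil
}

func main() {
	cfg := Config{
		Listener: ListenerConfig{Address: "0.0.0.0:9000"},
		Balancer: BalancerConfig{Algorithm: "round_robin"},
		Health:   HealthConfig{FailureThreshold: 3, SuccessThreshold: 2},
		Backends: []BackendConfig{
			{Address: "10.0.0.1:8080", Weight: 10},
			{Address: "10.0.0.2:8080", Weight: 5},
		},
	}
	fmt.Println(cfg.Validate())
}
```

Validating at boot, before the listener starts, keeps malformed addresses and unknown algorithm names from surfacing later as runtime errors on the connection path.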
loble/
├── cmd/lb/
│ └── main.go # Entry point, lifecycle management
├── internal/
│ ├── backend/
│ │ ├── backend.go # Backend struct
│ │ └── snapshot.go # Immutable snapshot struct
│ ├── balancer/
│ │ ├── balancer.go # Balancer interface
│ │ ├── round_robin.go # Round-robin implementation
│ │ └── random.go # Random selection implementation
│ ├── config/
│ │ └── config.go # TOML config loading
│ ├── controlplane/
│ │ ├── controlplane.go # Control plane orchestration
│ │ ├── health.go # Health state machine
│ │ ├── snapshot_builder.go
│ │ └── snapshot_store.go # Atomic snapshot storage
│ ├── dataplane/
│ │ ├── dataplane.go # Backend selection logic
│ │ ├── listener.go # TCP accept loop & proxy
│ │ └── health.go # Health reporter interface
│ ├── logging/
│ │ └── logging.go # Structured logging
│ └── metrics/
│ └── metrics.go # Prometheus metrics
└── config.toml
This design scales to:
- Thousands of connections — Lock-free reads, minimal contention
- Dozens of backends — Efficient health tracking and snapshot rebuilds
- Config reloads — Atomic snapshot swaps (foundation is there)
- Zero-downtime deploys — Graceful shutdown with connection draining
Future extensions this architecture supports:
- TLS termination
- L7 routing (HTTP path-based)
- Active health checks
- Weighted load balancing
- Circuit breaking
Architectural patterns inspired by:
- Envoy Proxy — Thread-local snapshots, xDS
- HAProxy — Health checking, connection draining
- Linux RCU — Read-Copy-Update pattern
Licensed under the MIT License.