Last 12 weeks · 25 commits
2 of 6 standards met
What
Adds an opt-in (Caddyfile: ). When enabled, a peer's currently-open proxied connections are force-closed as soon as an active health check marks it unhealthy, instead of being left to run until they close on their own. Closing the upstream side of each tracked connection makes the existing loops in return, so the session tears down and the client reconnects, and is then routed to a healthy peer. Default is , so existing behavior is unchanged.

Per-peer connections are tracked (a small under a mutex) only when the option is enabled, so there's no overhead otherwise. corresponds to (dialPeers dials one connection per peer, in order), which is how connections are associated with peers for tracking/untracking.

Why
Pairs naturally with active health checks for failover scenarios. For example, with an HTTP health check against a PostgreSQL/Patroni endpoint, when the primary is demoted its health check flips to unhealthy; this option then drops the stale sessions immediately rather than leaving clients pinned to a now read-only/blocked backend.

Scope / open questions
Triggered on the active health-check unhealthy transition. I deliberately left the passive path out of this PR to keep it focused; happy to extend if you'd prefer it also fire there. Open to naming/placement feedback (field on vs /).

Tests
Peer track/untrack/close mechanics, active-check closes conns when enabled, leaves them when disabled, and Caddyfile parsing (happy + arg error). passes; / / clean.
What
Extends the active health checker to also run on dynamically-discovered upstreams (DNS SRV/A from #429), not just statically-configured ones. Each interval it polls the current dynamic set and health-checks every discovered peer.

Why
Today active health checks run only on static upstreams (same as the HTTP ). That means a cluster discovered via DNS can't be health-gated. With this change, a discovered cluster becomes usable with health-based routing and no external coordinator. For example: discover the cluster members via /, then gate routing on an HTTP health check (e.g. Patroni's , which returns 200 only on the leader) so the proxy follows the primary without etcd/Consul/a sidecar controller.

Notes
Discovery for health-checking uses a bare (there is no connection), so dynamic sources used with active health checks must not rely on connection-scoped placeholders. Discovery is otherwise shared with the request path (same cache, same pooled peers), so marks are visible to selection.

This intentionally goes beyond , which does not actively health-check dynamic upstreams. Happy to discuss whether you'd prefer it gated behind an option.

Stacked on #429 (dynamic upstreams); the diff includes it until that merges, after which I'll rebase.

Tests
A dynamic source returns a dead address; after a health-check pass the discovered peer is marked unhealthy. passes; / / clean.
What
Adds dynamic upstreams to the layer4 proxy so the backend set can be discovered at runtime instead of being restated in config, with two DNS sources:
- resolves SRV records (//).
- resolves A/AAAA records for a name, using a configured (fits clusters where every member shares a port, e.g. a Postgres cluster on 5432 behind one name).

Caddyfile: . Results are cached per name and refreshed ( / / ). When dynamic upstreams are configured the static list may be empty. Discovered peers come from the shared peer pool, so passive health checks and connection counts persist across refreshes.

takes the connection's rather than the connection itself, keeping discovery decoupled from a live connection (and pollable by other callers).

Why
So the L4 config doesn't have to hard-code endpoints DNS already publishes: the common service-discovery case (Consul DNS, Kubernetes headless services, etc.).

Scope / limitations
Active health checks still run only on statically-configured upstreams, same as the HTTP 's dynamic upstreams. Passive health + connection counting apply to discovered upstreams. Mirrors 's / design for consistency.

Tests
SRV and A discovery (record → upstream), caching (one lookup for repeated calls), lookup-error handling, SRV , and Caddyfile parsing for both sources (happy + missing/unknown source + bad option). DNS is stubbed via injectable lookups, so no network is needed. passes; / / clean.
What
Adds Prometheus metrics to the layer4 proxy, which previously exposed none:
- counter of proxied connections
- gauge of in-flight proxied connections
- gauge (1/0) reflecting active health-check state

Why
Observability is the one big thing the L4 proxy lacked versus more mature TCP load balancers: no way to see connection counts or upstream health from metrics. These three cover the basics (load + health) per upstream.

Notes (re: de-duplication)
This deliberately does not copy the HTTP server's metrics code. It uses the same mechanism ( + ), so collectors live on the per-instance registry, reset cleanly across reloads, and surface on the existing admin endpoint. Updates are nil-safe (a never-provisioned handler doesn't panic). Labels are kept to a single to bound cardinality; happy to adjust names/labels to your conventions.

Tests
Counter/gauge increments and the open/close lifecycle, the health gauge, nil-safety, and that an active health check sets the health gauge to 0 (dead peer) and 1 (live listener). passes; / / clean. go.mod change is just promoting from indirect to direct.
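The nil-safety property mentioned above can be illustrated without any metrics library. The sketch below uses plain atomic integers as stand-ins for a counter and a gauge; `proxyMetrics` and its methods are hypothetical names, and a real implementation would register Prometheus collectors instead.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// proxyMetrics is a hypothetical stand-in for a set of collectors.
// Every method tolerates a nil receiver, so a handler that was never
// provisioned (and therefore has no metrics) can still call them.
type proxyMetrics struct {
	connsTotal  atomic.Int64 // counter: connections proxied so far
	connsActive atomic.Int64 // gauge: connections currently in flight
}

// connOpened bumps the counter and the in-flight gauge.
func (m *proxyMetrics) connOpened() {
	if m == nil {
		return
	}
	m.connsTotal.Add(1)
	m.connsActive.Add(1)
}

// connClosed lowers the in-flight gauge; the counter never decreases.
func (m *proxyMetrics) connClosed() {
	if m == nil {
		return
	}
	m.connsActive.Add(-1)
}

// snapshot returns the current counter and gauge values.
func (m *proxyMetrics) snapshot() (total, active int64) {
	if m == nil {
		return 0, 0
	}
	return m.connsTotal.Load(), m.connsActive.Load()
}

func main() {
	var unprovisioned *proxyMetrics
	unprovisioned.connOpened() // no panic: nil receiver is handled

	m := &proxyMetrics{}
	m.connOpened()
	m.connOpened()
	m.connClosed()
	total, active := m.snapshot()
	fmt.Println("total:", total, "active:", active)
}
```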
What
Adds weighted load balancing to the layer4 proxy:
- (Caddyfile: inside an block). A value is treated as .
- selection policy: smooth weighted round-robin (the same algorithm nginx uses), skipping unavailable peers.

Existing selection policies and configurations are unaffected. Example:

Why
Backends rarely have identical capacity. Weighted round-robin lets a bigger node take proportionally more traffic. It is one of HAProxy's most-used features, and something layer4's existing policies (random / round_robin / least_conn / first / ip_hash) couldn't express.

Tests
Exact proportional distribution over a full WRR period, default-to-1 behavior, skipping unavailable upstreams, all-down returns nil, and Caddyfile parsing (policy + , happy + error paths). passes; / / clean.
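For readers unfamiliar with smooth weighted round-robin, the selection step can be sketched as below, following the commonly described scheme: every available peer's running score grows by its weight, the highest score wins, and the winner's score is reduced by the total weight. The `peer` type and `selectSWRR` function are hypothetical, and treating any weight below 1 as 1 is an assumption made here to match the "default-to-1" behavior the tests mention.

```go
package main

import "fmt"

// peer is a hypothetical stand-in for a layer4 upstream.
type peer struct {
	name      string
	weight    int  // configured weight; values below 1 count as 1 (assumption)
	current   int  // running score maintained by the algorithm
	available bool // unavailable peers are skipped entirely
}

// effectiveWeight applies the default-to-1 rule.
func effectiveWeight(p *peer) int {
	if p.weight < 1 {
		return 1
	}
	return p.weight
}

// selectSWRR picks the next peer using smooth weighted round-robin.
// It returns nil when no peer is available.
func selectSWRR(peers []*peer) *peer {
	var best *peer
	total := 0
	for _, p := range peers {
		if !p.available {
			continue
		}
		w := effectiveWeight(p)
		p.current += w
		total += w
		if best == nil || p.current > best.current {
			best = p
		}
	}
	if best == nil {
		return nil
	}
	best.current -= total
	return best
}

func main() {
	peers := []*peer{
		{name: "a", weight: 5, available: true},
		{name: "b", weight: 1, available: true},
		{name: "c", weight: 1, available: true},
	}
	// One full period is the sum of the weights (7 picks here).
	for i := 0; i < 7; i++ {
		fmt.Print(selectSWRR(peers).name, " ")
	}
	fmt.Println()
}
```

Compared with naive weighted round-robin, which would send five consecutive connections to the heavy peer, the smooth variant interleaves the lighter peers across the period while keeping the same overall proportions.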
Repository: mholt/caddy-l4. Description: Layer 4 (TCP/UDP) app for Caddy. Stars: 1700, Forks: 111. Primary language: Go. Languages: Go (100%). License: Apache-2.0. Latest release: v0.1.1 (1mo ago). Open PRs: 15, open issues: 23. Last activity: 13h ago. Community health: 42%. Top contributors: vnxme, dependabot[bot], mholt, WeidiDeng, ydylla, IceCodeNew, mohammed90, RussellLuo, kkroo, francislavoie and others.