HMAC-Protected Circuit Breaker

Security-Aware Rust Crate — Fail-Open on Tamper, HMAC-SHA256 State Integrity

Project Information

Crate: hmac-circuit-breaker
Version: 0.3.0
Language: Rust (edition 2021, MSRV 1.75)
License: MIT
Repository: GitHub
crates.io: hmac-circuit-breaker
docs.rs: docs.rs/hmac-circuit-breaker
Used at: PQ Crypta (per-algorithm circuit protection)

Why This Crate Exists

Most circuit breaker crates keep state in memory and reset on restart. Some persist state to disk — but none of them ask the question:

What happens if someone writes a plausible-looking state file with every circuit “tripped”?

This crate is the answer. It adds HMAC-SHA256 integrity to on-disk state and makes a deliberate, security-first choice on failure: it fails open (clears all circuits) rather than failing closed (blocking all traffic). That single decision prevents an attacker from weaponising the circuit breaker as a denial-of-service amplifier.

Scope & Design For

- Security-sensitive services where the state file is on shared or world-writable storage
- Systems with a separate health-check process that writes state (cron, daemon, sidecar)
- Axum-based APIs needing per-service circuit enforcement as a tower::Layer
- Environments where self-DoS via state-file manipulation is a credible threat
- Systems that need circuit state to survive application restarts

Skip This Crate When

- Your circuit breaker needs in-process, in-memory detection only and you have no shared state file (use failsafe instead)
- The producer and consumer are the same process sharing memory directly

Key Features

Core Capabilities

Capability | Description
In-process failure detection | Middleware counts consecutive 5xx responses and trips the circuit immediately, without waiting for the next health-check cycle
Automatic half-open probing | After a configurable cooldown, one probe request is allowed through; success closes the circuit, failure restarts the cooldown
HMAC-protected persistence | Circuit state is written to disk with an embedded HMAC-SHA256 tag; every reload verifies the tag before trusting any state
Fail-open on tamper | A bad HMAC clears all in-memory state rather than tripping every circuit; an attacker with write access can at most temporarily remove protection, not weaponise it

Implementation Details

Detail | Description
Atomic file writes | State is written to a .tmp sibling then renamed into place; readers never see a partial write
Constant-time MAC comparison | HMAC tags are compared using the audited subtle crate (ConstantTimeEq); no early exit, no timing oracle
Per-service granularity | Each named service has independent circuit state; one tripped service does not affect others
Axum / Tower middleware | Drop-in circuit_breaker_layer wraps any axum Router with zero boilerplate
Bypass header with secret | Configurable header lets the health-check cron re-probe tripped services; the secret prevents bypass via header-name disclosure
Graceful shutdown | spawn_reload() returns a JoinHandle so the background task can be aborted on shutdown
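
The write-then-rename pattern in the table can be sketched with the standard library alone. This is an illustration under assumptions, not the crate's writer: `tmp_sibling` and `write_atomically` are hypothetical names, and the exact temporary file name the crate uses may differ.

```rust
use std::fs::{self, File};
use std::io::{self, Write};
use std::path::{Path, PathBuf};

/// Hypothetical helper: builds the temporary sibling path in the same
/// directory as the target, e.g. `cb.json` -> `cb.json.tmp`.
fn tmp_sibling(path: &Path) -> PathBuf {
    let mut name = path.as_os_str().to_owned();
    name.push(".tmp");
    PathBuf::from(name)
}

/// Writes `contents` to the temporary sibling, syncs it, then renames it over
/// `path`. On POSIX filesystems the rename replaces the target in one step,
/// so a concurrent reader sees the old file or the new one, never a mix.
fn write_atomically(path: &Path, contents: &[u8]) -> io::Result<()> {
    let tmp = tmp_sibling(path);
    {
        let mut file = File::create(&tmp)?;
        file.write_all(contents)?;
        file.sync_all()?; // flush the bytes to disk before the rename
    }
    // Returns an error if `tmp` and `path` are on different filesystems (EXDEV).
    fs::rename(&tmp, path)
}
```

Keeping the temporary file in the same directory as the target is what makes the rename a same-filesystem operation in the common case.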

Hello World — Complete Loop in ~10 Lines

Rust
use hmac_circuit_breaker::{
    CircuitBreakerConfig, CircuitBreakerHandle,
    writer::{write_state, ServiceObservation},
    state::CircuitBreakerFile,
};
use std::path::Path;

// -- Producer (health-check cron) --
let path = Path::new("/run/app/cb.json");
let prev = std::fs::read_to_string(path).ok()
    .and_then(|s| serde_json::from_str::<CircuitBreakerFile>(&s).ok())
    .map(|f| f.algorithms).unwrap_or_default();

write_state(path, &[
    ServiceObservation { name: "db".into(), passed: false, error: Some("timeout".into()) },
], &prev, 3, "my-secret")?;

// -- Consumer (API server) --
let handle = CircuitBreakerHandle::new(
    CircuitBreakerConfig::builder().state_file(path.into()).secret("my-secret").build()
);
handle.load().await;
let _reload = handle.spawn_reload(); // background reload every 60 s (returns JoinHandle)

if handle.is_tripped("db").await { /* reject request with 503 */ }

Installation

Cargo.toml
[dependencies]
hmac-circuit-breaker = "0.3"

# With axum middleware:
hmac-circuit-breaker = { version = "0.3", features = ["axum"] }

Dual-Layer Architecture

Circuit state is tracked in two complementary layers that operate independently. Either layer alone can block a request with 503. The external producer handles planned downtime; the in-process detector catches transient failures between health-check cycles.

 ┌─────────────────────────────────────────────────────────┐
 │  Health-check process (producer)                        │
 │                                                         │
 │  1. Probe each service                                  │
 │  2. Load previous state from disk (accumulate failures) │
 │  3. Sort algorithms map, compact-serialise, HMAC-SHA256 │
 │  4. Write circuit_breaker.json atomically (tmp + rename)│
 └────────────────────────┬────────────────────────────────┘
                          │  on-disk JSON
 ┌────────────────────────┴────────────────────────────────┐
 │  API server (consumer)                                  │
 │                                                         │
 │  Layer 1: File-based state (SharedState)                │
 │    Background task reloads file every 60 s:             │
 │    • Verify HMAC; on mismatch: clear state (fail-open)  │
 │    • Update Arc<RwLock<HashMap>> in-memory state        │
 │                                                         │
 │  Layer 2: In-process runtime state (RuntimeState)       │
 │    Middleware tracks 5xx responses per service:         │
 │    • threshold consecutive 5xx → trip immediately       │
 │    • No waiting for the next health-check cycle         │
 │    • Half-open probing auto-recovers after cooldown     │
 │                                                         │
 │  Per-request middleware (both layers checked):          │
 │    • Extract service name from URL path                 │
 │    • File state Tripped  → 503 immediately              │
 │    • Runtime state Tripped → 503 immediately            │
 │    • bypass header + secret → pass through (health cron)│
 └─────────────────────────────────────────────────────────┘
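
The per-request check at the bottom of the diagram can be sketched as a small pure function. `RequestView` and `decide` are hypothetical names used only for this illustration, and forwarding requests whose path maps to no service is an assumption rather than documented behaviour.

```rust
/// Outcome of the per-request check.
#[derive(Debug, PartialEq, Eq)]
enum Decision {
    Forward,
    Reject503,
}

/// Hypothetical view of what the middleware knows about one request.
struct RequestView {
    /// Service name extracted from the URL path, if any.
    service: Option<String>,
    /// Layer 1: the file-based state says the service is tripped.
    file_tripped: bool,
    /// Layer 2: the in-process runtime state says the service is tripped.
    runtime_tripped: bool,
    /// The bypass header (and secret, if configured) was accepted.
    bypass_granted: bool,
}

fn decide(view: &RequestView) -> Decision {
    // Assumption: paths that map to no service are not circuit-protected.
    if view.service.is_none() || view.bypass_granted {
        return Decision::Forward;
    }
    // Either layer alone is enough to reject the request.
    if view.file_tripped || view.runtime_tripped {
        Decision::Reject503
    } else {
        Decision::Forward
    }
}
```

Because the two layers are combined with a logical OR, a service stays blocked until both the health-check file and the runtime detector consider it healthy.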

State Machines

File-Based State (External Health-Check Producer)

                   fail                   fail×threshold
  ┌──────────┐ ───────────▶ ┌──────────┐ ────────────────▶ ┌─────────┐
  │ Closed   │              │  Open    │                   │ Tripped │
  │ (normal) │              │          │                   │  (503)  │
  └──────────┘              └──────────┘                   └────┬────┘
        ↑                                                       │
        └─────────────── pass written (any state) ──────────────┘

Closed: normal operation; all requests pass through
Open: failures below threshold; requests still pass
Tripped: consecutive failures ≥ threshold; requests get 503

In-Process Runtime State (Axum Middleware)

               fail×threshold              cooldown expires
  ┌──────────┐ ──────────────▶ ┌─────────┐ ────────────────▶ ┌──────────┐
  │ Closed   │                 │ Tripped │                   │ HalfOpen │
  │ (normal) │ ◀────────────── │  (503)  │ ◀──────────────── │ (1 probe)│
  └──────────┘     probe ok    └─────────┘    probe fails    └──────────┘

Closed: no in-process failures; requests pass through
Tripped: threshold consecutive 5xx responses; requests rejected
HalfOpen: one probe request allowed through after cooldown

Probe cancelled mid-flight? The probe slot is automatically freed after another half_open_timeout so the next request can claim it.
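
The in-process state machine can be sketched as follows. This is a simplified model under assumptions: `RuntimeCircuit` and its methods are hypothetical names, success_threshold is fixed at 1, and the freeing of a cancelled probe slot is left out for brevity. Time is passed in explicitly so the transitions are easy to follow.

```rust
use std::time::{Duration, Instant};

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum RuntimeStatus {
    Closed,
    Tripped,
    HalfOpen,
}

/// Hypothetical per-service runtime circuit.
struct RuntimeCircuit {
    status: RuntimeStatus,
    consecutive_failures: u32,
    tripped_at: Option<Instant>,
    threshold: u32,
    half_open_timeout: Duration,
}

impl RuntimeCircuit {
    fn new(threshold: u32, half_open_timeout: Duration) -> Self {
        Self {
            status: RuntimeStatus::Closed,
            consecutive_failures: 0,
            tripped_at: None,
            threshold,
            half_open_timeout,
        }
    }

    /// Returns true if a request arriving at `now` may pass.
    fn allow(&mut self, now: Instant) -> bool {
        match self.status {
            RuntimeStatus::Closed => true,
            RuntimeStatus::Tripped => {
                let cooled = self
                    .tripped_at
                    .is_some_and(|t| now.duration_since(t) >= self.half_open_timeout);
                if cooled {
                    // Cooldown expired: this request becomes the single probe.
                    self.status = RuntimeStatus::HalfOpen;
                }
                cooled
            }
            // The probe slot is taken; other requests wait for its result.
            RuntimeStatus::HalfOpen => false,
        }
    }

    /// Records the response of a request that was allowed through.
    fn record(&mut self, is_5xx: bool, now: Instant) {
        if !is_5xx {
            // One good response closes the circuit and resets the count.
            self.status = RuntimeStatus::Closed;
            self.consecutive_failures = 0;
            self.tripped_at = None;
            return;
        }
        self.consecutive_failures += 1;
        let probe_failed = self.status == RuntimeStatus::HalfOpen;
        if probe_failed || self.consecutive_failures >= self.threshold {
            // A failed probe restarts the cooldown from `now`.
            self.status = RuntimeStatus::Tripped;
            self.tripped_at = Some(now);
        }
    }
}
```

With a threshold of 3 and a 30-second timeout, three 5xx responses in a row trip the circuit, the first request 30 seconds later is let through as a probe, and its outcome decides between Closed and another cooldown.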

The Bypass Header & Deadlock Prevention

The health-check cron needs to re-probe tripped services to confirm recovery. Without a bypass, tripped circuits create a deadlock:

circuit tripped → health check blocked → circuit never resets → deadlock forever

The bypass header (default: x-health-check-bypass) lets the cron through. In production, always configure bypass_header_secret so that knowing the header name alone is insufficient to bypass circuit protection. The secret is compared in constant time using subtle::ConstantTimeEq.
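
A minimal sketch of the bypass decision, assuming two modes: presence-only when no secret is configured, and an exact secret match otherwise. `ct_eq` and `bypass_granted` are hypothetical names. The hand-written comparison only illustrates the idea of not exiting early; the crate itself is documented as using subtle::ConstantTimeEq, which is the better choice in real code because a plain loop carries no guarantee against compiler optimisations.

```rust
/// Compares two byte strings without stopping at the first difference.
/// Illustrative only; prefer `subtle::ConstantTimeEq` in production.
fn ct_eq(a: &[u8], b: &[u8]) -> bool {
    if a.len() != b.len() {
        return false; // the length is not treated as secret here
    }
    let mut diff = 0u8;
    for (x, y) in a.iter().zip(b) {
        diff |= x ^ y; // accumulate differences instead of returning early
    }
    diff == 0
}

/// `header_value`: value of the bypass header on the request, if present.
/// `required_secret`: the configured bypass_header_secret, if any.
fn bypass_granted(header_value: Option<&str>, required_secret: Option<&str>) -> bool {
    match (header_value, required_secret) {
        (None, _) => false,
        (Some(_), None) => true, // presence-only mode
        (Some(value), Some(secret)) => ct_eq(value.as_bytes(), secret.as_bytes()),
    }
}
```

In presence-only mode anyone who learns the header name can bypass a tripped circuit, which is why the text above recommends always configuring the secret in production.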

Module Structure

Module | Purpose
config | Builder-pattern configuration for all circuit breaker parameters; emits tracing::warn! if the default HMAC secret is in use
state | State types: CircuitStatus (Closed/Open/Tripped), AlgorithmCircuitState, CircuitBreakerFile on-disk format, RuntimeServiceState (in-process, HalfOpen)
integrity | HMAC-SHA256 helpers: compute_hmac() and verify_file_hmac(); constant-time comparison via subtle::ConstantTimeEq
writer | Atomic state-file writer: takes service observations, accumulates consecutive failure counts, signs the algorithms block with HMAC, writes to .tmp then renames
loader | Reads and HMAC-verifies the state file; on mismatch clears all in-memory state (fail-open) and emits tracing::warn!
middleware | Axum / Tower middleware: circuit_breaker_layer() wraps a Router; tracks per-service in-process 5xx counts, trips/half-opens the runtime circuit, checks bypass header

1 — Health-Check Producer (Writes State File)

The health-check process observes each service and writes the signed state file. It loads previous state to accumulate consecutive failure counts correctly across runs.

Rust — Producer
use hmac_circuit_breaker::{
    state::CircuitBreakerFile,
    writer::{write_state, ServiceObservation},
};
use std::collections::BTreeMap;
use std::path::Path;

fn run_health_checks() -> Result<(), Box<dyn std::error::Error>> {
    let path = Path::new("/var/run/myapp/circuit_breaker.json");
    let secret = std::env::var("HMAC_SECRET").expect("HMAC_SECRET must be set");

    // Load previous state to accumulate consecutive failure counts.
    // Safe on first run: yields an empty map if the file doesn't exist yet.
    let previous: BTreeMap<_, _> = std::fs::read_to_string(path)
        .ok()
        .and_then(|s| serde_json::from_str::<CircuitBreakerFile>(&s).ok())
        .map(|f| f.algorithms)
        .unwrap_or_default();

    let observations = vec![
        ServiceObservation { name: "payments".into(), passed: true,  error: None },
        ServiceObservation { name: "auth".into(),     passed: false,
                             error: Some("connection refused".into()) },
    ];

    write_state(path, &observations, &previous, 3, &secret)?;
    Ok(())
}
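
How consecutive failures accumulate across runs can be pictured with a small pure function. This is a sketch of the transition rules from the file-based state machine, not the crate's write_state; `FileStatus` and `accumulate` are hypothetical names.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum FileStatus {
    Closed,
    Open,
    Tripped,
}

/// Returns the new (status, consecutive_failures) for one service, given the
/// failure count from the previous state file and the latest observation.
fn accumulate(previous_failures: u32, passed: bool, threshold: u32) -> (FileStatus, u32) {
    if passed {
        // A pass from any state resets the circuit.
        return (FileStatus::Closed, 0);
    }
    let failures = previous_failures.saturating_add(1);
    let status = if failures >= threshold {
        FileStatus::Tripped
    } else {
        FileStatus::Open
    };
    (status, failures)
}
```

With a threshold of 3, the first two failed runs leave the service Open (requests still pass) and the third marks it Tripped, which is why the producer has to read the previous file before writing the next one.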

2 — API Server Consumer (Reads State, Checks Circuits)

Rust — Consumer
use hmac_circuit_breaker::{CircuitBreakerConfig, CircuitBreakerHandle};
use std::path::PathBuf;
use std::time::Duration;

#[tokio::main]
async fn main() {
    let config = CircuitBreakerConfig::builder()
        .state_file(PathBuf::from("/var/run/myapp/circuit_breaker.json"))
        .secret(std::env::var("HMAC_SECRET").expect("HMAC_SECRET must be set"))
        .threshold(3)
        .reload_interval(Duration::from_secs(60))
        .build();

    let handle = CircuitBreakerHandle::new(config);
    handle.load().await;    // initial load at startup
    let _reload = handle.spawn_reload();  // background refresh; returns JoinHandle

    if handle.is_tripped("auth").await { /* return 503 */ }

    if let Some(state) = handle.get("payments").await {
        println!("{}: {} failures", state.status, state.consecutive_failures);
    }

    // Full snapshot of all tracked services (for health/status endpoints)
    let all = handle.snapshot().await;
    for (name, state) in &all {
        println!("{name}: {}", state.status);
    }
}

3 — Axum Middleware (features = ["axum"])

Rust — Axum
use axum::{Router, routing::post};
use hmac_circuit_breaker::{CircuitBreakerConfig, CircuitBreakerHandle};
use hmac_circuit_breaker::middleware::circuit_breaker_layer;

async fn my_handler() -> &'static str { "ok" }

#[tokio::main]
async fn main() {
    let config = CircuitBreakerConfig::builder()
        .state_file("/var/run/myapp/circuit_breaker.json".into())
        .secret(std::env::var("HMAC_SECRET").expect("HMAC_SECRET must be set"))
        // Require a secret value on the bypass header (production best practice)
        .bypass_header_secret(Some(
            std::env::var("BYPASS_SECRET").expect("BYPASS_SECRET must be set")
        ))
        .build();

    let handle = CircuitBreakerHandle::new(config.clone());
    handle.load().await;
    let _reload = handle.spawn_reload();

    // Map "/encrypt/{service}" -> service name for circuit lookup
    let extractor = |path: &str| -> Option<String> {
        let parts: Vec<&str> = path.trim_start_matches('/').splitn(3, '/').collect();
        if parts.first() == Some(&"encrypt") {
            parts.get(1).map(|s| s.to_string())
        } else {
            None
        }
    };

    let app = Router::new()
        .route("/encrypt/:service", post(my_handler))
        .layer(circuit_breaker_layer(
            handle.shared_state(),
            handle.runtime_state(),
            config,
            extractor,
        ));

    let listener = tokio::net::TcpListener::bind("0.0.0.0:3000").await.unwrap();
    axum::serve(listener, app).await.unwrap();
}

4 — Graceful Shutdown with JoinHandle

spawn_reload() returns a tokio::task::JoinHandle<()> so the background reload task can be cleanly stopped during shutdown. Drop the handle to ignore it — the task continues until the runtime shuts down.

Rust — Graceful Shutdown
let handle = CircuitBreakerHandle::new(config);
let reload_task = handle.spawn_reload();

// ... run your application ...

// On shutdown signal:
reload_task.abort();
let _ = reload_task.await; // JoinError::is_cancelled() is expected

Running Examples

Shell
# Complete producer + consumer round-trip
cargo run --example basic

# axum middleware demo
cargo run --example with_axum --features axum

Running Tests

Shell
cargo test
cargo test --features axum

Configuration Reference

Only state_file and secret need to be set in production. All other fields have safe defaults.

Field | Default | Description
state_file | "circuit_breaker.json" | Path to the on-disk JSON state file written by the health-check producer
secret | "circuit-breaker-integrity" | HMAC signing secret. Override in production. A tracing::warn! is emitted if the default is still in use.
threshold | 3 | Consecutive failures before a circuit trips (applies to both file-based and in-process detection)
reload_interval | 60s | How often the background task reloads the state file from disk and re-verifies its HMAC
bypass_header | "x-health-check-bypass" | Header that bypasses tripped circuits for the health-check cron. Set to None to disable bypass entirely.
bypass_header_secret | None | Required header value for bypass (constant-time compared). None = presence-only. Strongly recommended in production.
half_open_timeout | 30s | Cooldown before a half-open probe is allowed through after an in-process circuit trips
success_threshold | 1 | Consecutive successful probes in half-open state required to close the in-process circuit
strict_hmac | false | When true, reject state files without integrity_hash (unsigned/legacy files). Enable once all producers write signed files.

Minimal Production Config

Rust
// Minimal production config
let config = CircuitBreakerConfig::builder()
    .state_file("/var/run/myapp/circuit_breaker.json".into())
    .secret(std::env::var("HMAC_SECRET").expect("HMAC_SECRET must be set"))
    .bypass_header_secret(Some(
        std::env::var("BYPASS_SECRET").expect("BYPASS_SECRET must be set")
    ))
    .strict_hmac(true)  // reject unsigned files once all producers are upgraded
    .build();

// Disable bypass entirely (e.g. you handle recovery out-of-band)
let config = CircuitBreakerConfig::builder()
    .bypass_header(None::<&str>)
    .build();

Cargo Features

Feature | Default | Description
reload | yes | CircuitBreakerHandle::spawn_reload() background reload task; requires tokio. Returns JoinHandle<()> for graceful shutdown.
axum | no | circuit_breaker_layer() axum Tower middleware. Implies reload.

The Problem with Persistent Circuit Breakers

A circuit breaker that only lives in memory resets on every restart — useful for transient faults but blind to persistent failures that survive reboots. Persisting circuit state to disk solves that, but introduces a new attack surface:

If an adversary can write to the state file, they can trip every circuit — a denial-of-service without ever touching the services themselves.

Why Fail-Open on HMAC Mismatch?

Response to tampered file | What the attacker achieves
Fail-closed (block all traffic) | Full self-DoS. Attacker writes a plausible-but-MAC-invalid file; every circuit trips immediately.
Fail-open (clear all circuits) | Temporary removal of protection for one reload cycle (~60 s). Worst case is baseline behaviour without a circuit breaker.

Fail-open means an attacker can at most remove protection for ~60 seconds, and the tamper attempt is logged as a WARN. They cannot weaponise the circuit breaker.
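
The fail-open rule can be sketched as a reload step that either replaces the in-memory map or clears it. The types and function names here (`Verification`, `apply_reload`, and this standalone `is_tripped`) are hypothetical and only model the behaviour described above, including the strict_hmac switch for unsigned files.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Status {
    Closed,
    Open,
    Tripped,
}

/// Hypothetical result of checking the file's integrity_hash.
enum Verification {
    Valid,
    Missing,  // legacy file without integrity_hash
    Mismatch, // tag present but wrong: treated as tampering
}

/// Applies a freshly read file to the in-memory state.
/// Returns true if the file was trusted.
fn apply_reload(
    state: &mut HashMap<String, Status>,
    loaded: HashMap<String, Status>,
    check: Verification,
    strict_hmac: bool,
) -> bool {
    let trusted = match check {
        Verification::Valid => true,
        Verification::Missing => !strict_hmac, // accepted (with a warning) unless strict
        Verification::Mismatch => false,
    };
    if trusted {
        *state = loaded;
    } else {
        state.clear(); // fail open: no circuit stays tripped from an untrusted file
    }
    trusted
}

/// Names that are not in the map are never blocked.
fn is_tripped(state: &HashMap<String, Status>, name: &str) -> bool {
    state.get(name) == Some(&Status::Tripped)
}
```

A forged file that marks every service as tripped therefore leaves the map empty, and every lookup falls back to "not tripped" until the next trusted reload.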

Design Guarantees

These properties are explicitly enforced by the implementation:

Guarantee | How it is implemented
HMAC is deterministic across languages | Both writer and verifier round-trip through serde_json::Value (BTreeMap-backed), producing alphabetically sorted keys at every nesting level, not just the outer map
State file writes are atomic | Writer outputs to {path}.json.tmp (same directory) then calls rename(2); readers never observe a partial write; cross-filesystem rename (EXDEV) returns WriteError::AtomicRename
HMAC comparison is constant-time | Uses subtle::ConstantTimeEq from the audited RustCrypto subtle crate; no early exit, no timing side-channel
Tampered file fails open, not closed | HMAC mismatch clears all in-memory state; no circuit is left tripped from a forged file
Unknown services pass by default | is_tripped() returns false for any name not in the state file; new services are never accidentally blocked
First-run safe | Missing state file is silently ignored; all circuits begin closed
Legacy files emit a warning | Files without integrity_hash log at WARN; enable strict_hmac to reject them entirely
In-memory reads are contention-minimal | State is Arc<RwLock<HashMap>>; concurrent reads never block each other
Default secret triggers a warning | CircuitBreakerConfigBuilder::build() emits tracing::warn! if the secret is still the built-in default
Bypass header requires a secret in production | Configure bypass_header_secret so that knowing the header name alone is not sufficient to bypass circuit protection

Security Considerations

HMAC Secret Recommendations

Environment | Recommended source
Development | Hard-coded fallback (convenient, not secure; tracing::warn! emitted automatically)
Production | Database password or service account secret: any credential that a process able to tamper with the file cannot read
Multi-tenant | Dedicated secret per tenant in HashiCorp Vault / AWS Secrets Manager

The essential property of the secret is that it is unavailable to any process that might tamper with the file. A high-entropy value is still preferable, because anyone who can read the state file can test guesses against integrity_hash offline.

State File Format

The on-disk JSON format is stable from v0.1 onward. A breaking format change requires a major version bump so existing producers and consumers continue to interoperate.

JSON
{
  "updated_at": "2026-02-27T15:22:41Z",
  "threshold": 3,
  "integrity_hash": "7b1def6802fabaed287c41786162e5648f47010d…",
  "algorithms": {
    "auth": {
      "consecutive_failures": 3,
      "reason": "connection refused: 127.0.0.1:5432",
      "since": "2026-02-27T14:00:00Z",
      "status": "tripped"
    },
    "payments": {
      "consecutive_failures": 0,
      "status": "closed"
    }
  }
}

The service map key is called "algorithms", a naming convention from the original use case (per-algorithm circuit protection in a cryptographic API). Any string key works; "algorithms" is just the JSON field name.

integrity_hash is absent in legacy files; when present, it is HMAC-SHA256 over the compact canonical JSON of the algorithms block, hex-encoded (lowercase, 64 characters).
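
The hex format of integrity_hash is easy to check before attempting verification. The helpers below are a hypothetical sketch using only the standard library; they say nothing about how the crate itself parses the field.

```rust
/// Encodes bytes as lowercase hex, two characters per byte.
fn to_lower_hex(bytes: &[u8]) -> String {
    bytes.iter().map(|b| format!("{b:02x}")).collect()
}

/// True if `s` looks like a hex-encoded HMAC-SHA256 tag:
/// exactly 64 characters, all of them lowercase hex digits.
fn is_well_formed_integrity_hash(s: &str) -> bool {
    s.len() == 64 && s.bytes().all(|b| matches!(b, b'0'..=b'9' | b'a'..=b'f'))
}
```

A 32-byte tag always encodes to 64 characters, so a value of any other length or with uppercase digits cannot match what a conforming producer writes.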

Canonical JSON: The Cross-Language Contract

The HMAC input is the compact, alphabetically sorted JSON of the algorithms map. This is the contract any third-party producer must follow:

1. Sort all JSON object keys alphabetically at every nesting level.
2. Serialise compactly (no whitespace).
3. Encode as UTF-8.
4. Compute HMAC-SHA256 with the shared secret.
5. Hex-encode the output (lowercase).

Example canonical form for the HMAC input:

JSON — Canonical HMAC Input
{"auth":{"consecutive_failures":1,"reason":"timeout","status":"open"},"db":{"consecutive_failures":0,"status":"closed"}}
Note: Field order within each service entry is alphabetical — consecutive_failures before status. The outer map is also alphabetical by service name. This matches Rust’s serde_json default behaviour (BTreeMap-backed objects).
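
To make the canonical form concrete, here is a dependency-free sketch that serialises a two-level map with sorted keys and no whitespace. It is an illustration, not the crate's serialiser: `Field`, `Service`, `escape_json` and `canonical_json` are hypothetical names, and only unsigned integers and strings are modelled. BTreeMap iterates in key order, which gives the alphabetical ordering for ASCII keys.

```rust
use std::collections::BTreeMap;

/// Hypothetical value type covering the fields shown in the state file.
enum Field {
    Number(u64),
    Text(String),
}

type Service = BTreeMap<String, Field>;

/// Appends `s` as a JSON string literal, escaping quotes, backslashes and
/// control characters; non-ASCII characters are kept as raw UTF-8.
fn escape_json(s: &str, out: &mut String) {
    out.push('"');
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\u{08}' => out.push_str("\\b"),
            '\u{0c}' => out.push_str("\\f"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out.push('"');
}

/// Serialises the algorithms map compactly with keys in sorted order.
fn canonical_json(algorithms: &BTreeMap<String, Service>) -> String {
    let mut out = String::from("{");
    for (i, (name, fields)) in algorithms.iter().enumerate() {
        if i > 0 {
            out.push(',');
        }
        escape_json(name, &mut out);
        out.push_str(":{");
        for (j, (key, value)) in fields.iter().enumerate() {
            if j > 0 {
                out.push(',');
            }
            escape_json(key, &mut out);
            out.push(':');
            match value {
                Field::Number(n) => out.push_str(&n.to_string()),
                Field::Text(t) => escape_json(t, &mut out),
            }
        }
        out.push('}');
    }
    out.push('}');
    out
}
```

Inserting the services and fields in any order produces the same string, which is the property the HMAC input depends on.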

Cross-Language Producers

Any language can produce a compatible state file as long as the HMAC is computed over the canonically serialised algorithms block.

Python (sort_keys=True required)
Python
import hashlib
import hmac
import json
import os
from datetime import datetime, timezone

def write_circuit_state(path: str, algorithms: dict, secret: str) -> None:
    """Write an HMAC-signed circuit breaker state file."""
    # CRITICAL: sort_keys=True is required because the Rust verifier sorts keys
    # alphabetically; unsorted keys give different bytes and the HMAC check fails.
    # ensure_ascii=False keeps non-ASCII text as raw UTF-8 rather than \uXXXX
    # escapes, which is how serde_json writes strings.
    algorithms_json = json.dumps(
        algorithms, separators=(",", ":"), sort_keys=True, ensure_ascii=False
    )

    mac = hmac.new(secret.encode("utf-8"), algorithms_json.encode("utf-8"), hashlib.sha256)
    integrity_hash = mac.hexdigest()

    state = {
        "updated_at": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
        "threshold": 3,
        "integrity_hash": integrity_hash,
        "algorithms": dict(sorted(algorithms.items())),  # outer map also sorted
    }
    # Write to a temporary sibling, then rename, so readers never see a partial file.
    tmp_path = path + ".tmp"
    with open(tmp_path, "w", encoding="utf-8") as f:
        json.dump(state, f, indent=2, ensure_ascii=False)
    os.replace(tmp_path, path)

# Example usage
algorithms = {
    "payments": {"status": "closed", "consecutive_failures": 0},
    "auth":     {"status": "tripped", "consecutive_failures": 3,
                 "since": "2026-02-27T14:00:00Z",
                 "reason": "connection refused"},
}
write_circuit_state("/var/run/myapp/circuit_breaker.json", algorithms, "my-secret")
Shell (OpenSSL)
Bash
# Build canonical JSON manually (outer keys and inner keys must be alphabetically sorted)
ALGORITHMS_JSON='{"auth":{"consecutive_failures":3,"reason":"timeout","status":"tripped"},"payments":{"consecutive_failures":0,"status":"closed"}}'
SECRET="my-secret"

# printf '%s' avoids the trailing newline and is more portable than echo -n
HASH=$(printf '%s' "$ALGORITHMS_JSON" | openssl dgst -sha256 -hmac "$SECRET" -hex | awk '{print $2}')
echo "integrity_hash: $HASH"

Versioning & Stability

This crate follows Semantic Versioning. Releases in the 0.x series may include breaking API changes; every breaking change is called out explicitly in the changelog. The on-disk JSON state file format is considered stable from 0.1 onward.

v0.3.0 Changelog

Change | Details
spawn_reload() returns JoinHandle<()> | Enables graceful shutdown. Drop the handle to preserve the previous fire-and-forget behaviour.
bypass_header_secret config field | Optional secret value required on the bypass header (constant-time compared). None preserves the previous presence-only behaviour.
strict_hmac config field | When true, unsigned legacy files are rejected (fail-open) instead of accepted with a warning. Default false preserves backward compatibility.
tracing::warn! on default secret | CircuitBreakerConfigBuilder::build() warns when the built-in placeholder secret is still in use.
subtle::ConstantTimeEq for HMAC comparison | Replaced the inline XOR-fold with the audited subtle crate. No early exit, no timing oracle.
WriteError::AtomicRename | Cross-filesystem rename(2) failures (EXDEV) now surface with a descriptive error variant.
cargo audit in CI | Dependency CVE scanning on every push.
GitHub Actions pinned to full commit SHAs | Supply-chain hardening: all Actions are pinned to exact commit hashes.

Dependencies

Crate | Version | Purpose
hmac | 0.12 | HMAC construction over SHA-256
sha2 | 0.10 | SHA-256 hash implementation
subtle | 2 | Audited constant-time comparison (ConstantTimeEq)
serde | 1 | Serialisation / deserialisation framework
serde_json | 1 | JSON serialisation; BTreeMap-backed for deterministic key order
thiserror | 1 | Error type derivation
tracing | 0.1 | Structured logging (warn on tamper / default secret)
chrono | 0.4 | RFC 3339 timestamps in state file
tokio | 1 | Async runtime; optional (reload feature)
axum | 0.7 | Web framework integration; optional (axum feature)
tower | 0.4 | Middleware layer abstraction; optional (axum feature)