
Limit Strategies

RLaaS is powered by the open‑source GoRL rate‑limiting library, so the behaviour you see in production is the same code that is benchmarked and battle‑tested in the OSS world.


Choosing the Right Strategy

RLaaS ships with four algorithms. All share the same two parameters:

| Parameter | Type | Required | Meaning |
| --- | --- | --- | --- |
| limit | int | yes | Maximum requests (or tokens) in a window |
| window | string (30s, 1m, 5m, …) | yes | Time span that defines the rate |

If the response does not contain {"allowed": true}, treat the request as rejected and back off. RLaaS returns HTTP 429 with the plain-text body Rate limit exceeded for all denials.
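
A minimal Go sketch of this client-side handling. The URL is a placeholder rather than a real RLaaS endpoint; the only assumptions taken from this page are the {"allowed": true} body and the plain-text 429 denial:

```go
package main

import (
	"encoding/json"
	"fmt"
	"log"
	"net/http"
)

// checkResponse mirrors the JSON body described above: {"allowed": true}.
type checkResponse struct {
	Allowed bool `json:"allowed"`
}

func main() {
	// Placeholder URL: substitute your tenant's real RLaaS check endpoint.
	resp, err := http.Get("https://rlaas.example.com/check")
	if err != nil {
		log.Fatal(err)
	}
	defer resp.Body.Close()

	// Every denial is an HTTP 429 with the plain-text body "Rate limit exceeded".
	if resp.StatusCode == http.StatusTooManyRequests {
		fmt.Println("rejected: back off before retrying")
		return
	}

	// Anything other than {"allowed": true} must also be treated as a rejection.
	var cr checkResponse
	if err := json.NewDecoder(resp.Body).Decode(&cr); err != nil || !cr.Allowed {
		fmt.Println("rejected: back off before retrying")
		return
	}
	fmt.Println("allowed: proceed")
}
```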


Strategy Cheatsheet

| Strategy (JSON) | How it works | When to use | Notes |
| --- | --- | --- | --- |
| fixed | Counts hits in a discrete, wall-clock window. Counter resets every window. | Low-traffic admin APIs, cron callbacks | Fastest algorithm (single counter) |
| sliding | Keeps current + previous window counters and interpolates. Smooths spikes. | Auth/login, GraphQL-gateway limits | Avoids thundering herd at the window edge |
| token | Bucket refills at a constant rate: limit / window tokens per second. Bursts are allowed up to limit. | Mobile / IoT clients with bursty traffic | Each request consumes 1 token |
| leaky | Water leaks out steadily at limit / window. New requests fill the bucket. | Background job runners, message queues | Keeps outbound rate constant |
For example, the following rule applies a token bucket of 100 requests per 10 seconds to a payments endpoint, keyed by API key, with fail-open disabled:

```json
{
  "endpoint": "/v1/pay",
  "strategy": "token_bucket",
  "key_by": "api_key",
  "limit": 100,
  "window": "10s",
  "fail_open": false
}
```
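
For illustration, a short Go sketch that models such a rule and derives the refill rate implied by limit and window. The Rule struct is hypothetical (field names are simply taken from the JSON keys above) and is not an official RLaaS client type:

```go
package main

import (
	"encoding/json"
	"fmt"
	"log"
	"time"
)

// Rule mirrors the JSON rule shown above. Illustrative only.
type Rule struct {
	Endpoint string `json:"endpoint"`
	Strategy string `json:"strategy"`  // e.g. "token_bucket"
	KeyBy    string `json:"key_by"`    // request attribute the counter is scoped to
	Limit    int    `json:"limit"`     // maximum requests (or tokens) per window
	Window   string `json:"window"`    // e.g. "30s", "1m", "5m"
	FailOpen bool   `json:"fail_open"` // behaviour when the limiter itself fails
}

func main() {
	raw := `{"endpoint": "/v1/pay", "strategy": "token_bucket", "key_by": "api_key", "limit": 100, "window": "10s", "fail_open": false}`

	var r Rule
	if err := json.Unmarshal([]byte(raw), &r); err != nil {
		log.Fatal(err)
	}

	// limit / window is the steady refill (or leak) rate used by the
	// token bucket and leaky bucket strategies.
	window, err := time.ParseDuration(r.Window)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Printf("%s: %d requests per %s (%.1f tokens/s)\n",
		r.Endpoint, r.Limit, window, float64(r.Limit)/window.Seconds())
	// Output: /v1/pay: 100 requests per 10s (10.0 tokens/s)
}
```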

Implementation Highlights

  • Fixed Window – one Redis key per key_by; TTL = window.
  • Sliding Window – stores ts, prev and curr counters. Approximates true sliding window with O(1) reads.
  • Token Bucket – keeps tokens and last_refill; window/limit determines nanoseconds per token (see the sketch after this list).
  • Leaky Bucket – keeps water_level and last_leak; leaks are calculated on each hit.
  • Common API – All algorithms expose Allow(key string) (bool, error) in GoRL. RLaaS orchestrates them per tenant.
  • Persistence – All counters/state live in Redis; ephemeral process restarts do not reset limits (except expired keys).
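
As a rough illustration of the token bucket math above, here is a single-key, in-memory sketch. It follows the description given in the highlights, not GoRL's actual Redis-backed implementation:

```go
package main

import (
	"fmt"
	"time"
)

// tokenBucket keeps tokens and last_refill; window/limit fixes how many
// nanoseconds must elapse before one token is handed back.
type tokenBucket struct {
	limit      int64     // burst capacity (= tokens per window)
	nsPerToken int64     // window / limit, in nanoseconds
	tokens     int64     // tokens currently available
	lastRefill time.Time // last time tokens were added
}

func newTokenBucket(limit int, window time.Duration) *tokenBucket {
	return &tokenBucket{
		limit:      int64(limit),
		nsPerToken: window.Nanoseconds() / int64(limit),
		tokens:     int64(limit), // start full so an initial burst up to limit passes
		lastRefill: time.Now(),
	}
}

// Allow lazily refills based on elapsed time, then consumes one token per request.
func (b *tokenBucket) Allow() bool {
	elapsed := time.Since(b.lastRefill).Nanoseconds()

	if refill := elapsed / b.nsPerToken; refill > 0 {
		b.tokens += refill
		if b.tokens > b.limit {
			b.tokens = b.limit // never exceed the burst capacity
		}
		// Advance lastRefill only by the time actually converted into tokens,
		// so fractional progress toward the next token is not lost.
		b.lastRefill = b.lastRefill.Add(time.Duration(refill * b.nsPerToken))
	}

	if b.tokens == 0 {
		return false
	}
	b.tokens--
	return true
}

func main() {
	b := newTokenBucket(100, 10*time.Second) // 100 tokens / 10s = one token every 100ms
	fmt.Println(b.Allow())                   // true: the bucket starts full
}
```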

Failure Behaviour: fail_open

What it is: A safety valve that controls behaviour when the limiter cannot evaluate a request (Redis outage, network partition, serialization bug, timeout) and an internal error surfaces.

| Value | Behaviour on internal error | Trade-off |
| --- | --- | --- |
| false (default) | Reject the request (HTTP 500 Internal error). Rate limits remain strict. | Protects against abuse and billing explosions, but impacts availability. |
| true | Allow the request as if it were under the limit. The error is logged and recorded as a metric. | Preserves availability; risk of temporary unlimited usage. |

Guidelines:

  • Use fail_open: true on latency‑sensitive, user‑facing endpoints where brief over‑allowance is preferable to outages (e.g. product listing, read APIs, login burst after deploy).
  • Keep fail_open: false for cost, fraud, quota or mutation‑heavy endpoints (payments, write APIs, SMS/email senders) where uncontrolled bursts are expensive or abusable.
  • Always pair fail_open: true with alerting: you need to know you are degraded. Emit a metric counter rate_limiter_fail_open_total (sketched below) and alert if it is >0 for N minutes.
  • Consider circuit breaking: if consecutive internal errors exceed a threshold, switch the rule to fail_open automatically and raise a high‑severity alert.

Rule of thumb: Prioritise availability for idempotent, cacheable reads; prioritise control for state‑changing or financially sensitive operations.
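
To make the table and guidelines concrete, here is a minimal Go sketch of how a fail_open decision might wrap the limiter call. The Allow(key string) (bool, error) signature comes from the implementation highlights above; the metric variable and the brokenLimiter stub are illustrative only, not RLaaS internals:

```go
package main

import (
	"errors"
	"log"
	"sync/atomic"
)

// failOpenTotal stands in for the rate_limiter_fail_open_total counter
// suggested above; a real deployment would export it to a metrics backend.
var failOpenTotal atomic.Int64

// Limiter is the common GoRL-style interface from the implementation
// highlights: Allow reports whether the keyed request is under its limit.
type Limiter interface {
	Allow(key string) (bool, error)
}

// decide applies the fail_open policy when the limiter itself errors
// (Redis outage, timeout, ...). A sketch of the behaviour table above.
func decide(l Limiter, key string, failOpen bool) (allowed bool, internalErr bool) {
	ok, err := l.Allow(key)
	if err == nil {
		return ok, false
	}

	if failOpen {
		// fail_open: true – let the request through, but record the degradation
		// so an alert can fire if it persists.
		failOpenTotal.Add(1)
		log.Printf("rate limiter degraded (fail open): %v", err)
		return true, false
	}

	// fail_open: false (default) – reject; the caller maps this to HTTP 500.
	log.Printf("rate limiter error (fail closed): %v", err)
	return false, true
}

// brokenLimiter simulates a backend outage for the demo below.
type brokenLimiter struct{}

func (brokenLimiter) Allow(key string) (bool, error) {
	return false, errors.New("redis: connection refused")
}

func main() {
	allowed, internal := decide(brokenLimiter{}, "api_key:demo", true)
	log.Printf("allowed=%v internalError=%v failOpenTotal=%d",
		allowed, internal, failOpenTotal.Load())
}
```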


Benchmarks (AMD Ryzen 7 4800H)

| Algorithm | Single Key (ns/op, B/op, allocs) | Multi Key (ns/op, B/op, allocs) |
| --- | --- | --- |
| Fixed Window | 89.2, 24, 1 | 202.5, 30, 2 |
| Leaky Bucket | 333.8, 112, 4 | 506.4, 126, 5 |
| Sliding Window | 260.5, 72, 3 | 444.0, 86, 4 |
| Token Bucket | 339.6, 128, 4 | 504.4, 126, 5 |

Fixed Window is the fastest, but Sliding Window often offers the best trade-off between latency and smoothness. Token Bucket and Leaky Bucket provide predictable throughput at the cost of extra state management.


Best‑Practice Flow

  1. Pick a strategy using the table above.
  2. Define limit & window based on the worst‑case traffic you want to allow.
  3. Decide fail_open per endpoint’s risk profile (availability vs abuse cost).
  4. Test locally with the code snippets in Get Started (200 → allowed, 429 → retry; a retry loop is sketched after this list).
  5. Monitor rule hits, denials and fail‑open events in the RLaaS dashboard; adjust limit, swap strategies, or harden infra as needed.
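
For step 4, a small Go sketch of a local test loop that treats 200 as allowed and retries 429s with exponential backoff and jitter. The localhost URL, attempt count and timings are placeholders for a Get Started setup:

```go
package main

import (
	"fmt"
	"log"
	"math/rand"
	"net/http"
	"time"
)

// callWithBackoff retries on HTTP 429 with exponential backoff and jitter:
// 200 means the request was allowed, 429 means it was rate limited and
// should be retried later.
func callWithBackoff(url string, maxAttempts int) error {
	delay := 200 * time.Millisecond
	for attempt := 1; attempt <= maxAttempts; attempt++ {
		resp, err := http.Get(url)
		if err != nil {
			return err
		}
		resp.Body.Close()

		switch resp.StatusCode {
		case http.StatusOK:
			fmt.Printf("attempt %d: allowed\n", attempt)
			return nil
		case http.StatusTooManyRequests:
			// Denied: wait, then try again with a growing, jittered delay.
			sleep := delay + time.Duration(rand.Int63n(int64(delay/2)))
			fmt.Printf("attempt %d: rate limited, retrying in %v\n", attempt, sleep)
			time.Sleep(sleep)
			delay *= 2
		default:
			return fmt.Errorf("unexpected status %d", resp.StatusCode)
		}
	}
	return fmt.Errorf("still rate limited after %d attempts", maxAttempts)
}

func main() {
	if err := callWithBackoff("http://localhost:8080/v1/pay", 5); err != nil {
		log.Fatal(err)
	}
}
```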