We rate-limit at the CDN. Is that enough?

Edge rate limiting is only one layer. If the origin is directly reachable, an attacker can skip the edge and its limit entirely, so the origin needs to enforce limits as well.

We limit per account. Is that enough?

Per-account limits are necessary but not sufficient, because an attacker can spread traffic across many accounts. Account-creation rate limits and cross-account attack detection close the gap.

Does CAPTCHA solve credential stuffing?

CAPTCHA raises the attacker's cost but does not solve the problem, because distributed solving services have reduced its effectiveness. Pair it with progressive lockout and behavioural detection.

How do we rate-limit a GraphQL API?

Limit per operation and per field, not per HTTP request. Account for batched and aliased operations, since a single HTTP request can carry many of them.
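As a rough illustration of counting operations rather than HTTP requests, the sketch below charges a GraphQL request by the number of top-level fields it carries, across both JSON-array batches and aliases. The function names `count_top_level_fields` and `request_cost` are hypothetical, and the tokenizer is deliberately naive: it does not expand fragments, and a production limiter would more likely walk the AST produced by a real GraphQL parser.

```python
import json
import re

# Strings are removed first so braces or colons inside argument values
# cannot confuse the depth tracking below. Comments are removed next.
_STRING = re.compile(r'"""[\s\S]*?"""|"(?:\\.|[^"\\])*"')
_COMMENT = re.compile(r'#[^\n]*')
_TOKEN = re.compile(r'\.\.\.|[A-Za-z_][A-Za-z0-9_]*|[{}():@]')


def count_top_level_fields(query: str) -> int:
    """Count fields in the outermost selection set of a GraphQL document.

    Aliases (alias: field) count once. Directive names are skipped.
    Fragments are NOT expanded; this is an approximation, not a parser.
    """
    text = _COMMENT.sub("", _STRING.sub("", query))
    tokens = _TOKEN.findall(text)
    braces = parens = fields = 0
    for i, tok in enumerate(tokens):
        if tok == "{":
            braces += 1
        elif tok == "}":
            braces -= 1
        elif tok == "(":
            parens += 1
        elif tok == ")":
            parens -= 1
        elif braces == 1 and parens == 0 and tok not in (":", "@", "..."):
            prev = tokens[i - 1] if i else ""
            nxt = tokens[i + 1] if i + 1 < len(tokens) else ""
            # A name followed by ':' is an alias; the field name comes next.
            if nxt != ":" and prev not in ("@", "..."):
                fields += 1
    return fields


def request_cost(body: str) -> int:
    """Units to charge against the rate limit for one HTTP request body."""
    payload = json.loads(body)
    operations = payload if isinstance(payload, list) else [payload]
    # Every operation costs at least 1, even if no field could be counted.
    return sum(max(1, count_top_level_fields(op.get("query", "")))
               for op in operations)
```

With this accounting, a single POST that aliases the same login mutation fifty times is charged fifty units against the limit instead of one.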

Are rate limits a compliance requirement?

PCI DSS 4.0 explicitly requires detection and prevention of credential stuffing. SOC 2, ISO 27001, and HIPAA imply rate limiting through their availability and access-control requirements.

API Rate Limit Bypass: Techniques, Real Cases, and Defenses


TL;DR

Rate limiting is the API control that limits how often a caller can hit an endpoint, used to prevent scraping, brute force, credential stuffing, and denial of service. Rate-limit bypass is the family of techniques attackers use to send more traffic than the limit was supposed to allow. The bypasses fall into a small number of patterns: counting at the wrong layer, identifying the wrong attribute, allowing batched requests through, and leaking limits to attackers who then time their bursts under the threshold.

By Shubham Khandare, Delivery Manager, SecureLayer7. Updated June 10, 2026

What are rate limits supposed to prevent?

Four use cases dominate.

Credential stuffing. Without rate limits on login, an attacker can test millions of stolen username and password pairs against the live API at speed.
Brute force. Without rate limits on OTP, password reset, or two-factor-code endpoints, attackers brute-force the code space.
Scraping. Without limits on read endpoints, attackers harvest the entire dataset for resale or analysis.
Resource exhaustion. Without limits on expensive operations, a single attacker can run up a cloud bill or take the service offline.

A correctly implemented rate limit makes each of these cost more than the attacker is willing to pay, or forces the attack to be noisy enough for the detection layer to notice.

What are the most common rate-limit bypasses?

These are the bypasses we still find in engagements run over the last year:

Per-IP limits and IP rotation. Limit counts per source IP. Attacker rotates IPs (residential proxy networks are cheap and available). Defeats per-IP limits at a low cost.
Per-account limits and account creation. Limit counts per account. Attacker creates many accounts. Defeats per-account limits without rotation.
Header forgery. Limit identifies the client by an X-Forwarded-For or X-Real-IP header that the application trusts. Attacker forges the header.
Edge versus origin counting. Limit lives at the edge (CDN, WAF) but the origin accepts requests that bypass the edge. Common when a direct origin endpoint is left exposed.
Batched and parallel requests. Limit counts HTTP requests. Attacker batches multiple operations per HTTP request (GraphQL) or fires many requests concurrently before the counter increments.
Path normalization. Limit identifies the endpoint by URL path. Attacker varies the path (trailing slash, case, encoded characters) to look like different endpoints.
Method variation. Limit applies to POST but the same endpoint accepts PUT or GET with the same effect.
Authentication variation. Limit identifies the caller by API key. Attacker rotates keys (free-tier accounts, leaked keys, shared keys).
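The header-forgery case above comes down to which address the limiter keys on. A minimal sketch, assuming the origin sits behind proxies in a known address range: the `TRUSTED_PROXIES` range and the `client_ip` helper are hypothetical, not taken from any particular framework. The idea is to use the header only when the socket peer is a trusted proxy, and to read it from the right, since the leftmost entries are whatever the client chose to send.

```python
import ipaddress
from typing import Optional

# Hypothetical: the address range your own CDN or load balancers connect from.
TRUSTED_PROXIES = [ipaddress.ip_network("10.0.0.0/8")]


def is_trusted(addr: str) -> bool:
    """True if addr belongs to one of the trusted proxy networks."""
    ip = ipaddress.ip_address(addr)  # raises ValueError on malformed input
    return any(ip in net for net in TRUSTED_PROXIES)


def client_ip(peer_addr: str, xff: Optional[str]) -> str:
    """Pick the address to rate-limit on.

    peer_addr is the TCP peer of the connection; xff is the raw
    X-Forwarded-For header value, if any.
    """
    # A header from an untrusted peer is attacker-controlled: ignore it.
    if not xff or not is_trusted(peer_addr):
        return peer_addr
    hops = [h.strip() for h in xff.split(",") if h.strip()]
    # Walk from the hop nearest to us; the first untrusted one is the client.
    for hop in reversed(hops):
        try:
            if not is_trusted(hop):
                return hop
        except ValueError:
            return peer_addr  # malformed entry: fall back to the socket peer
    return peer_addr
```

For example, `client_ip("10.0.0.5", "1.2.3.4, 203.0.113.7")` returns `203.0.113.7`: the forged leftmost value never reaches the limiter.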

What does a successful bypass enable?

Account takeover at scale. Bypassing login rate limits enables credential stuffing across the user database.
OTP brute force. Bypassing one-time-code rate limits enables brute-forcing of SMS or email codes (6-digit codes have one million possibilities; with no rate limit, an attacker sending a few thousand requests per second exhausts that space in minutes).
Mass scraping. Bypassing read-endpoint limits enables harvesting the entire database.
Denial of service. Bypassing expensive-operation limits enables resource exhaustion.
Sensitive business flow abuse. Bypassing flow limits on purchases, transfers, or account creation enables the abuse described by OWASP API6:2023 (Unrestricted Access to Sensitive Business Flows).
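The OTP figure in the list above is simple arithmetic, and it is worth working through because it shows how much a small attempt cap changes the odds. The request rate of 5,000 per second and the cap of 5 attempts are assumed values for illustration, and the function names are hypothetical.

```python
def seconds_to_exhaust(code_digits: int, requests_per_second: float) -> float:
    """Time to try every code of the given length at a sustained rate."""
    return 10 ** code_digits / requests_per_second


def success_probability(code_digits: int, attempts_allowed: int) -> float:
    """Chance of guessing a uniformly random code within the attempt cap."""
    return min(1.0, attempts_allowed / 10 ** code_digits)


# Assumed rate: 5,000 requests per second against an unlimited endpoint.
print(seconds_to_exhaust(6, 5000))   # 200.0 seconds, a little over 3 minutes
# Assumed cap: 5 attempts per code before the code is invalidated.
print(success_probability(6, 5))     # 5e-06
```

The cap only delivers that probability if it is tracked against the account or code being attacked, not against the source IP.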

How do you implement rate limits that hold?

Layer the limits. Per-IP, per-account, per-API-key, and per-endpoint. The intersection makes a single bypass less useful.
Limit at the origin. Edge rate limiting is useful as defense in depth, but it should not be the only layer. The origin should enforce limits regardless of what the edge does.
Verify the identity header. If you trust X-Forwarded-For, ensure it comes from a trusted proxy. CDNs differ in whether they overwrite the header or append to a client-supplied value, so verify the configuration and trust only the entries added by your own proxies.
Track the right attribute. Login limits should track the username being tried, not just the source IP. Reset limits should track the account being reset, not just the requester.
Account for batching. For batchable APIs (GraphQL, multi-call endpoints), limit at the operation level, not the HTTP request level.
Lock progressively. First few attempts free, then progressive delay, then full lockout. Distinguishes legitimate user mistyping from attacker probing.
Detect anomalies. Even with limits, watch for traffic that looks like distributed bypass: many sources hitting one account, one source hitting many accounts at low rate.
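Several of the points above (layering, tracking the targeted username, and counting atomically so parallel requests cannot slip through) can be sketched in one small limiter. This is a minimal in-process illustration under stated assumptions, not a production design: the `LayeredLimiter` class, its dimension names, and the limits are hypothetical, and a deployment with more than one origin instance would need the counters in a shared store.

```python
import threading
import time
from collections import defaultdict, deque
from typing import Callable, Dict


class LayeredLimiter:
    """Sliding-window limiter that checks several identity dimensions at once."""

    def __init__(self, limits: Dict[str, int], window_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.limits = limits              # e.g. {"ip": 20, "username": 5}
        self.window = window_seconds
        self._clock = clock
        self._hits = defaultdict(deque)   # (dimension, value) -> timestamps
        self._lock = threading.Lock()

    def allow(self, **identity: str) -> bool:
        """Return True and record the attempt if every dimension has room."""
        now = self._clock()
        keys = [(dim, identity[dim]) for dim in self.limits if dim in identity]
        # One lock around check and increment, so concurrent requests cannot
        # all pass the check before any of them has been counted.
        with self._lock:
            for key in keys:
                hits = self._hits[key]
                # Drop timestamps that have aged out of the window.
                while hits and hits[0] <= now - self.window:
                    hits.popleft()
            # Reject if any single dimension is already at its limit.
            if any(len(self._hits[k]) >= self.limits[k[0]] for k in keys):
                return False
            for key in keys:
                self._hits[key].append(now)
            return True
```

A login handler might call `limiter.allow(ip=source_ip, username=submitted_username)` before checking the password. Because the username dimension is keyed on the account being tried, rotating source IPs does not reset it.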

How does SecureLayer7 test rate limits?

Every API engagement runs the bypass matrix on sensitive endpoints (login, password reset, OTP, signup, expensive read, expensive write).

IP rotation. Test with multiple source IPs (residential proxy, datacenter rotation).
Header forgery. Test with X-Forwarded-For, X-Real-IP, X-Original-IP variants.
Origin bypass. Map the origin endpoint directly, then test whether limits enforced at the edge are also enforced there.
Batching. For GraphQL, test batched operations against the limit. For multi-call REST endpoints, test parallel requests against per-second limits.
Path and method variation. Test trailing-slash, case, percent-encoding, and method variants.
Identity variation. Test rotating API keys, accounts, and tokens against per-identity limits.
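For the path and method checks in the matrix above, a small generator of variants plus the canonical key a limiter should reduce them to makes the test repeatable. The sketch below is illustrative only: `path_variants` and `canonical_endpoint` are hypothetical helpers, and lowercasing is correct only under the assumption that the backend routes paths case-insensitively. The key should mirror whatever normalization the router really performs.

```python
from typing import List
from urllib.parse import unquote

# Methods to replay against an endpoint whose limit may be tied to POST only.
METHOD_VARIANTS = ["POST", "PUT", "PATCH", "GET"]


def canonical_endpoint(path: str) -> str:
    """Reduce a request path to the key the limiter should count on.

    Decodes percent-encoding once, lowercases (assumes case-insensitive
    routing), and drops empty and '.' segments.
    """
    decoded = unquote(path).lower()
    segments = [s for s in decoded.split("/") if s not in ("", ".")]
    return "/" + "/".join(segments)


def path_variants(path: str) -> List[str]:
    """Spellings of one path that a naive limiter may count separately.

    Assumes path has no query string and does not end with '/'.
    """
    head, _, last = path.rpartition("/")
    # Percent-encode the first character of the last segment.
    encoded = f"{head}/%{ord(last[0]):02X}{last[1:]}"
    return [path, path + "/", path.upper(), "/" + path, encoded]
```

If every variant of `/api/login` that the server accepts maps to the same canonical key, and the key ignores the HTTP method, path and method variation no longer open fresh rate-limit buckets.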

The deliverable maps findings to OWASP API4:2023 and API6:2023, with the specific rate-limit configuration change required for each.