Most developers know they should add rate limiting to their APIs. Most developers also ship without it, because nothing bad has happened yet. Then one morning the database is on fire and the logs are full of the same IP hitting the login endpoint 3,000 times in a minute.
Rate limiting is cheap to add early and expensive to retrofit under pressure. Here is how to think about it properly.
What You're Actually Protecting Against
Rate limiting solves three distinct problems that often get lumped together:
- Brute-force attacks. Someone trying every password combination against your login form. A limit of 5 attempts per minute per IP makes this impractically slow.
- Scraping. Bots harvesting your content or pricing data faster than any human could read it. Limits per route protect your data without blocking real users.
- Accidental overload. A misconfigured client in production hammering your API in a tight loop. Limits here protect your infrastructure from your own mistakes.
Each case needs a slightly different limit — login endpoints are much stricter than a public search API, for example.
The Sliding Window vs Token Bucket Debate
Two algorithms dominate production rate limiting. The fixed window approach (reset the counter every 60 seconds) is simple but creates edge cases: a burst just before a window boundary followed by another just after it can effectively double your allowed rate. The sliding window is more accurate but requires storing a timestamp log per key. The token bucket permits short bursts up to a fixed capacity while enforcing a steady average rate, and is what most serious systems use.
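The token bucket idea fits in a few lines. The sketch below is an illustrative in-memory version: the class name, the injectable `clock` parameter, and the numbers are my own choices for demonstration, not any library's API.

```python
import time

class TokenBucket:
    """Allow bursts up to `capacity`, refilling at `rate` tokens per second."""

    def __init__(self, capacity: float, rate: float, clock=time.monotonic):
        self.capacity = capacity
        self.rate = rate
        self.tokens = capacity      # start full, so a fresh client may burst
        self.clock = clock          # injectable for testing
        self.last = clock()

    def allow(self) -> bool:
        now = self.clock()
        # Refill in proportion to elapsed time, capped at capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

With `capacity=5, rate=1`, a client can fire five requests at once, then settles to one request per second, which is exactly the burst-tolerant behavior the paragraph describes.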
For most web apps, a sliding window stored in Redis is the right call. It's accurate enough, easy to reason about, and Redis makes it fast.
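Here is a minimal in-memory sketch of that sliding-window log. In Redis the same logic maps onto a sorted set per key (evict old entries with `ZREMRANGEBYSCORE`, then `ZADD` and `ZCARD` in a pipeline); the Python version below keeps it self-contained, and the class name and parameters are illustrative.

```python
import time
from collections import defaultdict, deque

class SlidingWindowLimiter:
    """Keep a timestamp log per key; count requests in the last `window` seconds."""

    def __init__(self, limit: int, window: float, clock=time.monotonic):
        self.limit = limit
        self.window = window
        self.clock = clock          # injectable for testing
        self.log = defaultdict(deque)

    def allow(self, key: str) -> bool:
        now = self.clock()
        log = self.log[key]
        # Evict timestamps that have fallen out of the window.
        while log and log[0] <= now - self.window:
            log.popleft()
        if len(log) < self.limit:
            log.append(now)
            return True
        return False
```

Each key (an IP, a user ID, a route-plus-IP pair) gets its own budget, and the count is exact at every instant, with no boundary artifact to exploit.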
In Laravel, Specifically
Laravel's built-in throttle middleware is a fixed window implementation, which is fine for most routes. For login endpoints specifically, consider the RateLimiter facade with a key that combines IP and email. The combined key throttles sustained guessing against one account from one address without letting a single noisy client behind a shared NAT lock out everyone else on it; layering a looser per-IP limit on top also catches one address spraying credentials across many accounts.
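A sketch of what that looks like with a named limiter; the limiter name `login`, the controller class, and the numbers are illustrative choices, not the only reasonable ones.

```php
<?php

// app/Providers/AppServiceProvider.php, inside boot().

use Illuminate\Cache\RateLimiting\Limit;
use Illuminate\Http\Request;
use Illuminate\Support\Facades\RateLimiter;

RateLimiter::for('login', function (Request $request) {
    $email = strtolower((string) $request->input('email'));

    return [
        // Strict limit per account-per-IP pair.
        Limit::perMinute(5)->by($request->ip().'|'.$email),
        // Looser backstop per IP, catching credential spraying.
        Limit::perMinute(50)->by($request->ip()),
    ];
});

// routes/web.php: attach the named limiter to the route.
Route::post('/login', [LoginController::class, 'store'])
    ->middleware('throttle:login');
```

Returning an array of limits means a request is rejected as soon as any one of them is exceeded.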
Always return a 429 Too Many Requests response with a Retry-After header. Legitimate clients can back off gracefully. Bots usually don't, which actually helps you distinguish them.
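With a sliding-window log, computing Retry-After is simple: the client can come back when the oldest logged request leaves the window. A small sketch of the response headers, with the caveat that the `X-RateLimit-*` names are a widespread convention rather than a standard (Retry-After itself is standard HTTP):

```python
import math

def rate_limit_headers(allowed: bool, limit: int, remaining: int,
                       oldest_ts: float, window: float, now: float) -> dict:
    """Build rate-limit response headers; add Retry-After only on rejection."""
    headers = {
        "X-RateLimit-Limit": str(limit),
        "X-RateLimit-Remaining": str(max(0, remaining)),
    }
    if not allowed:
        # The oldest request leaves the window at oldest_ts + window.
        headers["Retry-After"] = str(max(1, math.ceil(oldest_ts + window - now)))
    return headers
```

Pair this with a 429 status and a short JSON body, and well-behaved clients have everything they need to back off correctly.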
One More Thing
Rate limiting at the application layer is not a substitute for rate limiting at the infrastructure layer. Both Cloudflare and nginx can absorb volumetric attacks before they reach your app. Treat application-level limits as the last line of defense, not the only one.