In web services, rate limiters control client traffic by limiting the number of requests allowed within a certain time window. Some benefits of using rate limiters:
- Prevent DDoS.
- Reduce costs.
- Reduce chances of being overloaded.
Interview
1. Client-side or server-side rate limiter?
    1. Tech stack or ecosystem?
    2. Bandwidth or resources?
2. On what basis does the rate limiter throttle? Client ID or IP?
3. Scale or audience?
4. Separate service or integrated into application code?

Requirements:
- Accurate: limit excessive requests precisely
- Fast: adds little latency to requests
- Efficient: uses little memory
- Distributed: shareable across multiple servers
- Exception handling: show clear errors to throttled users
- High fault tolerance, especially when distributed
Algorithms
Token Bucket
Imagine a bucket that a refiller tops up with tokens periodically. Each incoming request consumes one token to be executed; if the bucket is empty, the request is rejected.
- Parameters: bucket size and refill rate
- Different buckets for different API endpoints
- Easy to implement and allows traffic bursts, but the two parameters can be difficult to tune
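A minimal single-process sketch of the idea (the class and parameter names are illustrative, and a real deployment would keep the bucket state in shared storage):

```python
import time

class TokenBucket:
    def __init__(self, capacity: int, refill_rate: float):
        self.capacity = capacity        # bucket size: max tokens it can hold
        self.refill_rate = refill_rate  # tokens added per second
        self.tokens = float(capacity)   # start full
        self.last_refill = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        # Refill lazily from elapsed time instead of running a refiller thread.
        elapsed = now - self.last_refill
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_rate)
        self.last_refill = now
        if self.tokens >= 1:
            self.tokens -= 1  # each request consumes one token
            return True
        return False          # bucket empty: reject the request
```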
Leaky Bucket
Requests are stored in a queue and processed at a fixed rate; when the queue is full, excess requests are dropped.
- Parameters: queue size and processing rate
- Efficient and gives a stable processing rate, but cannot absorb traffic bursts
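A sketch under the same caveats, using a bounded FIFO queue plus a drain method that a worker would call at the fixed processing rate:

```python
import queue

class LeakyBucket:
    def __init__(self, queue_size: int):
        self.requests = queue.Queue(maxsize=queue_size)

    def submit(self, request) -> bool:
        try:
            self.requests.put_nowait(request)  # accepted: will be processed later
            return True
        except queue.Full:
            return False                       # queue full: drop the request

    def leak(self):
        # A worker calls this at the fixed processing rate, e.g. once per tick.
        try:
            return self.requests.get_nowait()
        except queue.Empty:
            return None
```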
Fixed Window Count
Each time window has a counter, and each request increments it by 1; once the counter reaches the limit, further requests in that window are rejected.
- Efficient
- A traffic spike around window boundaries can let through more requests than the limit allows. For example, with a 100 requests per minute limit, 100 requests at 00:59 and another 100 at 01:01 (mm:ss) would all pass, roughly 200 requests within two seconds.
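A sketch keeping a counter per (client, window) pair; note that old windows are never evicted here, which a real implementation would handle:

```python
import time
from collections import defaultdict

class FixedWindowCounter:
    def __init__(self, limit: int, window_seconds: int):
        self.limit = limit
        self.window = window_seconds
        self.counters = defaultdict(int)  # (client_id, window index) -> count

    def allow(self, client_id: str) -> bool:
        window_index = int(time.time()) // self.window
        key = (client_id, window_index)
        if self.counters[key] >= self.limit:
            return False  # quota for this window is exhausted
        self.counters[key] += 1
        return True
```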
Sliding Window Log
Request timestamps are logged, and outdated timestamps are evicted from the log.
- Very accurate but not memory efficient
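A sketch of the log-based check; as in the usual description of this algorithm, timestamps are recorded even for rejected requests, which is part of why it is memory hungry:

```python
import time
from collections import deque

class SlidingWindowLog:
    def __init__(self, limit: int, window_seconds: float):
        self.limit = limit
        self.window = window_seconds
        self.log = deque()  # timestamps of recent requests

    def allow(self) -> bool:
        now = time.time()
        # Evict timestamps that have fallen out of the rolling window.
        while self.log and self.log[0] <= now - self.window:
            self.log.popleft()
        self.log.append(now)  # the timestamp is kept even if the request is rejected
        return len(self.log) <= self.limit
```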
Sliding Window Count
Previous requests are discounted in the rolling window: the estimated count is the current window's count plus the previous window's count weighted by the fraction of the previous window that the rolling window still overlaps.
- Not 100% accurate (it assumes requests in the previous window were spread evenly) but more memory efficient than the sliding window log
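A sketch of the weighted estimate described above (names are illustrative):

```python
import time

class SlidingWindowCounter:
    def __init__(self, limit: int, window_seconds: float):
        self.limit = limit
        self.window = window_seconds
        self.prev_count = 0
        self.curr_count = 0
        self.curr_idx = int(time.time() // self.window)

    def allow(self) -> bool:
        now = time.time()
        idx = int(now // self.window)  # index of the current fixed window
        if idx != self.curr_idx:
            # Roll forward; if a whole window was skipped, the previous count is 0.
            self.prev_count = self.curr_count if idx == self.curr_idx + 1 else 0
            self.curr_count = 0
            self.curr_idx = idx
        # Fraction of the previous window still covered by the rolling window.
        overlap = 1.0 - (now % self.window) / self.window
        if self.curr_count + self.prev_count * overlap >= self.limit:
            return False
        self.curr_count += 1
        return True
```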
Interview
An in-memory database such as Redis can be used to store the counters.
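For example, with the redis-py client, a fixed-window counter can lean on Redis's atomic INCR and EXPIRE commands (key naming is illustrative):

```python
import redis  # assumes the redis-py client

r = redis.Redis()

def allow(client_id: str, limit: int, window_seconds: int) -> bool:
    key = f"rate:{client_id}"
    count = r.incr(key)  # INCR is atomic inside Redis
    if count == 1:
        r.expire(key, window_seconds)  # start the window on the first request
    return count <= limit
```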
Interview
Configuration
The rate-limiting rules can be written in YAML files stored on disk.
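For instance, rules could look like the following, loaded here with PyYAML; the rule schema is illustrative rather than a standard:

```python
import yaml  # assumes PyYAML is installed

RULES = yaml.safe_load("""
rules:
  - domain: messaging
    key: message_send
    rate_limit:
      unit: minute
      requests_per_unit: 5
""")

for rule in RULES["rules"]:
    limit = rule["rate_limit"]
    print(rule["key"], "->", limit["requests_per_unit"], "per", limit["unit"])
```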
Dealing with Excessive Requests
- Reject them with an HTTP 429 (Too Many Requests) response (see the sketch after this list)
- Enqueue them to be processed later, e.g., when requests were rate limited because servers failed
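A sketch of the rejection response: HTTP 429 plus a Retry-After header; the X-Ratelimit-* header names are common conventions, not a formal standard:

```python
def throttled_response(limit: int, retry_after_seconds: int):
    # 429 Too Many Requests, with headers telling the client when to try again.
    headers = {
        "Retry-After": str(retry_after_seconds),
        "X-Ratelimit-Limit": str(limit),
        "X-Ratelimit-Remaining": "0",
    }
    return 429, headers, "Too many requests; please slow down."
```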
Distributed Environment
Race Condition
Two concurrent requests can read the same counter value before either writes the increment back, letting more requests through than the limit allows.
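One common fix is making the check-and-increment atomic, for example with a Redis Lua script via redis-py (key names and limits below are illustrative):

```python
import redis

r = redis.Redis()

# GET, compare, INCR, and EXPIRE run as one atomic unit inside Redis.
LUA = """
local current = tonumber(redis.call('GET', KEYS[1]) or '0')
if current >= tonumber(ARGV[1]) then
    return 0
end
if redis.call('INCR', KEYS[1]) == 1 then
    redis.call('EXPIRE', KEYS[1], ARGV[2])
end
return 1
"""
allow_script = r.register_script(LUA)

def allow(client_id: str, limit: int, window_seconds: int) -> bool:
    return allow_script(keys=[f"rate:{client_id}"], args=[limit, window_seconds]) == 1
```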
Synchronization
When multiple rate limiters are used for distributed load balancing, you can either use sticky sessions (routing each user's requests to the same rate limiter) or use a centralized data store such as Redis.
Optimization
- Route requests to nearest servers for low latency;
- Data synchronization with an eventual consistency model;
Monitoring
We want to make sure our rate limiters are working:
- The algorithm is effective (e.g., it handles traffic bursts)
- The rules are effective (no valid requests are dropped)
Interview
- Rate limiting can be applied at many layers, e.g., the application layer (HTTP, layer 7) or the network layer (IP, layer 3)
- Design products to reduce the need for rate limits:
    - Caching (avoid frequent repeated API calls)
    - Load balancing
    - Exceptions (handle throttling errors gracefully)
    - Retry (with sufficient back-off; see the sketch below)
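A client-side sketch of the retry idea, using exponential back-off with jitter (the helper name and parameters are illustrative):

```python
import random
import time

def call_with_backoff(request_fn, max_attempts: int = 5):
    for attempt in range(max_attempts):
        status, body = request_fn()  # request_fn returns (status_code, body)
        if status != 429:
            return status, body
        # Back off exponentially, with jitter so clients do not retry in lockstep.
        time.sleep(2 ** attempt + random.random())
    return status, body
```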