Thursday, May 2, 2024

Rate Limiting

One of the challenges we faced was figuring out how to reduce the prospect of abuse by rate limiting requests to the game API. One way to do it is to simply cap bursts at something like no more than five API calls per second. But with such a simple system, what would stop someone from making four requests a second for all 86,400 seconds in a day? That's over 345,000 requests.

A two-tiered system helps avoid longer-term abuse by rate limiting to, say, 1,000 requests per hour total AND rate limiting bursts to five per second. Implemented naively, this means storing each and every request in a database and querying that database on every single API call.

Imagine doing that across thousands, or even millions, of players. First, it adds up to millions of unnecessary database calls that would bring any decent server to a crawl. Second, the storage consumed just to keep track of API requests, and the IOPS spent writing and reading it, would be pure waste. Yet this is exactly how rate limiting is done in many applications.

There has to be a better way. We explored whether some more advanced math could help us here, perhaps some sort of summation function. I asked ChatGPT for ideas, and this is what it suggested:

  1. Fixed Window Counter:
    • In this approach, time is divided into fixed windows (e.g., every minute or hour). A counter for each client (often identified by IP or user ID) increments with each request and resets at the end of the window.
    • Pros: Simple to implement.
    • Cons: Can be abused at the boundary of time windows.
  2. Sliding Window Log:
    • The server keeps a log of all incoming requests for each client. When a request comes in, the server checks how many requests the client has made in the past time window (e.g., the past hour).
    • Pros: Smoother than the fixed window and avoids boundary conditions.
    • Cons: Requires more storage to keep the logs.
  3. Sliding Window Counter:
    • It’s a hybrid of the fixed and sliding window methods. Use a single counter that covers two time windows, ensuring that the rate limit is enforced smoothly over the boundary between the two windows.
    • Pros: Smoother than fixed windows without requiring as much storage as the log-based approach.
    • Cons: More complex to implement than the fixed window.
  4. Token Bucket:
    • Clients start with a full “bucket” of tokens. Each request removes a token from the bucket. Tokens are added back to the bucket at a fixed rate. If the bucket is empty, the request is denied.
    • Pros: Allows for “bursty” behavior. If a user hasn’t made requests for a while, they can make a burst of requests up to the bucket size.
    • Cons: Can be a bit more complicated to implement compared to fixed windows.
  5. Leaky Bucket:
    • Similar to the token bucket but inverted. Requests fill the bucket, and it leaks at a constant rate. If the bucket overflows, the request is denied.
    • Pros: Good for smoothing out bursts.
    • Cons: Like the token bucket, can be a bit more complicated to implement compared to fixed windows.
  6. Rate Limiting using Integral (or Summation) Functions:
    • This is an advanced method, where the rate limit is enforced based on an integral function of the user’s request rate. It’s less common in traditional web apps but can be useful in specific scenarios where the rate limit should adapt based on the user’s behavior or other factors.
  7. Distributed Rate Limiting:
    • If you have a distributed system or microservices architecture, you might need to implement rate limiting across multiple services. This often requires a central store (like Redis) to keep track of counts.
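As one concrete example from the list above, a token bucket fits in a few lines. This is an illustrative sketch, not any particular library's API; the class name, capacity, and refill rate are our own.

```python
import time

class TokenBucket:
    """Minimal token-bucket limiter: a bucket of `capacity` tokens,
    refilled at `refill_rate` tokens per second."""

    def __init__(self, capacity: float, refill_rate: float):
        self.capacity = capacity
        self.refill_rate = refill_rate
        self.tokens = capacity          # start with a full bucket
        self.last = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        # Refill proportionally to elapsed time, capped at capacity.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.refill_rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1            # spend one token for this request
            return True
        return False
```

A full bucket permits a burst up to `capacity`; after that, requests are admitted only as fast as tokens trickle back in.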

All of those have issues. Some still require gigantic amounts of storage. Others have high overhead with respect to IOPS. Some, like the fixed window counter, are easily abused once someone figures out where the window boundaries fall. The leaky bucket idea made us think about using a calculated number: the number of requests divided by a time period. We wanted to keep the rate limiting information down to a single row in the database: IP address, some sort of score or calculated number, and whether that IP address was banned for previous abuse. The last thing I wanted was to run complex table queries before the client even got to the meat of the API.

We played around with the numbers in Excel and realized that we would need to store the timestamp of the last API request; time is obviously a variable here and has to be tracked. Putting together the ideas of a calculated number, a summation function of sorts, and a tracked timestamp, plus ChatGPT's hint about leaky buckets, led us to a decay function. Here's what ChatGPT thought of that:

Using a decay function for rate limiting is a clever approach that allows you to apply rate limiting while using minimal storage. Instead of tracking individual request timestamps, you track two values: a score (or counter) and the timestamp of the last request. With each new request, the score is increased but also decays over time. If the score exceeds a threshold, the request is denied.

Here's the decay step, where delta is the time elapsed since the client's last request:

score = score * exp(-decay_rate * delta)

On each new request, the stored score is first decayed by this factor and then incremented by one to account for the request itself.

Only a single line in the database is needed. There are no complex database queries. The query is super fast and minimizes disk IOPS. It’s also resistant to abuse since you can’t time a window. In fact, combining this decay function with an absolute request limit also prevents long term abuse spread out over the course of a day.

We combined this rate limiting step with a strike system: if a client abuses the API and gets rate limited, they get a strike. Rack up enough strikes and the IP address is banned. Hopefully this proactive approach will deter bots and spam against the game API, creating an equitable and enjoyable experience for all players.
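The strike layer is equally small. A hypothetical sketch, with a made-up strike threshold and a dict standing in for the per-IP record:

```python
MAX_STRIKES = 3    # illustrative; ban after this many violations

records = {}       # ip -> {"strikes": int, "banned": bool}

def record_violation(ip: str) -> bool:
    """Register a rate-limit violation; return True if the IP is now banned."""
    rec = records.setdefault(ip, {"strikes": 0, "banned": False})
    rec["strikes"] += 1
    if rec["strikes"] >= MAX_STRIKES:
        rec["banned"] = True
    return rec["banned"]
```

The banned flag lives in the same row as the score, so the check costs nothing extra on the hot path.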
