Rate limiting is a concept commonly associated with web APIs: a rate limiter is a mechanism that controls the number of requests a client can make to an API within a certain time frame. It helps prevent API misuse or abuse by limiting the number of requests from a particular client or IP address, ensuring fair usage, and protecting server resources from being overwhelmed.
This capability has been built into ASP.NET Core as middleware since .NET 7.
In .NET, there are four limiter algorithms we can use:
1) Fixed window limiter
2) Sliding window limiter
3) Token bucket limiter
4) Concurrency limiter
We will explain each of them in a simplified manner.
1) Fixed window limiter
The fixed window limiter divides time into fixed windows of a pre-determined length and allows a set number of requests per window; once that number is reached, further requests are not served until the next window begins.
For example, with a 6-second window and a permit limit of 1, only one request can be processed per window. A request beyond that limit goes to a waiting queue, whose size we can also configure, for example 1 queued request. Once the queue is full as well, additional requests are rejected and an error response is returned to the client, typically HTTP 429 (Too Many Requests).
using System.Threading.RateLimiting;
using Microsoft.AspNetCore.RateLimiting;

builder.Services.AddRateLimiter(option =>
{
    option.AddFixedWindowLimiter("FixedWindowPolicy", opt =>
    {
        opt.PermitLimit = 1; // requests allowed per window
        opt.Window = TimeSpan.FromSeconds(6); // length of each fixed window
        opt.QueueLimit = 1; // requests allowed to wait in the queue
        opt.QueueProcessingOrder = QueueProcessingOrder.OldestFirst; // queued requests are processed oldest first
    }).RejectionStatusCode = 429; // 429 Too Many Requests is returned when the limit and queue are both full
});
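Note that registering policies is not enough on its own; the rate limiting middleware must also be added to the request pipeline after the app is built:

var app = builder.Build();

app.UseRateLimiter(); // activates the rate limiting middleware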
2) Sliding Window Limiter
The sliding window limiter is similar to the fixed window limiter but offers more flexibility: instead of resetting all permits at once at the window boundary, it divides the window into segments and slides forward one segment at a time, so capacity is freed gradually as older requests age out of the window.
For example, take a 6-second window that allows 4 requests, divided into 3 segments of 2 seconds each. Every 2 seconds the window slides forward by one segment: the requests counted in the segment that just dropped out of the window are returned to the available permit pool, while requests in the remaining segments still count against the limit. New requests are therefore admitted based on how many requests were made over the last 6 seconds as a whole, not just in the current fixed interval.
builder.Services.AddRateLimiter(option =>
{
    option.AddSlidingWindowLimiter("SlidingWindowPolicy", opt =>
    {
        opt.PermitLimit = 4; // requests allowed per window
        opt.Window = TimeSpan.FromSeconds(6); // length of the sliding window
        opt.QueueLimit = 5; // requests allowed to wait in the queue
        opt.QueueProcessingOrder = QueueProcessingOrder.OldestFirst; // queued requests are processed oldest first
        opt.SegmentsPerWindow = 3; // the window is divided into 3 segments of 2 seconds each
    }).RejectionStatusCode = 429; // 429 Too Many Requests is returned when the limit and queue are both full
});
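A note on the segment count: more segments make the window slide in smaller, more frequent steps, so capacity is released more smoothly, at the cost of a little extra bookkeeping per limiter. Here, 3 segments over a 6-second window means expired permits are recycled every 2 seconds.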
3) Token Bucket Limiter
The token bucket limiter is a straightforward yet effective rate limiting technique. It uses a bucket that holds a set number of tokens, each representing permission to handle one request. When a request arrives, a token is removed from the bucket; if the bucket is empty, the request is denied until more tokens become available. Tokens are replenished at a fixed rate over time, so the system can absorb bursts of requests followed by idle periods.
For example, in a token bucket rate limiting algorithm, you might have a bucket that can hold up to 100 tokens. Each incoming request consumes one token, so the bucket's token count decreases with each request. To ensure that the system can handle bursts of requests, new tokens are added to the bucket at a specified interval (e.g., 3 tokens every 5 seconds), without exceeding the maximum bucket size of 100 tokens. This mechanism prevents request flooding while allowing occasional spikes in request rate.
builder.Services.AddRateLimiter(option =>
{
    option.AddTokenBucketLimiter("TokenBucketPolicy", opt =>
    {
        opt.TokenLimit = 100; // maximum number of tokens the bucket can hold
        opt.QueueLimit = 10; // requests allowed to wait in the queue
        opt.QueueProcessingOrder = QueueProcessingOrder.OldestFirst; // queued requests are processed oldest first
        opt.ReplenishmentPeriod = TimeSpan.FromSeconds(5); // how often new tokens are added
        opt.TokensPerPeriod = 3; // number of tokens added each period
    }).RejectionStatusCode = 429; // 429 Too Many Requests is returned when no tokens are available and the queue is full
});
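With these values, sustained throughput works out to 3 tokens every 5 seconds, i.e. 0.6 requests per second on average, while the 100-token bucket lets a client that has been idle burst up to 100 requests at once before being throttled back down to the replenishment rate.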
4) Concurrency limiter
The concurrency limiter limits the number of requests that execute at the same time. Each request acquires a permit, reducing the available concurrency by one; when the request completes, the permit is released and the limit increases by one again. Unlike the other three algorithms, it has no time window; it only caps how many requests are in flight at once.
For example, with a concurrency limit of 1, only one request can be processed at a time; any additional concurrent request must wait in the queue or be rejected.
builder.Services.AddRateLimiter(option =>
{
    option.AddConcurrencyLimiter("ConcurrencyPolicy", opt =>
    {
        opt.PermitLimit = 1; // maximum number of concurrent requests allowed
        opt.QueueLimit = 10; // requests allowed to wait in the queue
        opt.QueueProcessingOrder = QueueProcessingOrder.OldestFirst; // queued requests are processed oldest first
    }).RejectionStatusCode = 429; // 429 Too Many Requests is returned when the limit and queue are both full
});
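Beyond setting RejectionStatusCode, the options also expose an OnRejected callback that runs for every rejected request, which can be used to customize the rejection response. A minimal sketch that advertises a retry delay (the 5-second value is purely illustrative):

builder.Services.AddRateLimiter(option =>
{
    // ... policy registrations as above ...

    option.OnRejected = (context, cancellationToken) =>
    {
        // Suggest when the client may retry (illustrative value).
        context.HttpContext.Response.Headers.RetryAfter = "5";
        return ValueTask.CompletedTask;
    };
});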
Finally, to attach a policy to your endpoints, you can use either of two approaches:
Using a controller endpoint:
[HttpGet("GetConcurrency")]
[EnableRateLimiting("ConcurrencyWindowPolicy")]
public IActionResult GetConcurrency()
{
return Ok("Concurrency");
}
[HttpGet("GetFixed")]
[EnableRateLimiting("FixedWindowPolicy")]
public IActionResult GetFixed()
{
return Ok("Fixed");
}
[DisableRateLimiting]
public ActionResult NoLimit()
{
return Ok("NoLimit");
}
The EnableRateLimiting attribute is used to apply a specified rate limiting policy to an action method. When this attribute is applied to a method, the rate limiting rules defined by the specified policy are enforced for that method. This means that any incoming requests to the action will be subject to the constraints of the policy, such as the maximum number of requests allowed in a given time period.
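Note that [EnableRateLimiting] can also be placed on the controller class itself, in which case the policy applies to every action in that controller; [DisableRateLimiting] on an individual action, as in the NoLimit example above, then opts that action back out.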
Using a minimal API endpoint:
app.MapGet("/", () => Results.Ok("fixed")).RequireRateLimiting("FixedWindowPolicy");
The RequireRateLimiting method is similar to EnableRateLimiting, but it is used in the context of endpoint routing. It attaches a rate limiting policy directly to an endpoint defined in the routing configuration. When a request matches the route, the specified rate limiting policy is applied.
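Since RequireRateLimiting is an extension on IEndpointConventionBuilder, the same call also works on a route group, applying the policy to every endpoint in the group. A brief sketch (the "/api" prefix is just an illustration):

var group = app.MapGroup("/api").RequireRateLimiting("FixedWindowPolicy");
group.MapGet("/fixed", () => Results.Ok("fixed"));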
Importance of Rate Limiting for API Providers
1. Maintaining Service Reliability
Rate limiting helps ensure that an API remains responsive and available, even under heavy load.
2. Preventing Abuse
APIs are often targets for malicious activities, such as denial-of-service (DoS) attacks, scraping, and other forms of abuse. Rate limiting acts as a safeguard against these threats by limiting the rate at which requests can be made, thereby reducing the impact of automated attacks and ensuring that legitimate users can access the service.
3. Ensuring Fair Resource Allocation
Rate limiting ensures that resources are distributed fairly among all users. Without rate limiting, a few users might consume a disproportionate amount of server resources, leading to poor performance or service degradation for others.
Conclusion
These rate limiting algorithms can be fine-tuned using additional parameters, such as client IP addresses or user authorization levels, to impose more detailed limits tailored to specific user profiles. By implementing rate limiting with precise parameters, API providers can ensure optimal performance, fair resource allocation, and protection against abuse across all user interactions with the API.
Fine-Tuning Rate Limiting
Client IP Addresses: Apply rate limits based on the IP address of the client, allowing more granular control over request rates from different sources (see the sketch after this list).
User Authorization Levels: Differentiate rate limits based on user roles or authorization levels, providing higher limits for premium users and lower limits for regular users.
Custom Rules: Create custom rate limiting rules to accommodate specific application needs, such as limiting requests based on geographic regions or specific API endpoints.
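ASP.NET Core supports this kind of partitioning through PartitionedRateLimiter, which assigns each request to a partition (for example, an IP address or user name) and gives every partition its own limiter. A minimal sketch of a global per-IP limit, assuming illustrative numbers of 10 requests per 10 seconds per client:

builder.Services.AddRateLimiter(option =>
{
    // Partition requests by client IP: each distinct IP gets its own fixed window.
    option.GlobalLimiter = PartitionedRateLimiter.Create<HttpContext, string>(httpContext =>
        RateLimitPartition.GetFixedWindowLimiter(
            partitionKey: httpContext.Connection.RemoteIpAddress?.ToString() ?? "unknown",
            factory: _ => new FixedWindowRateLimiterOptions
            {
                PermitLimit = 10, // 10 requests...
                Window = TimeSpan.FromSeconds(10) // ...per 10 seconds, per IP
            }));

    option.RejectionStatusCode = 429;
});

Per-user or per-role limits follow the same pattern: key the partition on the authenticated user name or role instead of the remote IP.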
More resources
https://www.youtube.com/watch?v=bOfOo3Zsfx0
https://learn.microsoft.com/en-us/aspnet/core/performance/rate-limit?view=aspnetcore-8.0