Rate limiting plays a major role in application security, especially in defending web applications against malicious bot attacks, credential stuffing, brute force attacks and excessive API calls. It controls the number of requests a client or a specific IP address can send over a specified time period, so that systems keep functioning properly instead of being overwhelmed.
In this article, we will learn what rate limiting is and how WAF rate limiting acts as the first layer of defence to limit network traffic, avoid service disruption and maintain access for legitimate users.
What is Rate Limiting?
Put simply, a rate limit is a traffic control mechanism. It limits how many requests a client can send in a specific time frame. If too many requests arrive too quickly from a single IP address or API key, further requests may be blocked or delayed. This helps mitigate attacks such as DDoS and restricts request volumes that could crash web servers or overload APIs. Security teams use tools like rule statements and rule builders to define criteria such as request rate, URI path or other parameters to fine-tune protections.
Whether you apply rate limiting at the application layer or manage it through AWS Firewall Manager, it is important to strike a balance between blocking threats and maintaining access for valid users.
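To make the idea concrete, here is a minimal sketch of a fixed-window counter, one common way to count requests per client in a time frame. The function name `createFixedWindowLimiter` and its parameters are illustrative assumptions, not part of any library, and the in-memory `Map` only works within a single process.

```javascript
// Fixed-window counter: at most `limit` requests per `windowMs` for each client key.
// createFixedWindowLimiter is a hypothetical helper written for this sketch.
function createFixedWindowLimiter(limit, windowMs) {
  const windows = new Map(); // key -> { start, count }

  return function allow(key, now = Date.now()) {
    const entry = windows.get(key);
    // Start a fresh window if none exists or the previous one has expired.
    if (!entry || now - entry.start >= windowMs) {
      windows.set(key, { start: now, count: 1 });
      return true;
    }
    if (entry.count < limit) {
      entry.count += 1;
      return true;
    }
    return false; // over the limit: the caller may block or delay the request
  };
}

// Usage: three requests per second for one IP address (timestamps in ms).
const allowRequest = createFixedWindowLimiter(3, 1000);
const decisions = [0, 100, 200, 300, 1000].map((t) => allowRequest('203.0.113.7', t));
console.log(decisions); // [ true, true, true, false, true ]
```

The fourth request is rejected because it falls inside the same one-second window; the fifth is accepted because a new window has started.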
Types of Rate Limits
Below is an example of a web application login route that has no rate limiting and is therefore open to brute force attacks:
// Express login route without rate limiting
const express = require('express');
const app = express();
app.use(express.json()); // parse JSON bodies so req.body is populated

app.post('/login', (req, res) => {
  const { username, password } = req.body;
  // Simulated login logic (no rate limiting!)
  if (username === 'admin' && password === 'password123') {
    res.send('Login successful');
  } else {
    res.status(401).send('Unauthorized');
  }
});
Here, the application is vulnerable: because no rate limiting is implemented, an attacker can submit an unlimited number of login attempts.
Administrators may use various parameters or methods when configuring a rate limit. There are three main approaches that any organisation can use:
User Rate Limits
This is the most common method of setting up a rate limit. Each user’s requests are counted, tracked by IP address or API key, and once the count exceeds the defined threshold, further requests are blocked.
Use Case: An e-commerce platform’s login page enforces a maximum of five unsuccessful login attempts per minute per IP address. This hinders brute force attacks, in which attackers try to guess passwords by submitting credentials in rapid succession. Once the threshold is reached, further requests are temporarily blocked, which protects user accounts from compromise.
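The use case above could be sketched as an Express-style handler that counts failed logins per IP address. This is an illustration under stated assumptions: `createLoginGuard` is a hypothetical helper, the counters live in memory (so they are lost on restart and not shared between servers), and the credential check is the simulated one from the vulnerable example.

```javascript
// Hypothetical helper: tracks failed logins per IP in memory (single process only).
function createLoginGuard({ maxFailures = 5, windowMs = 60000 } = {}) {
  const failures = new Map(); // ip -> timestamps of recent failed attempts

  // Drop timestamps older than the window and return the remaining ones.
  function recent(ip, now) {
    const kept = (failures.get(ip) || []).filter((t) => now - t < windowMs);
    failures.set(ip, kept);
    return kept;
  }

  return {
    isBlocked(ip, now = Date.now()) {
      return recent(ip, now).length >= maxFailures;
    },
    recordFailure(ip, now = Date.now()) {
      recent(ip, now).push(now);
    },
  };
}

const guard = createLoginGuard(); // five failures per minute per IP

// Express-style handler: req.ip, res.status() and res.send() follow Express conventions.
function loginHandler(req, res) {
  if (guard.isBlocked(req.ip)) {
    return res.status(429).send('Too many failed attempts, try again later');
  }
  const { username, password } = req.body;
  // Simulated credential check, as in the vulnerable example.
  if (username === 'admin' && password === 'password123') {
    return res.send('Login successful');
  }
  guard.recordFailure(req.ip);
  return res.status(401).send('Unauthorized');
}
```

In an Express application this handler could be registered with `app.post('/login', loginHandler)`. When the application sits behind a proxy or load balancer, `req.ip` only reflects the real client address if Express's `trust proxy` setting is configured accordingly.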
Geographic Rate Limits
Geographic rate limits vary the threshold according to where requests come from and when. For instance, developers can predict that users within a certain region will be inactive from midnight to 9:00 am and establish a lower rate limit for this period. This measure effectively counters traffic that appears suspicious and adds an extra layer of protection against attacks.
Use Case: A UK-based fintech company sees almost no legitimate traffic from its UK users from midnight to 6 am GMT. During those hours it sets a lower rate limit for traffic from non-UK sources, such as regions outside its customer base. This can prevent automated scraping or credential stuffing attacks launched in off-hours, when the security team is not monitoring traffic in real time.
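A rough sketch of how such a policy might select a threshold is shown below. The numeric limits, the `policy` object and the `limitFor` function are assumptions made for illustration; the country code is assumed to come from a GeoIP lookup performed elsewhere.

```javascript
// Illustrative policy: the limits and field names are assumptions for this sketch.
const policy = {
  homeCountry: 'GB',
  quietHoursUtc: { start: 0, end: 6 }, // midnight to 6 am GMT
  normalLimit: 100, // requests per minute
  reducedLimit: 20, // requests per minute for non-home traffic in quiet hours
};

// Return the per-minute limit for a request from `countryCode` at time `date`.
function limitFor(countryCode, date, p = policy) {
  const hour = date.getUTCHours();
  const quiet = hour >= p.quietHoursUtc.start && hour < p.quietHoursUtc.end;
  return quiet && countryCode !== p.homeCountry ? p.reducedLimit : p.normalLimit;
}

console.log(limitFor('GB', new Date('2024-01-15T03:00:00Z'))); // 100
console.log(limitFor('US', new Date('2024-01-15T03:00:00Z'))); // 20
console.log(limitFor('US', new Date('2024-01-15T12:00:00Z'))); // 100
```

The selected limit would then be passed to whichever counter enforces the rule, so the time and location logic stays separate from the counting logic.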
Server Rate Limits
Developers can implement rate limits at the server level if particular servers are assigned to handle specific parts of an application. This method is more flexible because developers can raise the rate limit on frequently used servers while lowering it on less-used ones.
Use Case: A travel booking API restricts calls to the /book endpoint to 10 requests per API key per minute and to the /search endpoint to 100 requests per minute. This makes it much harder for users to spam or automate orders while still allowing regular availability checks. It also protects against API abuse, which may flood backend services or unbalance data views.
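The per-endpoint limits in this use case might be expressed as follows. This is a simplified sketch: `allowCall` is a hypothetical function, counters are kept in memory, and paths without a configured limit are simply allowed.

```javascript
// Requests per minute per API key, taken from the use case above.
const endpointLimits = new Map([
  ['/book', 10],
  ['/search', 100],
]);
const WINDOW_MS = 60000;
const counters = new Map(); // "apiKey|path" -> { start, count }

// Hypothetical check: returns true if this call is within the endpoint's limit.
function allowCall(apiKey, path, now = Date.now()) {
  const limit = endpointLimits.get(path);
  if (limit === undefined) return true; // no rule for this path in the sketch
  const id = `${apiKey}|${path}`;
  const entry = counters.get(id);
  if (!entry || now - entry.start >= WINDOW_MS) {
    counters.set(id, { start: now, count: 1 }); // first call in a new window
    return true;
  }
  entry.count += 1;
  return entry.count <= limit;
}
```

Because the counter key combines the API key and the path, a client that exhausts its /book allowance can still run availability searches against /search.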
How Web Application Rate Limiting Security Protects Against Common Attacks
Web applications and APIs now handle a high volume of incoming requests, some of which are not legitimate. Rate limiting is a powerful mechanism for controlling service requests from individual users, API keys or specific IP addresses. By implementing rate limit rules according to request type, frequency or URI path, organisations can effectively reduce risk and protect the integrity of their web servers and application security posture.
Case Study: Stopping API Abuse in a SaaS Platform
A freemium API SaaS business was having performance problems caused by API abuse: people were scraping data using automated scripts. By enabling API rate limiting with Cloudflare rate limiting rules, they limited API calls to 200 per hour for every API key.
Results:
- Improved API latency by 60%
- Cut abusive traffic by more than 85%
- No impact on valid API users’ performance
Below are some of the attacks that can be prevented by using rate-limiting solutions:
Credential Stuffing
Attackers use tools like Burp Suite’s Intruder to automate login requests against login panels and other authentication mechanisms, replaying username and password pairs leaked from other breaches or repeatedly guessing credentials. If rate limiting is not enabled or is configured improperly, these automated attempts can easily compromise accounts. Organisations can use a rate-based rule to limit requests from a source IP address, or based on the login parameters, and block subsequent requests once the defined threshold is reached.
Denial of Service (DoS) / Distributed Denial of Service (DDoS) Attacks
In this attack, attackers overwhelm web applications by continuously sending a massive number of requests from multiple accounts or botnets. Organisations can identify and filter suspicious network traffic by implementing rate-limiting algorithms at the application layer. In an AWS environment, services like AWS WAF, AWS Shield Advanced or AWS Firewall Manager can also be used.
Web Scraping
Web scraping is now very common: malicious bots crawl websites to extract large amounts of sensitive data, such as personal data, pricing information or proprietary content. Organisations can combine rate limits with rule groups and text transformation to prevent excessive crawling by bots that pose as legitimate crawlers.
API Abuse
Public-facing APIs are exposed to abuse by attackers attempting to overload endpoints or exploit business logic. With API rate limiting, organisations can specify the maximum number of requests allowed from a client within a specified timeframe. This way, they can prevent abuse of the APIs without affecting how the web applications work for valid use cases.
How WAFs Implement Rate Limiting
Web Application Firewalls (WAFs) play an important part in rate-limiting security by inspecting and filtering incoming requests to web applications, which protects applications from abuse. In a WAF, rate limiting is achieved by defining rules that limit requests from a source IP address once a threshold is reached. These rules can be applied through the rule builder or rule groups in the WAF’s configuration.
A WAF monitors all web traffic and tracks it by URI path, request rate or IP address. For example, if a WAF rule allows only 100 requests per minute from a single IP, all further requests within that minute are throttled or blocked, protecting the application from brute force attacks and other attempts to flood the servers.
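How a WAF evaluates such a rule internally varies by product and is not specified here. One plausible approach is a sliding-window log, sketched below with a hypothetical `createSlidingWindowLimiter` helper: it remembers the timestamps of accepted requests per IP and counts only those inside the last minute.

```javascript
// Sliding-window log: count accepted requests in the last `windowMs` for each IP.
// This is an illustrative sketch, not how any particular WAF is implemented.
function createSlidingWindowLimiter(limit, windowMs) {
  const log = new Map(); // ip -> timestamps of accepted requests

  return function allow(ip, now = Date.now()) {
    // Keep only the timestamps that are still inside the window.
    const recent = (log.get(ip) || []).filter((t) => now - t < windowMs);
    const allowed = recent.length < limit;
    if (allowed) recent.push(now);
    log.set(ip, recent);
    return allowed;
  };
}

// 100 requests per minute from a single IP, as in the example above.
const allowIp = createSlidingWindowLimiter(100, 60000);
let accepted = 0;
for (let i = 0; i < 150; i += 1) {
  // 150 requests sent 100 ms apart, i.e. all within 15 seconds.
  if (allowIp('198.51.100.4', i * 100)) accepted += 1;
}
console.log(accepted); // 100
```

Unlike a fixed window, this approach does not reset all at once: capacity is freed gradually as the oldest requests age out, at the cost of storing one timestamp per accepted request.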
It is the administrator’s responsibility to configure the WAF policies properly and apply different thresholds for individual users or IP addresses, so that legitimate users are not disrupted. WAFs also provide text transformation features, which normalise inputs and make suspicious activity easier to detect. This makes it harder for attackers to bypass the WAF filters.
A WAF can take various automated actions when the maximum number of requests allowed in the given timeframe is reached. The main actions are:
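As a rough illustration of what a text transformation does, the sketch below normalises a request path before it is matched against a rule. The `normalisePath` function is hypothetical, and real WAF transformations are more extensive; the point is only that differently encoded variants of the same path end up counted under one rule.

```javascript
// Hypothetical normalisation step applied before rule matching.
function normalisePath(rawPath) {
  let path = rawPath;
  try {
    path = decodeURIComponent(rawPath); // e.g. "%67" becomes "g"
  } catch (err) {
    // Malformed escape sequence: fall back to the raw value.
  }
  // Lowercase and collapse repeated slashes so variants map to one form.
  return path.toLowerCase().replace(/\/{2,}/g, '/');
}

console.log(normalisePath('/LOGIN')); // /login
console.log(normalisePath('//login')); // /login
console.log(normalisePath('/lo%67in')); // /login
```

Without this step, an attacker could spread requests across `/login`, `/LOGIN` and `/lo%67in` to stay under a limit that is keyed on the literal path.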
- Deny the new request completely
- Throttle the response
- Count the request (for monitoring in count mode)
- Send logs to the SIEM for real-time monitoring and analysis of security events
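The actions listed above can be pictured as a small dispatch step that runs once a rule has been triggered. In this sketch the action names, the returned object shape, the 429 status and the two-second delay are all assumptions chosen for illustration; the `eventLog` array stands in for whatever pipeline forwards events to a SIEM.

```javascript
// Decide what to do with a request that exceeded a rate limit.
// Action names mirror the list above; the return shape is an assumption of this sketch.
function applyAction(action, request, eventLog) {
  // Record every triggered rule so it can be forwarded for monitoring.
  eventLog.push({ ip: request.ip, path: request.path, action, at: request.at });

  switch (action) {
    case 'deny':
      return { forward: false, status: 429, delayMs: 0 }; // reject outright
    case 'throttle':
      return { forward: true, status: null, delayMs: 2000 }; // serve, but slowly
    case 'count':
      return { forward: true, status: null, delayMs: 0 }; // monitor only
    default:
      throw new Error(`Unknown action: ${action}`);
  }
}

const events = [];
const outcome = applyAction('count', { ip: '203.0.113.9', path: '/login', at: 0 }, events);
console.log(outcome.forward, events.length); // true 1
```

Running a new rule in count mode first, as in the usage line, lets administrators see how often it would fire before switching it to deny.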
There are also advanced solutions available, like Fastly Next-Gen WAF and Azure Front Door, which enhance the protection by using machine learning.
Configuring Rate Limiting Rules
Configuring rate-limiting rules is not a matter of picking an arbitrary number as a static threshold. It is important to understand the normal traffic pattern of the application first. Specific rules can then be created to match real usage and to prevent or filter abusive requests. The objective of rate limiting is to specify the number of requests that an IP address, API key or session can make to the target in a given time period.
Administrators use rule builders to create rate-based rules customised for their environment. These policies may cap login attempts at 10 per minute per IP to protect against brute force attacks, or limit data-fetching APIs to 200 service requests per minute to prevent web scraping. Policies can apply to all traffic globally or only to certain paths, using parameters such as URI path so that only relevant traffic counts towards the limit.
Whenever a rule is triggered, the WAF automatically takes the pre-defined action, such as blocking or limiting the requests. Some WAFs offer more flexible responses, such as issuing CAPTCHA challenges, delaying responses or flagging the session for monitoring purposes. These responses can reduce the risk of blocking good bots and legitimate users during traffic spikes.
The most important part of this process is tuning. If the limits are too restrictive, they may harm the user experience; if they are too lenient, they may fail to stop abuse. This particularly applies to APIs, where API rate limiting must account for burst traffic and diverse usage patterns from different users. Administrators must examine traffic patterns over time and update rules based on real usage and threat intelligence.
From a performance standpoint, excessive logging or overly fine-grained rules may consume WAF resources and increase web server latency. To reduce the impact, administrators should apply rate limiting selectively, avoid deep inspection of all traffic, and use request aggregation techniques to streamline processing.
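The kind of rule set a rule builder produces might look like the sketch below. The field names, the `/api/data` path, the global default of 1000 requests and the `matchRule` function are all hypothetical and not taken from any WAF product; only the login and data-fetching limits come from the text above.

```javascript
// Hypothetical rule set; field names are illustrative, not a real WAF schema.
const rules = [
  { name: 'login-bruteforce', pathPrefix: '/login', key: 'ip', limit: 10, windowSec: 60, action: 'block' },
  { name: 'data-scraping', pathPrefix: '/api/data', key: 'apiKey', limit: 200, windowSec: 60, action: 'throttle' },
  { name: 'global-default', pathPrefix: '/', key: 'ip', limit: 1000, windowSec: 60, action: 'count' },
];

// Pick the most specific rule (longest matching path prefix) for a request path.
function matchRule(path, ruleSet = rules) {
  const candidates = ruleSet
    .filter((r) => path.startsWith(r.pathPrefix))
    .sort((a, b) => b.pathPrefix.length - a.pathPrefix.length);
  return candidates[0] || null;
}

console.log(matchRule('/login').name); // login-bruteforce
console.log(matchRule('/api/data/items').name); // data-scraping
console.log(matchRule('/about').name); // global-default
```

Scoping by path means a burst of traffic to static pages does not consume the much smaller allowance reserved for the login endpoint. Note that simple prefix matching would also catch a path such as `/loginhelp`, so a production rule would likely match on path segments instead.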
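One widely used way to accommodate burst traffic is a token bucket, which allows short bursts up to a fixed capacity while enforcing a steady average rate. The sketch below is a minimal in-memory version; `createTokenBucket` and the capacity and refill values are assumptions chosen for illustration, not recommendations.

```javascript
// Token bucket: each key may burst up to `capacity` requests, refilled at `refillPerSec`.
function createTokenBucket(capacity, refillPerSec) {
  const buckets = new Map(); // key -> { tokens, last }

  return function take(key, now = Date.now()) {
    const b = buckets.get(key) || { tokens: capacity, last: now };
    // Add the tokens earned since the last request, capped at capacity.
    const elapsedSec = (now - b.last) / 1000;
    b.tokens = Math.min(capacity, b.tokens + elapsedSec * refillPerSec);
    b.last = now;
    const allowed = b.tokens >= 1;
    if (allowed) b.tokens -= 1; // spend one token for this request
    buckets.set(key, b);
    return allowed;
  };
}

const take = createTokenBucket(5, 1); // burst of 5, then 1 request per second
const burst = [0, 0, 0, 0, 0, 0].map((t) => take('key-1', t));
console.log(burst); // [ true, true, true, true, true, false ]
console.log(take('key-1', 2000)); // true
```

Tuning then becomes a matter of adjusting two values: the capacity controls how large a legitimate burst can be, and the refill rate controls the sustained request rate.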
Rate Limiting as Part of a Broader Security Posture
Rate limiting is an effective strategy, but only against a limited class of attacks, mainly abuse involving high or malicious request rates. On its own, it can’t protect against complex injection attacks, session hijacking or application logic errors. Rate limiting therefore needs to be part of an overall application security strategy.
To build a strong security posture, multiple layers of defence are required. These layers include WAF rulesets for known vulnerabilities, access control policies, behavioural analysis to detect anomalies and continuous traffic monitoring. For example, rate limiting may be able to block repeated login attempts, but other WAF rules should also inspect request payloads, URI paths and headers to detect evasion attempts.
In addition, regular logging and analysis of rate-limited events give helpful indications of attack patterns and usage anomalies. When integrated with a Security Information and Event Management (SIEM) system, these help teams correlate rate-limiting triggers with other network security events.
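As a small illustration of this kind of analysis, the sketch below summarises rate-limited events by source IP to surface the most frequent offenders. The event shape (`ip`, `path`) and the `topOffenders` function are assumptions; in practice this aggregation would usually be a query inside the SIEM itself.

```javascript
// Count rate-limited events per IP and return the `n` most frequent sources.
function topOffenders(events, n = 3) {
  const counts = new Map();
  for (const e of events) {
    counts.set(e.ip, (counts.get(e.ip) || 0) + 1);
  }
  return [...counts.entries()]
    .sort((a, b) => b[1] - a[1]) // highest hit count first
    .slice(0, n)
    .map(([ip, hits]) => ({ ip, hits }));
}

// Sample events, invented purely for this example.
const sampleEvents = [
  { ip: '198.51.100.1', path: '/login' },
  { ip: '198.51.100.2', path: '/login' },
  { ip: '198.51.100.1', path: '/login' },
  { ip: '198.51.100.1', path: '/api/data' },
];
console.log(topOffenders(sampleEvents, 1)); // [ { ip: '198.51.100.1', hits: 3 } ]
```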
Cyphere enables companies to enhance their overall security posture by integrating WAF rate limiting with vulnerability management, incident response workflows and threat intelligence. Our strategy ensures that rate limiting is not a standalone feature but one aspect of a well-coordinated defence plan that adjusts to your ecosystem and risk profile.
Enhance your Defence Against Abusive Traffic
Rate limiting is one of the best and most efficient ways to stop abusive traffic from flooding your web applications, especially when the number and complexity of cyberattacks continue to increase. Whether you’re facing malicious bot attacks, credential stuffing or excessive API calls, implementing intelligent rate limits can help you reduce risks without affecting legitimate users.
Web Application Firewalls (WAFs) are an important tool for applying these rate-limiting controls. Modern WAFs give granular control over the number of requests permitted per IP address, session or API key within a specified time frame, through rule-based filtering, dynamic traffic inspection and real-time monitoring. This makes it possible to block, throttle or flag incoming requests that exceed normal behaviour, and to protect your applications from denial of service (DoS) attacks.
At Cyphere, we encourage businesses to apply rate-limiting best practices as part of a comprehensive application security strategy. Our team ensures your defences are in line with the evolving threat landscape, whether you need help configuring custom rate-limiting rules, adjusting thresholds to prevent false positives, or connecting your WAF with SIEM systems for real-time event visibility.
Contact our team to improve your defences against traffic abuse and maintain reliable, secure application performance.
FAQs
What are Cloudflare rate-limiting rules, and how do they work?
Cloudflare rate limiting rules enable you to set thresholds for incoming requests based on several factors, such as IP address, URI path or headers. When a threshold is exceeded, Cloudflare can block, challenge or log the requests, preventing abuse and helping to keep the application reliable.
What is the rate-limiting response code for Cloudflare?
The rate-limiting response code for Cloudflare is HTTP 429 (Too Many Requests). This status code indicates that a client has sent too many requests to the server in a given period and has triggered the rate limit rule.
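A well-behaved client can react to a 429 by waiting before it retries. The HTTP specification allows a 429 response to carry a `Retry-After` header, although whether a given service sends one is an assumption to verify. The hypothetical `retryDelayMs` helper below handles only the seconds form of the header and falls back to a default delay otherwise.

```javascript
// Work out how long a client should wait before retrying a request.
// `headers` is assumed to be a plain object with lowercase header names.
function retryDelayMs(status, headers, fallbackMs = 1000) {
  if (status !== 429) return 0; // not rate limited: no wait needed
  const raw = headers['retry-after'];
  if (typeof raw === 'string' && raw.trim() !== '') {
    const seconds = Number(raw);
    if (Number.isFinite(seconds) && seconds >= 0) return seconds * 1000;
  }
  // Header missing or in the HTTP-date form, which this sketch does not parse.
  return fallbackMs;
}

console.log(retryDelayMs(200, {})); // 0
console.log(retryDelayMs(429, { 'retry-after': '30' })); // 30000
console.log(retryDelayMs(429, {})); // 1000
```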



