Denial of Service: A Survival Guide
From Anonymous-style SYN flooding to application-layer attacks, denial of service (DOS) is a subject the general public often confuses with hacking. While your data might not be stolen, the impact on both sales and reputation can be tremendous, especially for a DOS that persists over a long period of time. When it comes down to it, the amount of data that can be transferred from one point to another and the processing power available are always limited at some point. If you overload either of the two, requests get delayed until they are eventually dropped, creating a denial of service. Furthermore, there are many bottlenecks that can be exploited to slow the whole system down while sending far less traffic than a normal request would require for the same impact on server load. This is why the battle against denial of service always was and always will be asymmetrical: the resources spent preventing it are much greater than those required by the attacker. Moreover, as more systems and processes are added to reduce the risk of DOS, the attack surface expands. We will take a look at some of the measures that can be applied and the common pitfalls to avoid in the process.
Multiple considerations come into play when building a system that is resistant to these types of attacks. First of all, the application type and its context must be considered, as they usually dictate the following variables:
- Typical peak usage
- Acceptable response time
- Acceptable down time
Some of the common bottlenecks that can be exploited are as follows:
- Total bandwidth available (in bytes/s)
- Disk space available
- Memory (RAM) available
- Processing power available
- Similar resources for backend infrastructure (database, other systems)
In order to have a robust application, the key is to address each of these issues and limit the effects of bottlenecks as much as possible with a good architecture.
Before anything, you may want to make sure that the server software you are using is not vulnerable to known DOS vulnerabilities. Those can easily be identified using a vulnerability scanner. One of the most common DOS attacks is Slowloris, which affects Apache. The attack consists of sending an HTTP request in fragments and waiting the maximum allowed time before sending the next chunk. By spacing these packets out at long intervals, it is possible to stall a single Apache worker and prevent it from serving other clients. Done concurrently, this can use up all the connections the server allows and thus prevent any legitimate client from connecting. In the case of Slowloris, the core of the issue lies in Apache's connection handling not being asynchronous, and as such the problem has yet to be completely fixed. Patches as well as IDS/IPS rules do exist that mitigate the vulnerability, either by banning IP addresses making suspicious repeated requests or by lowering the timeout values. In most cases, updating your web server as well as your SSL/TLS libraries will fix known vulnerabilities.
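For instance, Apache's mod_reqtimeout module can enforce per-phase timeouts and minimum transfer rates on incoming requests, which cuts off clients that trickle data Slowloris-style. The directive below is a minimal sketch; the exact values are an assumption and should be tuned to your real traffic:

# Drop connections that are too slow sending headers or body
# (requires mod_reqtimeout, shipped with Apache since 2.2.15)
RequestReadTimeout header=20-40,MinRate=500 body=20,MinRate=500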
You will then want to take steps to offload any resources that could be moved to the near-unlimited capacity of the cloud. Usually you will want your static content to be hosted on a CDN such as Akamai (a Trustwave partner) in order to reduce the load on your servers. This often has the side effect of speeding up page load times, since CDNs have servers distributed locally around the world. However, directly moving resources that are loaded by your HTML pages does introduce new security risks. For example, it may be possible to move your jQuery library to a CDN, but what would happen if that CDN were compromised? An attacker with control over the content served by the CDN could inject a malicious script into the library, with a result similar to a stored cross-site scripting attack. If resources are moved to the cloud, make sure to always check their integrity on systems which you control before using them.
For example, the new subresource integrity standard can be used to mitigate those attacks within the browser. By adding a hash to the HTML file, it is possible to verify the integrity of the file before it is loaded. If the hashes do not match, the file is not loaded into the DOM. An example can be found below:
<script src="https://code.jquery.com/jquery-2.1.4.min.js" integrity="sha384-R4/ztc4ZlRqWjqIuvf6RX5yb/v90qNGx6fS48N0tRxiGkqveZETq72KgDVJCp2TC" crossorigin="anonymous"></script>
As an added bonus, this can be the first step towards removing inline JavaScript in order to use the new CSP headers to help protect against XSS.
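For example, a response header along these lines (the origins here are hypothetical) restricts script execution to your own domain and a whitelisted CDN, so injected inline scripts will not run:

Content-Security-Policy: script-src 'self' https://code.jquery.com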
Load balancers and caching methods
A load balancer can be used when multiple front-end servers are available; several products exist, such as F5's BigIP. The idea is to spread the load evenly across multiple servers. While this does not inherently reduce the load, it lowers the risk of resource exhaustion becoming an issue and may allow you to scale better. Other proven methods involve caching mechanisms. Some, such as Varnish, handle ordinary caching of static resources, while others, such as memcache or Redis, are embedded into the code in order to cache content dynamically. As usual, when adding new layers it is important to understand the increased attack surface that comes with them.
For both BigIP and Redis, make sure that the administrative interfaces are not publicly accessible. BigIP, for example, used to ship with a hard-coded SSH key that allowed any user root access on the machine, while unauthenticated access to a Redis server allows an attacker to execute arbitrary Lua code.
When dealing with technologies that are application-bound such as memcache, it may be necessary to sanitize input in order to prevent injections. In memcache, the exact characters to sanitize depend on the parser used, but an unsanitized value may otherwise allow new commands to be appended to the memcache request. In some cases, this can lead to alteration of the cached data and even remote code execution. In most cases, sanitizing the following characters will remove most memcache injection points:
* 0x00 (null byte)
* 0x0a (\n) and 0x0d (\r)
* 0x20 (space)
For more details on memcache injections and associated risks, see the following paper: https://www.blackhat.com/docs/us-14/materials/us-14-Novikov-The-New-Page-Of-Injections-Book-Memcached-Injections-WP.pdf
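As a minimal sketch of such sanitization (the helper class name is hypothetical), a key can be validated before it is ever sent to memcache:

// Hypothetical helper: rejects memcache keys containing the bytes listed
// above, which could otherwise be abused to inject protocol commands.
public final class MemcacheKeySanitizer {

    private MemcacheKeySanitizer() {}

    public static String sanitize(String key) {
        for (int i = 0; i < key.length(); i++) {
            char c = key.charAt(i);
            if (c == 0x00 || c == 0x0a || c == 0x0d || c == 0x20) {
                throw new IllegalArgumentException(
                        "Illegal control character in memcache key at index " + i);
            }
        }
        return key;
    }
}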
Regardless of which components you add, if they require a network connection between two systems, make sure to avoid plaintext protocols and instead opt for authenticated, encrypted channels such as TLS, and remember to always verify the certificate.
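In Java, for example, the default HTTPS stack already verifies the server certificate chain and hostname; the anti-pattern to avoid is silencing certificate errors with a trust-all TrustManager or HostnameVerifier. A minimal sketch, with a hypothetical internal endpoint:

import java.net.URL;
import javax.net.ssl.HttpsURLConnection;

public class TlsCheck {
    public static void main(String[] args) throws Exception {
        // Hypothetical internal endpoint; the default SSLSocketFactory
        // validates the certificate chain and hostname automatically.
        URL url = new URL("https://cache.internal.example.com/health");
        HttpsURLConnection conn = (HttpsURLConnection) url.openConnection();
        conn.connect();
        System.out.println("Negotiated cipher suite: " + conn.getCipherSuite());
        conn.disconnect();
    }
}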
Application-related risks
Finally, the last few bottleneck layers are the backend code and the database. Before anything, always perform a project-wide search for sleep() and sleep-like functions. Programmers often use them while debugging (especially when they suspect a timing or race condition is the cause of a bug), and a few of them sometimes end up making it into production and causing major slowdowns.
Other code that may present a risk of DOS usually falls into two categories:
* Code that is heavy on the backend servers when used legitimately, such as complicated database search queries. These should present a CAPTCHA or be rate limited when used repeatedly by the same user in a short period of time, as they present easy targets for determined attackers (a minimal rate-limiter sketch follows this list). Other examples involve functions that allow the creation or upload of files on the server, as these can create both disk bandwidth issues (read/write speed) and space issues. Similar issues can occur when large files are downloaded, potentially exhausting read speed, memory and bandwidth on the server. Where appropriate, moving resources to CDNs or using load balancers can help mitigate the risk. These bottlenecks can be found through code review but also using code profiling tools.
* Code that is only heavy on resources when used in an unintended fashion. We will see a few examples below.
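As a rough sketch of the rate limiting mentioned above (the class and limits are assumptions, not a production implementation), a fixed-window limiter keyed on some client identifier could look like this:

import java.util.HashMap;
import java.util.Map;

// Hypothetical fixed-window rate limiter keyed by client identifier
// (IP address, session ID, ...). Not production-grade: the window map
// grows unbounded and would need periodic pruning.
public class SimpleRateLimiter {

    private static final class Window {
        long start;
        int count;
    }

    private final int maxRequests;
    private final long windowMillis;
    private final Map<String, Window> windows = new HashMap<>();

    public SimpleRateLimiter(int maxRequests, long windowMillis) {
        this.maxRequests = maxRequests;
        this.windowMillis = windowMillis;
    }

    // Returns true if this client is still within its quota.
    public synchronized boolean allow(String clientId) {
        long now = System.currentTimeMillis();
        Window w = windows.computeIfAbsent(clientId, k -> new Window());
        if (now - w.start >= windowMillis) {
            w.start = now; // a new window begins
            w.count = 0;
        }
        return ++w.count <= maxRequests;
    }
}

For instance, new SimpleRateLimiter(10, 60_000) would allow each client ten expensive searches per minute before requests are rejected.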
Billion laughs attack
A very common example of DOS through unintended behavior is the "billion laughs" attack, which subverts XML parsers into expanding a small, deeply nested set of entity definitions until the server slows to a crawl. By default, most XML parsers allow you to define entities in the document type definition (DTD). Similar to variables in code, you can assign a value to an entity, for example defining "&entity_a;" to expand to the value "A". An entity can also be assigned the value of other entities. It is thus possible to craft a request of minimal size using multiple layers of nesting which, when parsed, expands to a considerable size. See the example below:
<?xml version="1.0"?>
<!DOCTYPE lolz [
<!ENTITY lol "lol">
<!ELEMENT lolz (#PCDATA)>
<!ENTITY lol1 "&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;">
<!ENTITY lol2 "&lol1;&lol1;&lol1;&lol1;&lol1;&lol1;&lol1;&lol1;&lol1;&lol1;">
<!ENTITY lol3 "&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;">
<!ENTITY lol4 "&lol3;&lol3;&lol3;&lol3;&lol3;&lol3;&lol3;&lol3;&lol3;&lol3;">
<!ENTITY lol5 "&lol4;&lol4;&lol4;&lol4;&lol4;&lol4;&lol4;&lol4;&lol4;&lol4;">
<!ENTITY lol6 "&lol5;&lol5;&lol5;&lol5;&lol5;&lol5;&lol5;&lol5;&lol5;&lol5;">
<!ENTITY lol7 "&lol6;&lol6;&lol6;&lol6;&lol6;&lol6;&lol6;&lol6;&lol6;&lol6;">
<!ENTITY lol8 "&lol7;&lol7;&lol7;&lol7;&lol7;&lol7;&lol7;&lol7;&lol7;&lol7;">
<!ENTITY lol9 "&lol8;&lol8;&lol8;&lol8;&lol8;&lol8;&lol8;&lol8;&lol8;&lol8;">]>
<lolz>&lol9;</lolz>
Disabling DTD processing (and thus entity expansion), or enabling secure parsing where the parser supports it, will help prevent these attacks.
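On the JDK's default parser, for example, one way to do this is to disallow DOCTYPE declarations entirely and turn on the secure processing feature:

import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;

public class SecureXmlParser {
    public static DocumentBuilder newSecureBuilder() throws ParserConfigurationException {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        // Disallowing DOCTYPE declarations blocks entity expansion attacks
        // (billion laughs) as well as external entity (XXE) injection.
        dbf.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
        // Enables the parser's built-in secure processing limits.
        dbf.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
        return dbf.newDocumentBuilder();
    }
}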
Splitting and tokenizing
I once encountered an API that took input separated by "-" delimiters. Since the application was in Java, the code must have been similar to this:
String[] tokens = parameter_input.split("-");
While benign in most cases, this code can actually be used to crash the server if the input is not sanitized. In Java, split and split-like methods allocate a new String object for every token, including the empty strings produced between two consecutive delimiters. As such, an input made up of repeated delimiters, such as "------", creates many empty String objects on the heap. It was thus possible to crash the Java virtual machine by filling the heap until it ran out of space, simply by sending a very long string of dashes. To mitigate this, simply limit the number of characters that can be passed in the parameter before parsing it with the split function.
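A minimal sketch of such a guard (the class name and the length cap are assumptions to be tuned per API):

public class SearchInput {
    // Assumed limit; caps allocation before split() ever runs.
    private static final int MAX_INPUT_LENGTH = 1024;

    public static String[] tokenize(String parameterInput) {
        if (parameterInput == null || parameterInput.length() > MAX_INPUT_LENGTH) {
            throw new IllegalArgumentException("Parameter missing or too long");
        }
        return parameterInput.split("-");
    }
}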
Java hashing algorithm
One of the applications I code reviewed performed an optimization for searches on different strings in the search bar. This optimization used a Java HashMap to search over a small set of terms over which I had control. A HashMap takes an object as input and internally acts like a dictionary, using the object's hash as a key. The hashing algorithm in question is not cryptographically secure and has a relatively high rate of collision, not only due to its small output size (32 bits, and in some cases effectively as low as 30) but also due to its predictability.
A HashMap's advantage over other structures such as a binary tree is that in most cases values are returned in constant time. This is, however, not the case when there are collisions in the HashMap. When a collision occurs, the colliding items are instead placed in a linked list, and looking them up becomes O(n) instead of O(1). As such, it is possible to put a much bigger strain on the CPU by feeding the search multiple terms with the same hash output. Testing was unfortunately never completed on the application, but here is some code that would have produced similar collisions if Long values had been stored instead of strings:
This is how hash codes are generated for Long values:
// from java.lang.Long
public int hashCode() {
    return (int)(value ^ (value >>> 32));
}
This is a simple loop that enumerates Long values which all hash to 0:

for (long i = Integer.MIN_VALUE; i <= Integer.MAX_VALUE; i++) {
    // Upper and lower 32 bits are identical, so value ^ (value >>> 32)
    // cancels out and hashCode() always returns 0.
    Long collider = (i << 32) | (i & 0xFFFFFFFFL);
    System.out.print(collider.hashCode() + " ");
    if (i % 100 == 0)
        System.out.println();
}
Similar generation can be done for Strings: because String.hashCode() is equally predictable, large numbers of strings sharing the same hash code can be produced quickly.
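For instance, here is a quick sketch that exploits the well-known collision between "Aa" and "BB" (both hash to 2112) to produce 2^n colliding strings:

import java.util.ArrayList;
import java.util.List;

// Generates 2^pairs Strings that all share the same hashCode(). Because
// "Aa" and "BB" collide and String.hashCode() is polynomial, swapping
// them at any pair position leaves the overall hash unchanged.
public class StringCollisions {
    public static List<String> collisions(int pairs) {
        List<String> result = new ArrayList<>();
        result.add("");
        for (int p = 0; p < pairs; p++) {
            List<String> next = new ArrayList<>(result.size() * 2);
            for (String s : result) {
                next.add(s + "Aa");
                next.add(s + "BB");
            }
            result = next;
        }
        return result;
    }

    public static void main(String[] args) {
        for (String s : collisions(3)) {
            System.out.println(s + " -> " + s.hashCode());
        }
    }
}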
File upload
Many company websites offer a career section, and when it is not outsourced to a third party such as Jobvite, it may be possible to upload your resume directly to the website. While file uploads are usually targeted heavily by application testers due to the high number of ways in which they can be exploited (XSS, RCE, LFI), one of the less common ways is to create a denial of service. Just like processing power, disk space is a finite resource. If the file upload system does not limit uploads to a reasonable file size and does not prevent automation, an attacker can upload files until the server inevitably runs out of space and crashes. Using a CAPTCHA, along with a "watch" service that alerts when the upload folder is filling up, is a good way to prevent these kinds of issues.
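A minimal sketch of a size-capped upload handler (the class name and the 5 MB cap are assumptions):

import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical upload handler: streams the body to disk but aborts once a
// size cap is exceeded, so a single request cannot exhaust disk space.
public class UploadGuard {
    private static final long MAX_UPLOAD_BYTES = 5L * 1024 * 1024; // assumed 5 MB cap

    public static void saveUpload(InputStream in, Path dest) throws IOException {
        byte[] buf = new byte[8192];
        long total = 0;
        boolean ok = false;
        try (OutputStream out = Files.newOutputStream(dest)) {
            int n;
            while ((n = in.read(buf)) != -1) {
                total += n;
                if (total > MAX_UPLOAD_BYTES) {
                    throw new IOException("Upload exceeds " + MAX_UPLOAD_BYTES + " bytes");
                }
                out.write(buf, 0, n);
            }
            ok = true;
        } finally {
            if (!ok) {
                Files.deleteIfExists(dest); // remove the partial file on failure
            }
        }
    }
}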
In conclusion, while it is possible to limit bottlenecks and reduce the possibility of DOS, it is certainly not easy to do so, and it may introduce new complexities into an already complex environment. However, doing so will also help your product scale better in the long run and prevent future downtime from unintended resource exhaustion. As usual, the key to maximizing the gains from these technologies is to thoroughly research and understand them before rushing into implementation. Many of these technologies also increase the attack surface indirectly; consulting a security professional may prevent opening new holes in an otherwise secure system.