Towards the end of 2020, a new vulnerability in MongoDB was found and published. The vulnerability affected almost all versions of MongoDB, up to v4.5.0, but was discussed and patched appropriately.
The vulnerability, CVE-2020-7928, abuses a well-known component of MongoDB, known as the Handler, to carry out buffer overflow attacks by way of null-byte injections. Exploits using this vulnerability are available on Dark Net marketplaces, with prices averaging $40,000 (based on the current market values of Ethereum and Bitcoin). This vulnerability has since been patched, and so far, no documented evidence of it being used in remote code execution (RCE) or remote file inclusion (RFI) has been found on the Clearnet.
Before we get into how this vulnerability works, let's first discuss how a buffer overflow works. More importantly, what exactly is a buffer overflow?
Buffers are memory storage regions that temporarily hold data while it is transferred from one location to another. Buffer overflows happen when the volume of said data exceeds the storage capacity of the buffer. The program then, whilst trying to write the data to the desired location (buffer), begins to overwrite adjacent memory locations (blocks).
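To make the mechanism concrete, here is a minimal sketch that models memory as a single Python bytearray. It is a simulation only: Python itself is memory-safe, and the names BUFFER_SIZE, unchecked_copy, and checked_copy are hypothetical helpers invented for this illustration.

```python
# Toy model of process memory: an 8-byte buffer followed directly by
# a 4-byte adjacent region (imagine a flag or a saved return address).
BUFFER_SIZE = 8
memory = bytearray(BUFFER_SIZE) + bytearray(b"SAFE")


def unchecked_copy(mem: bytearray, data: bytes) -> None:
    """Copy data to the start of mem without checking the buffer size,
    the way an unbounded C-style copy would."""
    for offset, byte in enumerate(data):
        mem[offset] = byte


def checked_copy(mem: bytearray, data: bytes) -> None:
    """Same copy, but refuse input that does not fit in the buffer."""
    if len(data) > BUFFER_SIZE:
        raise ValueError("input exceeds buffer capacity")
    mem[:len(data)] = data


# 12 bytes written into an 8-byte buffer: the last 4 spill over.
unchecked_copy(memory, b"AAAAAAAAEVIL")
adjacent = bytes(memory[BUFFER_SIZE:])
print(adjacent)  # b'EVIL' (the adjacent region was overwritten)
```

The bounds check in checked_copy is the whole defence here: the same 12-byte input is rejected before any adjacent byte is touched.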
Attackers exploit this coding error to alter an application's execution path and overwrite parts of memory, which can damage or destroy existing files, expose data, or execute malicious payloads. In short, attackers use buffer overflows to corrupt an application's execution stack, execute arbitrary code, and take over a machine. These attacks remain common, in part because buffer overflows are often given less scrutiny: they are considered less likely to be discovered by attackers and more difficult to exploit, since an attacker needs to know the memory layout of the program and the details of the buffer.
Several types of buffer overflow attacks exist, but the three most common are: stack-based, heap-based, and format string attacks. The MongoDB vulnerability in question specifically uses two common weaknesses: ‘Improper Neutralization of Null Byte or NUL Character’ (CWE-158) and ‘Buffer Copy without Checking Size of Input’ (CWE-120), which fall under stack-based buffer overflow.
As mentioned earlier, the vulnerability abuses the component known as the MongoDB Handler. The Handler takes operations from the source trail file and creates corresponding documents (rows) in the target MongoDB database. There are two important data structures in MongoDB: records and collections.
A record is a Binary JSON (BSON) document composed of field and value pairs. The values of fields inside records may include other documents, arrays, and arrays of documents. A collection is a grouping of MongoDB documents and is the equivalent of a Relational Database Management System (RDBMS) table. Databases in MongoDB hold collections of documents, and MongoDB documents within a collection can have different fields.
The attack in question uses something called a null-byte injection to get around the Handler and achieve buffer overflow, but what is a null-byte injection? And what is a null-byte?
Simply put, a null byte is the character with the value zero, written as %00 in URI encoding or 0x00 in hex, and is used to terminate strings. Injecting one can bypass data sanity filters, confuse applications about where a string ends, and manipulate them into performing unintended actions.
Here is a very basic example: you want to upload a malicious payload, called malicious.php, but the only extension allowed is .pdf. You would then name the file malicious.php%00.pdf. The application would read the .pdf extension, validate the upload, and throw out the end of the string due to the injected null byte, thereby successfully uploading a malicious.php.
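The upload scenario above can be sketched in a few lines. This is an illustration under assumptions, not code from any real upload handler: extension_allowed, name_seen_by_c_layer, and safe_extension_allowed are hypothetical functions, and the lower layer is simulated by cutting the name at the first null byte.

```python
def extension_allowed(filename: str) -> bool:
    """Naive filter: only looks at the end of the full string."""
    return filename.endswith(".pdf")


def name_seen_by_c_layer(filename: str) -> str:
    """Hypothetical lower layer that treats the name as a C string,
    so everything from the first null byte onward is dropped."""
    return filename.split("\x00", 1)[0]


def safe_extension_allowed(filename: str) -> bool:
    """Reject any name containing a null byte before checking the extension."""
    return "\x00" not in filename and filename.endswith(".pdf")


# malicious.php%00.pdf after URL decoding
upload = "malicious.php\x00.pdf"

print(extension_allowed(upload))       # True: the filter sees ".pdf"
print(name_seen_by_c_layer(upload))    # malicious.php
print(safe_extension_allowed(upload))  # False: null byte rejected
```

The filter and the lower layer disagree about what the filename is, and that disagreement is what the attacker exploits.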
Although the above example is extremely simple, the same principle applies to the majority of today's applications, which are developed using higher-level languages. At the system level, their high-level code is ultimately processed by lower-level components, which are usually written in C/C++.
Null bytes in C/C++ represent the string termination or delimiter (meaning the string must be stopped from processing immediately). All bytes following the delimiter will be ignored. If the string loses its null character, the length of a string becomes unknown until the memory pointer happens to meet the next zero byte.
Obviously, this unintended consequence can cause unusual behaviour and introduce vulnerabilities within the system or the scope of the application. Several high-level languages track the length of a string separately, so the null byte has no special meaning in their context and is treated as ordinary data. This difference in interpretation allows null bytes to be easily injected into applications to manipulate their behaviour.
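The difference in interpretation can be observed directly from Python using the standard ctypes module, which exposes a buffer both as raw bytes and as a C-style string. The sample value below is made up for illustration.

```python
import ctypes

data = b"user\x00admin"

# Python stores the length explicitly, so the null byte is just another byte.
python_length = len(data)
print(python_length)  # 10

# ctypes shows the same bytes the way C string functions see them:
# .raw is the whole buffer, .value stops at the first null byte.
buf = ctypes.create_string_buffer(data)
c_view = buf.value
print(c_view)         # b'user'
print(len(buf.raw))   # 11: the 10 original bytes plus a trailing terminator
```

Any check performed on the 10-byte view and any action performed on the 4-byte view are operating on two different strings.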
In MongoDB, a null byte injection could allow attackers to overwrite fields in the database to which the application logic denies them access. Taken to its logical conclusion, it may be possible to trick the application to write or read from restricted databases or collections.
Consider the following code. It allows users to insert arbitrary objects into the collection by passing an array of objects in a GET request. However, it does not permit them to insert the field ‘verified’.
Figure 1. Connecting to the MongoDB instance to view and modify the collection ‘students’.
Inspecting the database after executing the above, we can see that the object was created inside the collection ‘population’, but without the ‘verified’ field.
Figure 2. Only the fields ‘name’ and ‘age’ were modified.
By injecting a null byte into the array key, we can bypass the check and allow the field ‘verified’ to be stored in MongoDB.
Figure 3. Passing a null-byte ‘chr(0)’ to the field ‘verified’.
MongoDB will filter out anything after the null byte, and checking the collection shows that the ‘verified’ field has now been populated.
Figure 4. The null-byte successfully allowed the field “verified” to be set to the desired value.
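The bypass described in Figures 1 to 4 can be approximated without a database. The sketch below is a simulation, not the code from the figures: insert and strip_at_null are hypothetical helpers, the sample documents are invented, and the truncation performed by the storage layer is modelled by cutting each key at its first null byte.

```python
def strip_at_null(key: str) -> str:
    """Simulates a storage layer that truncates keys at the first null byte."""
    return key.split("\x00", 1)[0]


def insert(collection: list, submitted: dict) -> None:
    """Application-level check: drop the protected field 'verified',
    then hand the remaining fields to the simulated storage layer."""
    allowed = {k: v for k, v in submitted.items() if k != "verified"}
    collection.append({strip_at_null(k): v for k, v in allowed.items()})


stored_documents = []

# Honest attempt: the exact key 'verified' is filtered out.
insert(stored_documents, {"name": "Alice", "age": 20, "verified": True})

# Injection: 'verified\x00x' is not equal to 'verified', so it passes the
# check, and the storage layer then discards everything after the null byte.
insert(stored_documents, {"name": "Bob", "age": 21, "verified\x00x": True})

print(stored_documents[0])  # {'name': 'Alice', 'age': 20}
print(stored_documents[1])  # {'name': 'Bob', 'age': 21, 'verified': True}
```

The check compares the full key, while the store keeps only the part before the null byte, so the protected field slips through.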
It follows that forcing a bypass to the checks in place could technically allow an attacker to inject any piece of arbitrary code that would then be stored by MongoDB and run at will.
This oversight was patched by the MongoDB team by enforcing additional checks in $arrayToObject, which converts an array into a single document. The new checks were encouragingly simple: key fields are no longer allowed to contain null bytes, and a test verifies that $arrayToObject produces an error when a key does contain one. You can view the source code here: https://github.com/mongodb/mongo/commit/1772b9a0393b55e6a280a35e8f0a1f75c014f301?diff=split
Figure 5 shows the test script, written in JavaScript, that tests whether the $arrayToObject aggregation operator produces an error when a key contains a null byte. It defines four test cases, each of which is passed to the assertErrorCode() function along with an expected error code. Each test case consists of an aggregation pipeline that uses the $replaceWith operator, which is passed a $literal array containing key-value pairs. The first test case passes an array that contains a key with a null byte (“a\0b”), while the second test case passes an object that contains a key with a null byte ({k: “a\0b”, v: “blah”}). The third and fourth test cases are similar to the first and second respectively, but also include a $out stage that writes the output to a collection.
The assertErrorCode() function asserts that a given operation on a collection produces an error with the expected error code. In this case, the expected error codes are 4940400 and 4940401. Both indicate a key that contains a null byte. Going by the order of the test cases, 4940400 appears to cover a key supplied in a two-element array, while 4940401 appears to cover a key supplied in the k field of an object.
Figure 5. Snippet of the test script arrayToObject.js, a specific feature of the MongoDB aggregation pipeline
Figure 6 shows a snippet of the server-side implementation of MongoDB’s BSON-to-JSON conversion logic. It starts by comparing the type of the first element inside valArray with the BSONType::String constant to check whether the element is a string. It then retrieves the string value of the first element and uses the find() method to check whether it contains any null bytes (“\0”). If it does, it raises an exception with the message “Key field cannot contain an embedded null byte”. Similarly, the second part of the snippet raises the same kind of exception, but with a different error code, for the case where the key and value aren’t extracted from an array but passed as separate parameters.
Figure 6. Snippet of the server-side conversion of BSON to JSON
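The logic of the patched check can be restated in a short Python sketch. This is not MongoDB's server code, which is written in C++. The function array_to_object and the exception class NullByteInKeyError are hypothetical, and assigning 4940400 to the array form and 4940401 to the k/v object form is an assumption inferred from the order of the test cases described above.

```python
class NullByteInKeyError(ValueError):
    """Raised when a key contains an embedded null byte."""

    def __init__(self, code: int):
        super().__init__(
            f"Key field cannot contain an embedded null byte (code {code})"
        )
        self.code = code


def array_to_object(entries: list) -> dict:
    """Build one document from [key, value] pairs or {'k': ..., 'v': ...} objects,
    rejecting any key that contains a null byte."""
    result = {}
    for entry in entries:
        if isinstance(entry, (list, tuple)):
            key, value = entry
            code = 4940400  # assumed code for the array form
        else:
            key, value = entry["k"], entry["v"]
            code = 4940401  # assumed code for the k/v object form
        if not isinstance(key, str):
            raise TypeError("key must be a string")
        if "\x00" in key:
            raise NullByteInKeyError(code)
        result[key] = value
    return result


print(array_to_object([["name", "Bob"], {"k": "age", "v": 21}]))
# {'name': 'Bob', 'age': 21}

try:
    array_to_object([["a\x00b", "blah"]])
except NullByteInKeyError as err:
    print(err.code)  # 4940400
```

Rejecting the key outright, rather than trying to sanitize it, means the application and the storage layer can no longer disagree about what the key is.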
In addition to this particular CVE, more were discovered and remediated around the same time: CVE-2020-7923, CVE-2020-7925, CVE-2020-7926, CVE-2020-7929. All of these pertained to attackers using specifically crafted queries (with some containing regex) to trigger a denial-of-service (DoS). These were also patched, but in some cases were still found to affect modern enterprises (This security bulletin from February 2021 showed that IBM Private Cloud was still vulnerable).
Buffer overflows aren't impossible to prevent, but they can arise from almost anywhere, mostly due to oversights in software logic. It may seem like a losing battle at times, but given the massive amount of open-source documentation available to security researchers on the Internet, these vulnerabilities are being discovered and fixed at a much greater rate than before. Scrutinizing patch notes and making sure enterprise software is up-to-date is one of the greatest defences against buffer overflow attacks.
The aforementioned vulnerability CVE-2020-7928 is covered by Trustwave’s dbProtect and AppDetectivePro products, as part of a missing patch check. Trustwave SpiderLabs security researchers routinely scour the web for new vulnerabilities and new patches that address these vulnerabilities (if they exist) and bake them into our products to provide the best up-to-date protection.