I feel I need to clarify, for legal reasons, that this is nothing to do with any Harry Potter game. The reference is made because we are dealing with spells and magic, and I mean magic in the literal sense, not a reference to application security – although on some/most days it feels like magic.
Time-Of-Check Time-Of-Use (TOCTOU) and Race Conditions? What’s it all about? According to Wikipedia, “In software development, time-of-check to time-of-use (TOCTOU, TOCTTOU or TOC/TOU) is a class of software bugs caused by a race condition involving the checking of the state of a part of a system (such as a security credential) and the use of the results of that check.”
MITRE’s Common Weakness Enumeration (CWE) says this about TOCTOU (CWE-367): “The product checks the state of a resource before using that resource, but the resource's state can change between the check and the use in a way that invalidates the results of the check. This can cause the product to perform invalid actions when the resource is in an unexpected state. This weakness can be security-relevant when an attacker can influence the state of the resource between check and use. This can happen with shared resources such as files, memory, or even variables in multithreaded programs.”
In a nutshell, a developer writes some code to make a check of something (like the value of something held in memory) and then uses the result of this check later (“later” could be nanoseconds here).
The developer assumes that this value cannot change between the check and the use. Unbeknown to them, there is often a path, either in the code itself or indirectly, to change this value after the check but before the use. Because the change is made out of band, rules and restrictions can often be circumvented, as all the checks have already been done. When it comes to using the value, it is treated as trusted, because the check already took place. You can now see why, for an attacker, the TOCTOU vulnerability gets invited to the party.
MITRE’s CWE-367 page (which I’m about to paraphrase) gives a great example of this in a Set owner User ID (SUID) program which operates on files on behalf of non-privileged users. The program performs access checks to ensure it doesn’t use its root privileges to perform operations on files which would otherwise be unavailable to that user. To do this, it calls access() to verify that the user (in their own security context) has access before it opens the file – as shown in the code below.
Figure 1: access() system call to check if the person running the program has permission
All works well and as intended. If the user doesn’t have write access to the file, then they are redirected to that door with the green exit sign above it. However, there is a problem: access() and fopen() operate on filenames rather than file handles.
There is no guarantee that the file variable still refers to the same file on disk when it is passed to fopen() as it did when it was passed to access(). An attacker could let the access() check happen on a file they have write access to, and then, right after the check, replace that file with a symbolic link pointing to another file they don’t have access to. It is at this point that fopen() is called, and it now opens the replaced file (using the program’s privileged SUID access), which the attacker does not normally have write access to. You can see where this is going if the attacker points these lasers at /etc/passwd.
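To make the window concrete, here is a minimal Python sketch of the same pattern, since os.access() and open() also work on names. This is not the CWE-367 code: the function names are made up for illustration, and checking file ownership is only a simplified stand-in for a full permission check. The second function shows one common way to narrow the gap, which is to open the file once and then inspect the thing that was actually opened.

```python
import os
import stat


def open_for_user_racy(path: str):
    """Check-then-use on a *name*: the pattern the access() example warns about."""
    if not os.access(path, os.W_OK):
        raise PermissionError(path)
    # Window: 'path' can be swapped for a symlink to another file right here.
    return open(path, "a")


def open_for_user_safer(path: str, uid: int):
    """Open once, then check the opened object rather than the name.

    Hypothetical helper. Ownership by 'uid' is an assumed, simplified policy.
    """
    # O_NOFOLLOW makes the open fail if the final path component is a symlink.
    fd = os.open(path, os.O_WRONLY | os.O_APPEND | os.O_NOFOLLOW)
    try:
        info = os.fstat(fd)  # describes the file we hold, not whatever the name points to now
        if not stat.S_ISREG(info.st_mode) or info.st_uid != uid:
            raise PermissionError(path)
    except Exception:
        os.close(fd)
        raise
    return os.fdopen(fd, "a")
```

Note that O_NOFOLLOW only covers the last component of the path, so this narrows the window rather than proving it closed. A privileged program would more typically also drop to the user's own privileges before opening anything.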
Now that you’re a sworn-in member of the TOCTOU club, I can continue with the main story, the main event. You came here for the gaming.
You’re playing a massively multiplayer online game (MMOG), set somewhere in medieval England. You play the character of a noble knight, together with your trusty horse. You spend your time travelling between castles, towns and small settlements – speaking to people, collecting food, money, weapons and other useful items. When travelling between these places you generally get into trouble, as other characters want to fight you and rob you of your food, money and other items. When inside castles, towns and small settlements you’re safe: it is not possible to use weapons there because the server will deny it, so people resort to fighting in the chat instead.
Now, I just set out one of the conditions (or the primer) which loops back to the TOCTOU attack. If a character tries to use a weapon, then the server will first check if they are allowed to do this, very similar to the access() check in the initial example. These checks could be complex depending on the weapon and game, e.g., does the character have this specific weapon in their possession, does it have enough ammo, etc. In our example, we’re in medieval times so we’re talking swords, longbows and such. In this game, the check we know happens when a character tries to fight (use a weapon) is a check on the character’s location. If they’re in a restricted area (castle, town, settlement) then they’re denied the ability to use weapons, and people are safe. Outside these places, it’s a free-for-all. I got the pens out to illustrate.
Figure 2: Map of TOCTOU land, don’t go into the deep dark forest, it’s really not safe
It’s at this point that I bust out the Matrix quotes: “What you must learn is that these rules are no different than the rules of a computer system. Some of them can be bent. Others can be broken.” (Thanks Morpheus)
At first glance it seems like we’re out of luck – it is black and white; we need to be outside these restricted places to use our weapons. And it is, until we start to look into how weapons work a bit more. This is where the magic is, literally.
Now being medieval times, we’ve got the ability to do magic (hence the ‘Harry Potter’ reference in the title – you were waiting for it). We can cast spells on people, turn them into frogs and such. In this game if we try and cast a spell to turn that other player (who is currently trolling us in the town pub) into a frog, then we are still denied, because we’re in the restricted/safe zone. This is, however, possible by putting our hacker hoody on, and we’ll get to it, but first I want to take you into the deep dark woods in the unsafe (unrestricted) zone to show you how to use the force, I mean magic.
Figure 3: Magic time in the unrestricted zone
The magic process consists of two stages, and first we’re going to need to find a big black bowl and collect some firewood (just kidding). So, in this game, first you cast a spell (let’s pick the frog one as an example) and then secondly, you indicate who the spell is to be performed against, by clicking on them. The developers assumed that these are two interconnected stages (to be performed one after the other), and they are, but they don’t have to be. You can see where I’m going with this hopefully?
I found that it was possible to go outside a restricted zone and cast a spell, which the server allows because that is intended – you fight others in the unsafe (unrestricted) zones, etc.
However, if you remember back, this is a two-stage process. I found that I could ‘cast’ (a spell) and leave it at that in the first stage, leaving the server hanging for me to click on a victim – but here’s the thing, I don’t click on anyone. It is at this point that I hop on my horse and travel back into the restricted zone (back to the pub, you guessed it), and then click on my victim who is to receive the spell to be turned into a frog.
The server then turns the player into a frog. This happens because the check for whether we are in a restricted zone and denying weapons was only implemented against the first stage of the magic process – the casting. The process whereby we pick our victim to receive the spell has no checks against it. The developers never thought about the scenario whereby this two-stage process is separated, especially not over different geographical locations within the ‘world’. They assumed you’d want to cast a spell and pick a victim right away, which you would if you were engaged in a fight, in an unsafe zone – it makes sense. They hadn’t thought about other scenarios.
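I never saw the server code, so what follows is only a guess at its shape, written as a small Python sketch. Every name in it (Player, cast, pick_target_vulnerable, pick_target_checked, SAFE_ZONES) is invented for illustration. It shows stage one doing the zone check, stage two trusting it however stale it is, and a variant of stage two that checks the zone again at the moment the spell is actually used.

```python
from dataclasses import dataclass
from typing import Optional

# Assumed zone names, taken from the restricted areas described in the story.
SAFE_ZONES = {"castle", "town", "settlement"}


@dataclass
class Player:
    name: str
    zone: str
    pending_spell: Optional[str] = None
    form: str = "human"


def cast(caster: Player, spell: str) -> bool:
    """Stage one: this is where the zone check lives (time of check)."""
    if caster.zone in SAFE_ZONES:
        return False
    caster.pending_spell = spell
    return True


def pick_target_vulnerable(caster: Player, victim: Player) -> bool:
    """Stage two as exploited: no zone check, the earlier one is trusted."""
    if caster.pending_spell is None:
        return False
    victim.form = caster.pending_spell
    caster.pending_spell = None
    return True


def pick_target_checked(caster: Player, victim: Player) -> bool:
    """Stage two with the zone checked again at time of use."""
    if caster.pending_spell is None:
        return False
    if caster.zone in SAFE_ZONES or victim.zone in SAFE_ZONES:
        caster.pending_spell = None  # drop the stale spell instead of honouring it
        return False
    victim.form = caster.pending_spell
    caster.pending_spell = None
    return True


knight = Player("knight", zone="forest")
troll = Player("troll", zone="town")
cast(knight, "frog")        # allowed: we are in the forest
knight.zone = "town"        # ride back to the pub with the spell still pending
pick_target_vulnerable(knight, troll)
print(troll.form)
```

With the vulnerable stage two the troll ends up as a frog, while pick_target_checked would refuse the same sequence because the knight is standing in a safe zone when the spell lands.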
In reference to the TOCTOU examples given at the start of this blog post, the thing (or ‘value’) I’m changing here is my location: I move from an unrestricted zone (unsafe) to a restricted zone (safe) between the check and the use of the two-stage spell casting process.
And that is why, my friends, somewhere in TOCTOU land, there is now a frog sitting on a stool at the bar in a pub, sipping a pint with some peanuts. *Ribbit*
Figure 4: The safe zone is no longer safe - with TOCTOU we get to break/bend rules
Now, the example I gave was one relating to gaming – because I’m all about the magic. If you look (and look well), you’ll likely see TOCTOU vulnerabilities in all sorts of (exploitable) applications, from native operating system applications to online shopping, online banking and everything in between.
Now, some have little impact, a mere inconvenience, such as being turned into a frog. Others, however, have greater impact – think of a TOCTOU vulnerability in an online banking web application. Imagine a money transfer function: stage one checks the account balance (is there enough money in it?), and stage two carries out the actual transfer (transfer amount X to account Y, debit the account by X). If several transfers can pass the balance check before any of them is debited, the account can pay out more than it holds. The impact here goes beyond a mere inconvenience, from the bank’s point of view that is. Using a one-time coupon/discount code several times simultaneously in an online shopping application is another example I’ll throw out there.
TOCTOU race condition vulnerabilities are great fun from a penetration tester’s point of view. Finding one is pure satisfaction because you’ve understood the application on a higher level – at one with it. *Adopts Yoga pose*
From a defensive standpoint, how do developers stop this from happening? Well, developers need to perform conditional checks and the subsequent operations as one ‘atomic action’. An atomic what? An atomic action: “An atomic operation is one or a sequence of code instructions that are completed without interruption.” (from Wikipedia on “Linearizability”)
Make it hard for an attacker to get in between check and use, and implement plenty of error handling so that if something isn’t quite right, the application exits gracefully.
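As a rough illustration of ‘one atomic action’, here is a minimal Python sketch of the debit step from the banking example. The Account class and its withdraw method are hypothetical, and an in-process lock is only an assumption for the sketch; a real banking system would more likely lean on a database transaction or a conditional update. The point is simply that the balance check and the debit happen under the same lock, so nothing can slip in between them.

```python
import threading


class Account:
    """Hypothetical account whose balance check and debit form one atomic step."""

    def __init__(self, balance: int) -> None:
        self.balance = balance
        self._lock = threading.Lock()

    def withdraw(self, amount: int) -> bool:
        # Check and use sit inside the same critical section.
        with self._lock:
            if self.balance < amount:
                return False
            self.balance -= amount
            return True


# Ten simultaneous attempts to take 100 from an account holding 300.
account = Account(300)
results = []


def attempt() -> None:
    results.append(account.withdraw(100))


threads = [threading.Thread(target=attempt) for _ in range(10)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(results.count(True), account.balance)
```

However the threads interleave, only three withdrawals can succeed and the balance cannot go below zero, because no thread can pass the check while another is between its own check and debit.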
Thanks for reading!