A couple of weeks ago, Citizen Lab announced the discovery of the mobile component of the previously discovered FinFisher Toolkit (reference below). In this reveal, they talk about the many mobile variants and a number of components included in each. Surveillance, file exfiltration, location tracking, and communication with the spyware's headquarters are but a few of the many pieces of functionality this spyware has. So as I'm reading their write-up (great job again guys), I find myself looking at the section that talks about how the Android variant's configuration is created.
https://citizenlab.org/2012/08/the-smartphone-who-loved-me-finfisher-goes-mobile/
The raw hex from their example above certainly shows a few interesting tidbits of information, such as phone numbers and domain names, but what does it really mean? How does the configuration get parsed and subsequently read? Well, I finally got my hands on a FinSpy sample for Android a short while back, and decided to jump in and find the answers to my questions.
The Extraction
Before I jump into parsing the configuration itself, it's interesting to know how it gets generated. Essentially, the FinSpy sample has a set of blank, embedded *.dat files inside of it. Two hundred empty files, to be precise. Most would probably think that these files would later get populated upon execution of the malware, or at the very least, be used in some way. I know I certainly thought so originally. However, it's not the files themselves that end up getting used by FinSpy, but the file headers that reside in the FinSpy APK sample. Specifically, the malware is using the internal file attributes and the external file attributes of every *.dat file present.
You might be confused right now. Don't worry; I'll do my best to explain. So, if we look at the ZIP file format (all APKs are ZIPs), we can see that each Central Directory Structure (CDS) entry begins with a 4-byte identifier (Documentation Here) of '50 4B 01 02' (or 'PK\x01\x02'). Sure enough, if we look at the FinSpy sample, we can see this towards the end of the ZIP file.
Based on the documentation, we can see that the 'Internal file attributes' begin at offset 0x24, or 36 decimal. The internal file attribute is a WORD, or two bytes. We then see that the 'External file attributes' begin at 0x26, or 38 decimal. This attribute is a DWORD, or doubleword, and has a size of four bytes. So in total there are six bytes of data per file when these attributes are combined. So, let's see what the malware has for each file's attributes:
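For anyone who wants to reproduce this step, here is a minimal Python sketch (not the Ruby tooling mentioned later in this post) that collects those six bytes per *.dat entry. It relies on the standard zipfile module, which exposes both fields as ZipInfo.internal_attr and ZipInfo.external_attr. The function name extract_attr_bytes, the ".dat" suffix filter, and the assumption that entries are read in central directory order are mine, not something confirmed from the sample.

```python
import struct
import zipfile


def extract_attr_bytes(apk, suffix=".dat"):
    """Concatenate the internal (2 bytes) and external (4 bytes) file
    attributes of every ZIP entry whose name ends with `suffix`."""
    blob = bytearray()
    with zipfile.ZipFile(apk) as zf:
        # infolist() follows central directory order. Assumption: that is
        # also the order in which the malware consumes the entries.
        for info in zf.infolist():
            if info.filename.endswith(suffix):
                # Re-pack the two fields exactly as they sit in the header:
                # a little-endian WORD followed by a little-endian DWORD.
                blob += struct.pack("<HI", info.internal_attr, info.external_attr)
    return bytes(blob)
```

Calling extract_attr_bytes("sample.apk") (a hypothetical file name) should return the raw bytes, which can then be dumped with .hex() and compared against a hex editor view of the central directory.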
The Parsing
OK, so now the real fun begins. What does this data mean!? Well, if we start to inspect it, we can start to find some patterns. Let's start at the beginning. If we look at the first four bytes, we see a value of 01 02 00 00. If we treat this as little-endian, it gives us a value of 00 00 02 01, or 0x201. Converted to decimal, we get a value of 513. If we look at the total size of the configuration file, we see that it is 514 bytes long. Not totally exact, but who knows, maybe it's not just a coincidence. As we continue, we start to see more and more DWORDs that may in fact be 'size' values. I've demonstrated this in the following gif animation:
So maybe those values are in fact sizes. The next question is, what is the data inside of each section within the configuration file? It appears as though each size DWORD is followed by another DWORD of some unknown value. If we take the first two that we see (90 5B FE 00 and A0 33 84 00) and treat them as little-endian values, we get 16669584 and 8663968 respectively in decimal. So what are these numbers? Well, the answer to this question lies in the FinSpy sample itself. If we decompile the underlying Java code, we see a large lookup table that has a number and its associated name.
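The little-endian conversions used so far are easy to double-check. A small Python sketch, where le_dword is just an illustrative helper name:

```python
import struct


def le_dword(hex_bytes):
    """Interpret four hex bytes (spaces allowed) as a little-endian
    unsigned 32-bit value."""
    raw = bytes.fromhex(hex_bytes)
    (value,) = struct.unpack("<I", raw)
    return value


print(le_dword("01 02 00 00"))  # 513
print(le_dword("A0 33 84 00"))  # 8663968
```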
Using this information, we can look up the numbers we previously identified and get "TlvTypeMobileEncryption" and "TlvTypeMobileTargetOfflineConfig" respectively. The only remaining question is, what is the data following these parameters? Well, here we've stumbled on the data of the configuration itself. So at this point we've determined that the configuration file has the following structure:
<SIZE><TYPE> <DATA>
Where DATA can potentially be another 'section' of the configuration file. A full breakdown of this configuration can be seen below:
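To make the nesting concrete, here is a rough Python sketch of a recursive parser for this layout. It is not the author's script and leans on two assumptions of mine: that SIZE counts the whole record including its own 8-byte header (which fits the 513-of-514 observation above), and that a DATA block is a nested section whenever it happens to split cleanly into further records. The second point is only a heuristic, and it can misread a leaf value that looks like a record. The names parse_records and make_record are hypothetical.

```python
import struct

HEADER = struct.Struct("<II")  # SIZE, TYPE: two little-endian DWORDs


def parse_records(buf):
    """Parse buf as back-to-back <SIZE><TYPE><DATA> records.
    Returns a list of (type_id, data_or_children), or None if the bytes
    do not fit that layout exactly."""
    records = []
    pos = 0
    while pos < len(buf):
        if len(buf) - pos < HEADER.size:
            return None  # not enough bytes left for another header
        size, type_id = HEADER.unpack_from(buf, pos)
        # Assumption: SIZE covers the entire record, header included.
        if size < HEADER.size or pos + size > len(buf):
            return None
        payload = buf[pos + HEADER.size:pos + size]
        # Heuristic: a payload that parses cleanly is treated as nested.
        children = parse_records(payload) if payload else None
        records.append((type_id, children if children else payload))
        pos += size
    return records


def make_record(type_id, payload):
    """Build one record; only used to fabricate demo input."""
    return HEADER.pack(HEADER.size + len(payload), type_id) + payload


# Made-up payload, wrapped in the two type IDs identified earlier.
inner = make_record(8663968, b"\x50\x00")
outer = make_record(16669584, inner)
print(parse_records(outer))
```

Because the parser rejects trailing bytes, a blob like the 514-byte configuration here would need to be trimmed to its first SIZE value (513 bytes) before being passed in.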
At this point we can start painting a pretty decent picture of what the malware does, even without digging too deeply into the sample itself. Values such as 'TlvTypeConfigTargetProxy' and 'TlvTypeConfigTargetPort' tell us that the malware will be connecting to some host on one of any number of ports specified here. 'TlvTypeMobileTrackingConfig' allows us to see that this malware has capabilities to track the victim device's location.
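Turning raw type values into these names is just a dictionary lookup. A minimal Python sketch, seeded only with the two number-to-name pairs identified earlier (the real table in the decompiled code is far larger, and I'm not guessing at the numeric IDs of the other names here); TLV_TYPE_NAMES and tlv_type_name are illustrative names of my own:

```python
# Only the two pairs worked out earlier in this post; the table in the
# decompiled Java code contains many more entries.
TLV_TYPE_NAMES = {
    16669584: "TlvTypeMobileEncryption",
    8663968: "TlvTypeMobileTargetOfflineConfig",
}


def tlv_type_name(raw):
    """Map the four raw TYPE bytes (little-endian) to a readable name."""
    type_id = int.from_bytes(raw, "little")
    return TLV_TYPE_NAMES.get(type_id, f"unknown (0x{type_id:08X})")


print(tlv_type_name(bytes.fromhex("A0 33 84 00")))
```

Unknown values fall through to a hex string, which makes it easy to spot types that still need to be looked up in the decompiled code.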
Going a step further, we can take all of the knowledge we've gained up to this point and completely parse the configuration of FinSpy, as shown below:
Since I'm such a nice guy, I decided to provide a few tools (some of which have been seen in this blog post) to make your life easier (if you happen to be playing with an Android FinSpy sample). You can find the following three Ruby scripts on SpiderLabs' GitHub page:
All files can be found in the following location: https://github.com/SpiderLabs/Malware_Analysis/tree/master/Ruby
So, let's wrap this up. What did we find and what did we see? Well, overall I think you'll agree that parsing out the configuration (as opposed to just viewing it in a hex editor) provides a much greater degree of insight into both the information present and the decisions made by the authors. In truth, a very similar technique is used to format data going across the wire via TCP communications, but that's a post for another time perhaps. Additionally, the 'types' that were discovered in the Android binary shed quite a bit of light on the overall functionality of the spyware. In summation, I hope you were able to get some insight into reversing this configuration file, along with the Android FinSpy family in general.