Contributors: Lloyd Macrohon and Rodel Mendrez
Recently, we got a chance to investigate a REvil Ransomware sample from one of our DFIR investigations. During analysis, we encountered a few stumbling blocks that made the investigation a little tricky, namely unpacking and string deobfuscation. In this blog, we will show how we manually unpacked the malware and then how we deobfuscated the strings used by the ransomware.
The particular sample we are going to investigate has the SHA256 Hash: 6ff970f1502347acd2d00e7746e40fba48995abbe26271d13102753c55694078.
Packers are essentially tools that are used to compress a Portable Executable (PE) file. Many malware authors utilize packers with their malware to obfuscate and make it a bit harder to statically analyze code. If you want to learn more about packers, you can read our blog about this here: https://www.trustwave.com/en-us/resources/blogs/spiderlabs-blog/basic-packers-easy-as-pie/
We began by trying to determine what packer was used by this malware. The “Detect It Easy” tool failed to identify the packer and no signature was found. It also found no packed PE sections. Interestingly the sample had non-standard section names. We also took note of the entry point at RVA (relative virtual address) 0x9D58 which is within the .text section of the PE file.
Disassembling the executable, we can see right at the entry point the use of VirtualAlloc() API to allocate a new memory in the address space, then "rep movsb" opcode to copy data to the allocated memory, then, at the end of the function, is an opcode "jmp eax" that leads the instruction pointer to a new entry point in the allocated memory space. Now that looks interesting…
Here is what the code looks like when decompiled in IdaPro:
Next, we dynamically analyze the file using x64dbg (a PE debugger tool) to see what’s being copied to the allocated memory. In the screenshot below, after calling VirtualAlloc() API, this instance allocated memory at the base address 0x1240000 (this memory address varies in each run).
By dumping that memory address, we can visually monitor what has been copied to that address. After the malware has copied the data to the memory address, it turns out that it was the PE image of itself.
However, the jmp eax we mentioned earlier - jumps to a new entry point at RVA 0x9EA0 (virtual address is therefore - 0x1240000 (base address) + 0x9EA0 = 0x1249ea0):
Next we follow that jump to virtual address 0x1249ea0, and yet again we encounter another VirtualAlloc() API call. So we dump the new allocated memory address and monitor it:
The malware then starts to decrypt a blob of data embedded in the PE file starting at RVA 0x13DE2 and writes it to the allocated memory. That blob of packed data (at 0ffset 0x13DE2) is within .text section of the PE file.
After unpacking, another PE file is revealed. This time, it is the ultimate payload – a REvil Ransomware. You will notice the new section names, and standing out is a non-standard section name .raimo.
We can dump this unpacked PE image and manually fix the IAT (Import Address Table) so that we can continue analyzing it statically. You can reconstruct and fix the IAT with Scylla (this is available in x64dbg) .
Now that we have manually unpacked the file, we can statically analyze it. However, another stumbling block is that it leverages string obfuscation to hide the nature of what it’s doing. You can see in the screenshot below a bunch of cross-references – these are calls to the decode_string function:
The encrypted strings are stored in parts of the binary. One part is a table of the encrypted strings that the malware uses and another part is the ransomware configuration. Each function call to decode_string is preceded by its parameters, by passing them through the stack, these are: pointer to the output address, the key offset, key length and encoded data length. We will follow this example:
At relative virtual address 0xF060 is the data table base address which we named as encrypted_data_table which is 3048 bytes long. This is found in the .data section of the PE image.
For this example, the encrypted data is at offset 0x91B from the base address of the data table, or 0xF060 + 0x7B4
And the following parameters are hardcoded:
Here is the encrypted_data_table (truncated) after we re-based it to Zero:
The encrypted string block is therefore at 0x7B4:
The key is 13 bytes long :
[82 7D AE B7 37 35 9D 60 DA 8D DB CA E3]
And the encrypted string is 314 bytes long:
[0c 67 57 04 37 69 34 07 f7 16 37 33 30 88 ec e3
46 13 61 0d 75 d6 5b 0a 54 2a d5 7e 1d 32 9d 79
c8 8c b9 e1 23 50 90 b5 6a 84 8b f9 80 16 9a 99
58 11 24 30 d1 ac 3f 5c b6 77 b0 14 37 ad 69 be
81 d6 ea b5 a8 2c f2 14 d4 74 13 6f 2b af 1f fa
28 e0 58 34 be 7c d7 2d 79 90 94 de 4a 01 13 71
fa e6 36 ca 88 cd 3b 82 4d ac 63 02 1a e8 05 7e
71 44 3c 75 4a 60 93 2d 58 01 3a 24 98 b3 e5 7a
9b 3f 43 6f e2 3a 69 36 5b f4 a0 b1 2a dd ff 41
59 c3 77 88 c3 41 df 2a 4d ea d7 91 61 5a 53 98
1e df 56 da 4e ea e0 51 e8 8d 57 71 fc 90 79 23
fd 36 0d 14 24 e5 30 4d a7 cf 23 06 c2 7a 2d 11
11 ea ec 3b cb 8d fc c0 06 5d 8c ff a2 82 d8 3a
0d 39 a5 4c 15 6f 53 93 e2 d4 35 55 5a f5 02 d8
e3 a3 cb 2a 2b 4b 65 1f fc aa 14 20 a4 d5 ec 34
23 60 73 03 b4 65 ab e2 bd c4 cf 1f e7 37 24 b8
93 0a 16 b2 79 74 4e 30 3b ce b4 fe ac cf 3d fd
91 7f 96 c2 9f 6a 4c 5b fa fc d0 05 0e 36 14 75
19 24 dc 5c 7e 74 87 a4 9b 34 62 56 9d 4d e9 d2
12 c5 61 a8 67 e1 c8 5d 6e 6e]
Reverse engineering the decryption code in the malware shows that it's actually just the stream cipher RC4. Code snippet below is the RC4 algorithm: initialize sbox and key scheduling
Code snippet below is the actual decryption of data:
After reversing this to C, it was pretty straightforward to convert it to Python so we could run it in IDA Pro.
This now allows us to take the encrypted block above with the following parts:
This decodes a Unicode string as seen in the screenshot below:
Because each encoded string has its own unique key and variable length, it becomes cumbersome to decode every string. But fret not, at the end of this blog, we share the IDAPython script to aid you with the decoding process.
The second part of the obfuscated data is the ransomware configuration which basically uses the same RC4 algorithm. This encrypted configuration is stored in the non-standard named section called .raimo.
In the screenshot below we highlight the RC4 key “VNz47r3Wz2xT7DP1XqPa2MYcwUx8uRex”, the CRC hash of the encoded data which is 0xB6C2E135, and the length of the data is 0x6B02 (27394 bytes).
The resulting decrypted configuration file looks like this:
When we finally unpack the file and deobfuscate the string, the process of reversing the code statically is so much easier. We won’t, however, go into further detail about the Ransomware itself as there are very good analyses on this malware elsewhere, such as: https://www.acronis.com/en-eu/articles/sodinokibi-ransomware/
As mentioned earlier, we also wrote an IDAPython script to help deobfuscate strings hidden by this malware which may aid in the analysis process. You can find it here: https://github.com/bizdak/malware-analysis/blob/master/revil/revil.py
Decoded string after running the IdaPython script:
Happy reversing!