Analyzing PDF Malware - Part 3B
Down that dusty trail…
As the big blue letters above state, this is part 3B of the Analyzing PDF Malware series. If you haven't read any of the preceding posts you can find them here: Part1, Part2, and Part3A. We will be building off our analysis from those initial posts. I will go into detail about my system when possible, so if you are following along at home your mileage may vary depending on your own particular setup.
…
In Part3A we were able to successfully disassemble our second stage shellcode. Cross that goal off the list. Now we move on to our second goal.
Our Goals:
- Disassemble the second stage shellcode
- Analyze the disassembly to determine its full capabilities (Today's Goal)
- Track down and determine the ultimate goal of the malware
The man behind the curtain…
Let's jump right back into the analysis, shall we? We should have left things off with a nice reconstructed listing of all of our shellcode as if it had just finished decoding the second stage. This will allow us to use our interactive disassembler (IDA) to statically analyze the newly decoded instructions.
Fig1. – IDA Functions window showing newly identified functions
After tracing, untangling, and decoding layer upon layer, we finally have a view into the heart of the malware. I commented previously on the brevity of this code, but even so it doesn't make sense to discuss every line of assembly here ad nauseam. Instead we will investigate only the significant portions. If you would like to dig deeper, I have renamed each of the functions shown in the screenshots below and added line-by-line comments explaining what the code does.
Once the XOR decoding has taken place, control is passed to the function located at offset 0x26. I have renamed this function "stage2_main", since it contains the main flow of logic, calls to additional functions, and the endpoint of the code.
Fig2. – (sub_26) "stage2_main" procedure
Wait! Before we dig in too far – let's investigate a bit of history first to properly set the "stage" (pun intended). Ten years ago, waaay back in 2002, The Last Stage of Delirium Research Group (aka LSD), based out of Poland, proposed the seminal method for performing lookups of API function names by using a special string hashing technique. Instead of storing the desired API function name and comparing it character by character, the shellcode stores a numeric hash of the name, computed by applying a bitwise rotate right (ROR) followed by an ADD for each character. A conveniently pre-computed table of these hashes can be found here for your reference. Matt Miller (aka Skape) built off of LSD's idea in his 2003 paper "Understanding Windows Shellcode". Both of these papers should be mandatory reading if you plan to spend any time looking at Windows shellcode. The techniques described within are still widely deployed in malware that we see today. They allow for compact code, with the added benefit of not producing any easily identifiable strings that might otherwise tip the malware's hand as to its malicious intent. *foreshadowing* In case you are wondering, it is no coincidence that I'm explaining this particular bit of computing history: this is the exact technique we will see employed within our shellcode in just a bit.
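To make the hashing idea concrete, here is a minimal Python sketch of a rotate-and-add name hash. The rotation count of 13 is an assumption (it is the value commonly seen in shellcode derived from the LSD and Skape papers); the text above does not state which count this sample uses. The helper names ror32 and api_hash are purely illustrative.

```python
def ror32(value: int, count: int) -> int:
    """Rotate a 32-bit value right by `count` bits."""
    count %= 32
    return ((value >> count) | (value << (32 - count))) & 0xFFFFFFFF


def api_hash(name: str, rotate: int = 13) -> int:
    """Rotate-and-add hash of an API name.

    rotate=13 is an assumed value, not one confirmed for this sample.
    The terminating NUL byte is not included in the hash.
    """
    h = 0
    for byte in name.encode("ascii"):
        h = ror32(h, rotate)          # ROR hash, 13
        h = (h + byte) & 0xFFFFFFFF   # ADD hash, char (wrapping at 32 bits)
    return h


print(hex(api_hash("LoadLibraryA")))  # 0xec0e4e8e
```

Four bytes per function name, and no readable "LoadLibraryA" string anywhere in the payload: that is the whole appeal of the technique.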
One of the first things that position-independent shellcode tries to do is to orient itself in memory and locate the base address of the kernel32.dll library. This is done almost universally because kernel32.dll exports LoadLibrary and GetProcAddress, which let the shellcode load additional libraries and resolve their symbols, respectively. This effectively allows the shellcode to execute arbitrary code on the machine through access to the full WinAPI or any third-party library.
Back in stage2_main, after the stack has been initialized, we see our first call instruction at offset 0x5B. This call transfers control to the get_kernel32 subroutine located at offset 0xA8. Note the green hex values being pushed to the stack just before the highlighted call statements. These values are the hashed values of the symbolic function names we were previously discussing. The hash is being passed as a stack argument for use in a subsequent call.
Fig3. – Hash of LoadLibraryA being pushed to the stack prior to get_kernel32 call
The get_kernel32 subroutine uses a handy trick to locate the base address of our coveted kernel32.dll library by accessing the Process Environment Block (PEB). The PEB is a data structure that Windows creates for every running process. A pointer to it is always stored at a known offset within the current Thread Information Block (TIB), reachable at fs:[30h]. The PEB contains a wealth of metadata about the running process, including which modules have already been loaded into its memory space. Since kernel32.dll is one of the first modules loaded into the process space (on Windows XP it comes second in initialization order, right after ntdll.dll), all we need to do is walk the linked list of loaded modules until we get a match on our hash for 'kernel32.dll'.
Fig4. – Finding the base address of the Kernel32.dll module.
The get_kernel32 subroutine accomplishes this important task and then returns the base address of the module in the accumulator (EAX) to make it available for the next step.
Fig5. – (sub_A8) "get_kernel32" procedure
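The module walk can be modelled in a few lines of Python. This is only a simulation of the idea, not a transcription of the sample: the ModuleEntry class, the base addresses, and the choice to hash the lowercased ASCII module name with a ROR-13 hash are all assumptions made for illustration (real shellcode reads these entries out of the PEB's loader data, and the real list is circular rather than None-terminated).

```python
from dataclasses import dataclass
from typing import Optional


def ror13_hash(name: str) -> int:
    """Rotate-right-by-13 and add, per character (assumed hash variant)."""
    h = 0
    for byte in name.encode("ascii"):
        h = ((h >> 13) | (h << 19)) & 0xFFFFFFFF
        h = (h + byte) & 0xFFFFFFFF
    return h


@dataclass
class ModuleEntry:
    """Stand-in for one entry in the loader's module list (hypothetical)."""
    name: str
    base: int
    next: Optional["ModuleEntry"] = None


def find_module_base(head: Optional[ModuleEntry], wanted_hash: int) -> Optional[int]:
    """Walk the list until a module name hashes to wanted_hash."""
    entry = head
    while entry is not None:
        if ror13_hash(entry.name.lower()) == wanted_hash:
            return entry.base
        entry = entry.next
    return None


# Placeholder load order and base addresses, for illustration only.
kernel32 = ModuleEntry("kernel32.dll", 0x7C800000)
ntdll = ModuleEntry("ntdll.dll", 0x7C900000, next=kernel32)

base = find_module_base(ntdll, ror13_hash("kernel32.dll"))
print(hex(base))  # 0x7c800000
```

Comparing hashes rather than relying on a fixed list position keeps the lookup working even if the load order differs between Windows versions.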
If we continue to follow along in the assembly, we see that get_kernel32 has returned the module's base address, which gets pushed to the stack with the PUSH EAX instruction. Now we call a new subroutine located at offset 0xE2, renamed "find_func". This subroutine takes two stack arguments. The first is the module base address that we just pushed, and the second is the hashed function name we want to look up. This subroutine acts much like GetProcAddress, which we mentioned earlier as one of the key functional requirements of shellcode: it enumerates the export table of the provided module, looking for a name whose hash matches the one supplied.
Fig6. – (sub_E2) "find_func" procedure
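As a rough model of what a routine like find_func does, the sketch below searches a simplified export table by hash. The three parallel lists mirror the name, name-ordinal, and function-address arrays of a PE export directory, but every value in them (names, ordinals, RVAs, base address) is made up for the example, and the ROR-13 hash is an assumed variant rather than one confirmed for this sample.

```python
from typing import List, Optional


def ror13_hash(name: str) -> int:
    """Rotate-right-by-13 and add, per character (assumed hash variant)."""
    h = 0
    for byte in name.encode("ascii"):
        h = ((h >> 13) | (h << 19)) & 0xFFFFFFFF
        h = (h + byte) & 0xFFFFFFFF
    return h


def find_func(module_base: int,
              export_names: List[str],
              name_ordinals: List[int],
              function_rvas: List[int],
              wanted_hash: int) -> Optional[int]:
    """Return the address of the export whose name hashes to wanted_hash."""
    for i, name in enumerate(export_names):
        if ror13_hash(name) == wanted_hash:
            # The name index selects an ordinal, which in turn selects
            # the function's RVA; adding the base gives the address.
            return module_base + function_rvas[name_ordinals[i]]
    return None


# Hypothetical export data, for illustration only.
BASE = 0x7C800000
NAMES = ["CreateProcessA", "LoadLibraryA", "TerminateThread"]
ORDINALS = [2, 0, 1]
RVAS = [0x1D7B, 0x1E60, 0x2367]

addr = find_func(BASE, NAMES, ORDINALS, RVAS, ror13_hash("LoadLibraryA"))
print(hex(addr))  # 0x7c801d7b
```

The returned value plays the role of the function pointer that ends up in EAX.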
Once the routine finishes successfully, a function pointer is saved into the accumulator and then executed via the CALL EAX instruction. We can see an interesting pattern within the stage2_main subroutine repeat itself several times:
Fig7. – A repeating pattern of core instructions.
Now that we know the basics of what each of these primary subroutines does, we can reconstruct the function calls by decoding the hashed names in each pattern and applying what we've learned:
1. Kernel32.LoadLibraryA(urlmon.dll)
2. Urlmon.URLDownloadToCacheFileA(szURL)
3. Kernel32.CreateProcessA(szFileName)
4. Kernel32.TerminateThread(hThread)
To explain the above lines in English: once kernel32.dll is located, LoadLibraryA is used to load the urlmon.dll library. The URLDownloadToCacheFileA export from the loaded urlmon.dll module is then called to download a file from a provided URL into a temporary cache file. The cache file is then passed to CreateProcessA, which executes it. Finally, TerminateThread is called to end the program execution. The function names are pretty self-describing, but if you're new to them or just want to read more of the details you might as well go bookmark this page right now: http://msdn.microsoft.com/en-us/library/
Fig8. - Proximity view of the called function within the shellcode
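Since a hash cannot be reversed directly, decoding a hashed name in practice means hashing a list of candidate API names and looking for a match. Here is a minimal sketch of that lookup, again assuming a ROR-13 hash; the candidate list and the helper names build_hash_table and decode_hash are hypothetical, and a real table would cover every export of the DLLs of interest.

```python
from typing import Dict, Iterable, Optional


def api_hash(name: str) -> int:
    """Rotate-right-by-13 and add, per character (assumed hash variant)."""
    h = 0
    for byte in name.encode("ascii"):
        h = ((h >> 13) | (h << 19)) & 0xFFFFFFFF
        h = (h + byte) & 0xFFFFFFFF
    return h


def build_hash_table(names: Iterable[str]) -> Dict[int, str]:
    """Map each candidate name's hash back to the name."""
    return {api_hash(name): name for name in names}


def decode_hash(value: int, table: Dict[int, str]) -> Optional[str]:
    """Return the API name for a hash, or None if it is not in the table."""
    return table.get(value)


# A tiny illustrative candidate list.
CANDIDATES = [
    "LoadLibraryA",
    "GetProcAddress",
    "URLDownloadToCacheFileA",
    "CreateProcessA",
    "TerminateThread",
]

table = build_hash_table(CANDIDATES)
print(decode_hash(0xEC0E4E8E, table))  # LoadLibraryA
```

If a pushed constant does not show up in the table, either the candidate list is incomplete or the sample uses a different rotation count or hash variant.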
Details - the little things that make the big things happen…
So that was the quick(ish) run-through of what the shellcode embedded within our malicious PDF sample was actually doing at a system level. It is the very definition of "Download and Execute" shellcode. We have every line of deobfuscated code, and we can statically analyze it to our heart's content. We should now feel pretty confident about its capabilities, without fear that we've missed some hidden functionality.
In the next post of the Analyzing PDF Malware series we will switch tracks a bit, perform some dynamic analysis of the same shellcode, and attempt to verify our static-analysis findings. You did the homework from part 3A and created a PE wrapper for the shellcode already, right? ;-)
--@Rnast
Tools Used:
- IDA - Multi-processor disassembler and debugger
References:
- "Win32 Assembly Components" by The Last Stage of Delirium Research Group (LSD), 2002
- "Understanding Windows Shellcode" by Matt Miller (Skape), 2003
Resources:
- "xord.idb" – Commented IDA database file.
Special Thanks:
- @jgrunzweig for his OmniGraffle wizardry.