Introduction
This will be a two-part post. In part 1 I will demonstrate deciphering a simple XOR encryption used in Lab 11-02 of Practical Malware Analysis by Sikorski. This lab also demonstrates a technique called inline hooking, where malware installs itself onto the system as a code library file (DLL) and then redirects execution flow to the malicious DLL by overwriting a few bytes of a legitimate function (in this case, the send function, which you will be familiar with if you have done any socket programming). The purpose of this post is to showcase a real-world method that malware can use to conceal itself, and to document my approach to solving the problem, mistakes included. Since the lab also includes an encryption algorithm, part 1 covers that algorithm and how to decrypt it.
It’s important to note that, as of Windows 7, the AppInit_DLLs technique is more difficult for an attacker to pull off: Microsoft has implemented code-signing, which lets developers sign their DLLs and lets an administrator boot into a mode that disables AppInit_DLLs completely. However, today’s exploits and attacks use this same line of thinking. DoubleAgent is a good example: it came out a few days prior to this post, and Norton, McAfee, Avast!, MalwareBytes, etc. do not currently protect against it. DoubleAgent affects Windows XP all the way up through Windows 10, and the mechanism it relies on is 15 years old, according to Cybellum1, further demonstrating that we can learn a great deal from seemingly old techniques.
Background on AppInit_DLLs
To quote Microsoft2:
The AppInit_DLLs value is found in the following registry key:
HKEY_LOCAL_MACHINE\Software\Microsoft\Windows NT\CurrentVersion\Windows
All the DLLs that are specified in this value are loaded by each Microsoft Windows-based application that is running in the current log on session.
You may already be able to see the potential danger here. Essentially, every Windows utility and program that uses User32.dll, which is practically any program with a GUI, will load each DLL listed in AppInit_DLLs. Apparently, Microsoft has also seen the danger, as noted by:
This feature may not be available in future versions of the Windows operating system…
There are very few executables that do not link with User32.dll.
from the same page.
However, it still exists in my Windows 10 registry at this location:
HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows NT\CurrentVersion\Windows
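To check the value on your own machine without opening regedit, something like the following rough sketch should work. It uses Python's standard winreg module; the function name read_appinit_dlls is my own, and on a non-Windows system it simply returns None.

```python
import sys

# Registry location quoted from Microsoft above; the value is named AppInit_DLLs.
KEY_PATH = r"SOFTWARE\Microsoft\Windows NT\CurrentVersion\Windows"
VALUE_NAME = "AppInit_DLLs"


def read_appinit_dlls():
    """Return the AppInit_DLLs string, or None if it cannot be read here.

    read_appinit_dlls is a hypothetical helper name, not part of any library.
    """
    if sys.platform != "win32":
        return None  # winreg only exists on Windows
    import winreg

    try:
        with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, KEY_PATH) as key:
            value, _value_type = winreg.QueryValueEx(key, VALUE_NAME)
    except FileNotFoundError:
        return None  # the key or the value is absent
    return value


if __name__ == "__main__":
    # An empty string means no DLLs are currently registered.
    print(repr(read_appinit_dlls()))
```

On a clean system I would expect an empty string here; anything else is worth a closer look.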
Most modern machines that shipped with Windows 8 or later (the ones with a Windows sticker on them) should nevertheless not be susceptible to this type of attack with default settings. This is because Microsoft pushes computer retailers to enable UEFI Secure Boot by default, and AppInit_DLLs is disabled while Secure Boot is on. Secure Boot is not part of the operating system; it is part of UEFI, which is said to eventually be replacing BIOS3. That said, Secure Boot can be turned off, so this attack could certainly still happen even on a newer system. To check that Secure Boot is on, hit the Windows key (or otherwise go to Run), type in “msinfo32”, find Secure Boot State in the list that immediately pops up, and verify that it is set to “On.” If it is off and you are on a work machine, contact your system administrator before changing the setting, as turning it on could make some programs stop working, such as those that rely on AppInit_DLLs.
Basic Static Analysis
Starting off with the files Lab11-02.dll and Lab11-02.ini, the first thing that can be done is simply opening up Lab11-02.ini in a text editor. Doing so reveals:
CHMMXaL@MV@SD@O@MXRHRCNNJBNL
This looks like some sort of ciphertext, but is otherwise at this point useless. Perhaps a more experienced researcher would recognize the algorithm by just viewing it?
To find suspicious strings in the DLL file, I used PE Studio and Sysinternals Strings.exe and flagged:
OpenThread
THEBAT.EXE
THEBAT.EXE
OUTLOOK.EXE
OUTLOOK.EXE
MSIMN.EXE
MSIMN.EXE
send
wsock32.dll
spoolvxx32.dll
AppInit_DLLs
\spoolvxx32.dll
\Lab11-02.ini
ADVAPI32.dll
RCPT TO:
RCPT TO: <
RCPT TO: <
RCPT TO: <
kernel32.dll
Possibilities include network activity, as suggested by wsock32.dll and the send function; thread manipulation via OpenThread (which opens a handle to an existing thread); and possible file creation or activity involving THEBAT.EXE, OUTLOOK.EXE, spoolvxx32.dll and MSIMN.EXE. The RCPT TO strings seemed odd as well.
Kernel32.dll and ADVAPI32.dll were flagged on suspicion that the program may try to spoof these files, but more than likely they are just libraries the program imports from. Also, the backslashes before the ini and spool file names indicated possible file creation or lookup.
Lab11-02.dll lists one export, called “installer.” Since this is a DLL with no accompanying .exe, the assumption at this point is that the installer function is meant to be invoked from the command line to install the malware. The imports of interest include:
Thread32Next, kernel32.dll
Thread32First, kernel32.dll
CreateToolhelp32Snapshot, kernel32.dll
RegSetValueExA, advapi32.dll
Thread32First and Thread32Next allow a program to step through threads and search for one. I didn’t recognize CreateToolhelp32Snapshot, so I searched MSDN and found4:
the ListProcessThreads function takes a snapshot of the currently executing threads in the system using CreateToolhelp32Snapshot, and then it walks through the list recorded in the snapshot using the Thread32First and Thread32Next functions.
In a nutshell, a “snapshot” of the system’s current state can be taken with CreateToolhelp32Snapshot (note the lowercase h in help), and then the threads or processes in it can be enumerated in order to get information and/or search for something. Thread32First and Thread32Next tell us that the malware is probably searching for a thread with a particular characteristic by starting at the first one and enumerating them all until it is found. Also note that several different terms are used for this idea of “checking each item (thread) in the list (snapshot) and moving to the next”: enumerating, walking, scrolling, iterating, etc. MSDN uses enumerate and thread walking to describe this procedure.
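To make the snapshot-and-walk pattern concrete, here is a rough sketch of it in Python using ctypes. This is my own illustration of the three API calls, not code recovered from the malware; the helper name list_thread_ids is made up, and on a non-Windows system it just returns an empty list.

```python
import ctypes
import sys

TH32CS_SNAPTHREAD = 0x00000004  # snapshot flag: include all threads
INVALID_HANDLE_VALUE = ctypes.c_void_p(-1).value


class THREADENTRY32(ctypes.Structure):
    """Layout of the Win32 THREADENTRY32 structure (seven 32-bit fields)."""
    _fields_ = [
        ("dwSize", ctypes.c_uint32),
        ("cntUsage", ctypes.c_uint32),
        ("th32ThreadID", ctypes.c_uint32),
        ("th32OwnerProcessID", ctypes.c_uint32),
        ("tpBasePri", ctypes.c_int32),
        ("tpDeltaPri", ctypes.c_int32),
        ("dwFlags", ctypes.c_uint32),
    ]


def list_thread_ids(owner_pid=None):
    """Walk a thread snapshot and collect thread IDs (hypothetical helper)."""
    if sys.platform != "win32":
        return []  # the Tool Help API only exists on Windows
    kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)
    entry_ptr = ctypes.POINTER(THREADENTRY32)
    kernel32.CreateToolhelp32Snapshot.argtypes = [ctypes.c_uint32, ctypes.c_uint32]
    kernel32.CreateToolhelp32Snapshot.restype = ctypes.c_void_p
    kernel32.Thread32First.argtypes = [ctypes.c_void_p, entry_ptr]
    kernel32.Thread32Next.argtypes = [ctypes.c_void_p, entry_ptr]
    kernel32.CloseHandle.argtypes = [ctypes.c_void_p]

    snapshot = kernel32.CreateToolhelp32Snapshot(TH32CS_SNAPTHREAD, 0)
    if snapshot is None or snapshot == INVALID_HANDLE_VALUE:
        raise ctypes.WinError(ctypes.get_last_error())

    thread_ids = []
    entry = THREADENTRY32()
    entry.dwSize = ctypes.sizeof(THREADENTRY32)  # must be set before the first call
    try:
        more = kernel32.Thread32First(snapshot, ctypes.byref(entry))
        while more:
            # Keep every thread, or only those owned by the requested process.
            if owner_pid is None or entry.th32OwnerProcessID == owner_pid:
                thread_ids.append(entry.th32ThreadID)
            more = kernel32.Thread32Next(snapshot, ctypes.byref(entry))
    finally:
        kernel32.CloseHandle(snapshot)
    return thread_ids
```

Calling list_thread_ids(os.getpid()) on Windows would list the threads of the current process, which is the same kind of filtering on th32OwnerProcessID that a program walking the snapshot would do.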
RegSetValueEx tells us this program writes to the registry.
Advanced Static Analysis
This malware is made up of only one DLL and one ini file, and it is not packed, so we jump straight into IDA with the above clues to see what can be found:
Upon opening the file, DllMain (a DLL file’s equivalent to a “main” function) is actually fruitful and provides some important information:
Looking closely here, the malware first grabs the location of the system folder with a subroutine that I labeled get_sys_dir, which just runs GetSystemDirectoryA. This API call returns the fully qualified path of the Windows system directory, such as C:\Windows\System32 on some systems. Then we see Lab11-02.ini followed by a call to CreateFileA. Don’t get confused: CreateFile is used to open existing files much of the time, not just to create new ones. All of this information has shown us that the program is looking for [System_Root_Directory]\Lab11-02.ini, aka C:\Windows\System32\Lab11-02.ini on many systems. Next we see that if the file is not found, the program exits; if it is, it calls ReadFile:
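In plain terms, the lookup logic described above amounts to something like this sketch. It is my own approximation in Python, not a translation of the disassembly; read_config is a made-up name, and the system directory is passed in as an argument instead of coming from GetSystemDirectoryA.

```python
from pathlib import Path


def read_config(system_dir, name="Lab11-02.ini"):
    """Return the raw bytes of <system_dir>/<name>, or None if it is missing.

    Mirrors the flow seen in DllMain: build the path, try to open the
    existing file, bail out if it is not there, otherwise read it.
    """
    config_path = Path(system_dir) / name
    if not config_path.is_file():
        return None  # the malware exits at this point
    return config_path.read_bytes()
```

On an analysis VM, read_config(r"C:\Windows\System32") would return the ciphertext bytes once the ini file has been copied there.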
This means that a deciphering algorithm is probably near because this area of the program is trying to read that ciphertext that we found earlier: CHMMXaL@MV@SD@O@MXRHRCNNJBNL . If ReadFile is successful, we see a call to a function labeled sub_100010B3:
Jumping into that, we see an interesting function which takes the first char of ciphertext and compares it first to 0x0D, then 0x0A, which are 13 and 10 respectively in decimal. If the first char is neither, then it runs a loop which takes that char and runs one final subroutine on it, before incrementing and looping back around again:
What’s going on here is 0x0D and 0x0A correspond to carriage return and newline5. So this loop is reading “As long as the char is not a newline or carriage return, run the char through sub_10001097, increment and repeat the check.” If this isn’t the telltale sign of a decipher algorithm, I don’t know what is, so we enter the cipher algorithm and find:
That right there is the decipher algorithm, which as you can see, I’ve documented inside the code comments. We learn several things here:
- Several mathematical operations are performed on the key with constants 255, 666, and 4. After this, the altered key is now XOR’d with the input ciphertext char, decrypting it. Thus, the only unknown here is the key since the math operations are all being done with constants and we already have the input chars in the .ini file.
- Fortunately, we’ve been provided with the key in a way-too-easy fashion due to poor design. Remember the last function with the loop? The first argument pushed on the stack and submitted to the decipher algorithm was 0x32, which is the decimal number 50. This is actually the key.
- We have all of the information we need to decrypt the message in Lab11-02.ini manually, or we could even write our own program or script which does it for us. We could open calc, turn it to programmer mode, and then take each char’s hex ascii value and XOR it with the result of (0x32 ANDed with 0x0FF, multiplied by 0x29A, shifted right by 4 bits). That result is actually a constant, 0x821, so we can just take each char of encrypted text from Lab11-02.ini, look up its value in an ascii table, XOR that hex value with 0x821, and keep only the low byte of the result (which makes it equivalent to XORing with 0x21). This will decipher the message manually.
- But none of that is even necessary. We can simply start this malware in a debugger like OllyDbg or x32Dbg, put a breakpoint at the memory address where this decipher algorithm is run, and then just step through it and grab the string after it gets deciphered by looking at the memory. So we let the malware do the work for us.
- Final Lesson: This is why everyone says to programmers never “roll your own” encryption! 🙂
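The points above can be sketched as a short script. This is my own reimplementation from the description, not the malware's code, and the names derive_key and decipher are made up. One assumption to flag loudly: the ciphertext as printed in this post decodes without the dot in the address. The byte that would decode to “.” is 0x0F (0x2E XOR 0x21), a non-printable control character, so I am assuming it is present in the real file between “J” and “B” and simply does not show up in a text editor.

```python
SEED = 0x32  # value pushed before the call to the decipher routine

# Ciphertext exactly as it appears when the .ini is opened in a text editor.
PRINTED = b"CHMMXaL@MV@SD@O@MXRHRCNNJBNL"

# ASSUMPTION: the real file also holds a non-printable 0x0F byte (the
# encrypted '.') and ends with a CR/LF pair, which terminates the loop.
ASSUMED_FILE_BYTES = b"CHMMXaL@MV@SD@O@MXRHRCNNJ\x0fBNL\r\n"


def derive_key(seed=SEED):
    """AND with 0xFF, multiply by 0x29A (666), shift right by 4 bits."""
    return ((seed & 0xFF) * 0x29A) >> 4


def decipher(ciphertext, seed=SEED):
    """XOR each byte with the key until a carriage return or newline."""
    key = derive_key(seed)
    plain = bytearray()
    for byte in ciphertext:
        if byte in (0x0D, 0x0A):  # carriage return / newline ends the loop
            break
        plain.append((byte ^ key) & 0xFF)  # only the low byte (AL) is stored
    return plain.decode("latin-1")


if __name__ == "__main__":
    print(hex(derive_key()))
    print(decipher(PRINTED))
    print(decipher(ASSUMED_FILE_BYTES))
```

Since only the low byte of each result is kept, the whole thing boils down to XORing every character with 0x21.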
The decipher using Advanced Dynamic Analysis
There are several ways we can decipher this text, as mentioned above. One way (the way I used, which is the longer, more tedious one) is to go into IDA and get the address of the decipher algorithm in the previous figure. In this case, that address is at offset 1097. This means that if the file is loaded at its preferred virtual memory location, the address will be 0x10001097 in memory. We open the DLL in OllyDbg (making sure that the ini file and the DLL are in system32 and the DLL is renamed to spoolvxx32.dll), press the hotkey Ctrl+G to “go to” a memory address, and enter this address. We then place a breakpoint on the address and run the program through to the breakpoint. When the program halts, we see the screen depicted in the following diagram:
Here I’ve placed two breakpoints, which are the red highlighted memory addresses on the left. You may recognize the first as the decipher algorithm mentioned above (you can even see the same AND, IMUL, and SAR instructions that we saw in IDA earlier). The second is the start of the loop routine which calls the decipher algorithm. The second breakpoint is not needed; it is only there to illustrate the beginning of the loop. Now, in order to get the deciphered text, we can simply press F8 (step over) several times until the EIP instruction pointer is at the instruction highlighted in grey in the diagram (POP EBP) at offset 10B1. Once we are at this instruction, we look at EAX, which is in the top right corner of the diagram, highlighted in red. In this diagram it holds the number 86C. The loop discards everything above the lowest byte, so we remove the 8 and get 6C. Looking at an ascii table, we see that 6C is the character “l”. If you are confused about this discarding, scroll back up to the decipher loop and look right after the call to sub_10001097; notice the line “mov [ecx], al”. This says “Take EAX, go to the lower 16 bits (AX), and of those lower 16 bits, take the lowest 8 bits,” which is called the “low order byte” or “AL.”
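The register arithmetic in that step can be checked by hand with a few lines of Python. This is only a worked example of the numbers above; the helper name low_byte is mine. Working backwards, 0x86C XOR 0x821 is 0x4D, the letter “M”, so the screenshot was presumably taken while one of the M characters in the ciphertext was being processed.

```python
KEY = 0x821  # constant produced by the AND / IMUL / SAR sequence


def low_byte(eax):
    """Return AL, the lowest 8 bits of a 32-bit EAX value."""
    return eax & 0xFF


eax = ord("M") ^ KEY          # what EAX holds at POP EBP for the char 'M'
al = low_byte(eax)            # what "mov [ecx], al" actually stores
print(hex(eax), hex(al), chr(al))
```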
To decipher the entire message, we can repeat this process: press or hold F8 to step through the loop again and back into the decipher algorithm, stop at the POP EBP, capture the next character, and so on until the message is done. The easier and faster way is to simply place a breakpoint after the entire decipher loop subroutine and capture the “finished product” as ascii text in memory. Notice the grey highlighted line where I’ve placed a breakpoint, as well as the deciphered text at the top right, which is being pointed to by the memory address currently inside EAX. This is how pointers are returned from function calls:
There are many other ways of doing this, but the moral of the story is that we can make the malware do the deciphering for us, or even write our own algorithm to do it, since we have the key.
In conclusion, we see that the encrypted text was an email address, “billy@malwareanalysisbook.com”. See part 2 for the last half of this analysis which will include the AppInit_DLLs/Inline hook portion and explain what the email address is for.