As part of the Zero2Automated challenge, which is now distributed every 3 weeks, the goal was to develop a script capable of extracting IP addresses from any Danabot sample from the same campaign.
Challenge
Unfortunately, after 2 weeks of regular work, I was unable to develop a method to extract these IPs from all the samples. I focused on retrieving them from a single sample, which left me no time to develop a universal extraction method.
I will detail the binary analysis and extraction method for a single sample.
Initial Access
The challenge file is in VBS format. It is an obfuscated script that needs to be decoded to extract the payload. This script is not really part of the challenge, but it would be interesting to understand it.
This script is composed of several arrays containing variables. These are defined at the end of the file. (Yes, in VBS, you can define variables after using them in the code…)
The file also contains unnecessary code that does not interfere with its analysis.
After deobfuscating, analyzing, and renaming the different functions and parameters, the logic turns out to be simple, and the code is very easy to understand.
In summary, the arrays in this script hold the data of the file to be dropped. The script starts by displaying a fake error message to make the user believe that the program will not run. Then, it creates a file in the Temp folder, decrypts the content of the arrays, and writes it into this file. Finally, it executes the newly created file (which turns out to be a DLL).
Loader?
By analyzing the imports of this DLL, I cannot find any APIs related to connections to an external server. Using the wonderful capa plugin for IDA, I only find functions responsible for resolving APIs.
So, I read the MSDN documentation and placed a breakpoint on each function that could either initialize a connection or send a request, for example:
- WSAStartup
- WinHttpOpen
- socket
- WSASend
- WinHttpSendRequest
But none of the breakpoints were hit, and the program exited each time.
I then looked into a possible anti-analysis function, with a breakpoint on IsDebuggerPresent, but found nothing. I realized that IPs were still being contacted even after closing the program.
There is another executable being launched, so I placed a breakpoint on the CreateProcessW API, and indeed, the DLL is relaunched via rundll32.exe.
Main Binary
By analyzing the relaunched DLL, you can see that it contains many more functions, imports, and exports than it did before it was launched. This makes me think that data was injected into the DLL at runtime.
We also notice imports from ws2_32.dll (Windows Sockets 2), confirming that this program is capable of contacting an external IP.
When debugging the program, I noticed that just before the WinHttpSendRequest call (which sends a request to a server), a function produces IP addresses piece by piece. I delved into its analysis.
IP Resolution Function
Here is the pseudocode generated by IDA, which I modified after analyzing this function with a debugger. This function takes a DWORD as an argument and returns one part of an IP address.
The DWORD passed as an argument is important in this function because it determines the decrypted IP. Once I understood the function, I went in search of this famous DWORD to develop my script and automatically decrypt the IPs.
In the meantime, I did some research and came across a tweet (unfortunately, I don’t have the link) that discussed the different IP formats a browser understands. Hexadecimal was one of the formats mentioned:
For example, you can access Google with this URL:
https://8efb2524
I entered one of the DWORDs in my browser to see what it looked like, and indeed, it was an IP. So, I analyzed a function that converts the hexadecimal value to decimal so that it can be understood by WinHttp…
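The same hexadecimal-to-decimal conversion can be reproduced outside the malware. This is a minimal sketch, not the sample's own routine: the function names are hypothetical, and the value used is the Google example above rather than a DWORD from the sample. It assumes the most significant byte of the DWORD is the first octet; a second helper covers the case where the value was read from memory in little-endian order.

```python
import ipaddress


def dword_to_ip(dword: int) -> str:
    """Render a 32-bit value as dotted-decimal, most significant byte first."""
    if not 0 <= dword <= 0xFFFFFFFF:
        raise ValueError("expected a 32-bit unsigned value")
    return str(ipaddress.IPv4Address(dword))


def dword_to_ip_le(dword: int) -> str:
    """Same conversion, after swapping the byte order of the DWORD."""
    swapped = int.from_bytes(dword.to_bytes(4, "little"), "big")
    return dword_to_ip(swapped)


if __name__ == "__main__":
    # 0x8e = 142, 0xfb = 251, 0x25 = 37, 0x24 = 36
    print(dword_to_ip(0x8EFB2524))  # prints 142.251.37.36
```

If a DWORD taken from a debugger gives an implausible address with the first helper, trying the byte-swapped variant is a cheap sanity check.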
Shellcode
While looking for where the IPs were stored, I came across this function. It is a switch that takes two parameters: a number, which turns out to be a counter that decides which IP will be chosen, and a memory page in which to store the DWORD representing the IP.
The values are retrieved from the stack at the function’s prologue. From there, I didn’t understand where these IPs came from.
I then noticed that the “unpacked” DLL already contained the IPs in its data. So, I looked into how they were written.
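One way to confirm that a known IP is present in a dump of the unpacked DLL is to search the raw bytes for its 4-byte encoding in both byte orders. The sketch below is only an illustration of that idea, not the method I used during the challenge: `find_ip_offsets` is a hypothetical helper, the buffer is synthetic, and the address is the Google example from earlier, not one of the C2 IPs.

```python
import ipaddress


def find_ip_offsets(blob: bytes, ip: str) -> dict:
    """Return the offsets where the 4-byte form of `ip` occurs in `blob`.

    "big" holds matches for network byte order (first octet first),
    "little" holds matches for the reversed byte order.
    """
    packed = ipaddress.IPv4Address(ip).packed
    result = {}
    for label, needle in (("big", packed), ("little", packed[::-1])):
        offsets = []
        start = blob.find(needle)
        while start != -1:
            offsets.append(start)
            start = blob.find(needle, start + 1)
        result[label] = offsets
    return result


if __name__ == "__main__":
    # Synthetic buffer standing in for a memory dump (assumption for the demo).
    dump = (
        b"\x00" * 4
        + bytes.fromhex("8efb2524")
        + b"\xff" * 3
        + bytes.fromhex("2425fb8e")
    )
    print(find_ip_offsets(dump, "142.251.37.36"))
```

On a real dump, the same call would take the bytes read from the dumped file; the offsets then point to where a hardware or memory breakpoint could be set to catch the code that writes them.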
I then set a breakpoint at the location where the IPs were written. That’s when I came across a shellcode.
It is responsible for decrypting content to unpack the DLL, and in this content, we find the IPs needed to contact the C2 server.
While trying to understand this shellcode, I was unable to figure out what data was being decrypted, how, or with what parameters. Analyzing it in detail would have taken me too much time.
Failure
So, I was unable to complete this challenge. I did find the IPs in the binary and their location! However, I did not find a technique to retrieve these IPs in all Danabot samples.
Perhaps it’s because I wasted a lot of time overlooking the fact that the DLL was launched a second time, analyzing the VBS script that was not part of the challenge, or analyzing the function that converts an IP from hexadecimal to dotted-decimal format…
Nevertheless, I had a blast trying to understand this challenge, which gave me a lot of trouble. This failure makes me realize that I still have a long way to go in this field and motivates me even more to keep learning.
For the upcoming season, I plan a series of 3 live streams in which I will have fun solving CTFs related to reverse engineering and malware analysis on Twitch!