Unscrambling Custom Obfuscation and Executable "Infection"

03.12.2008 - 5:56 PM
One of the most important goals for any malware is to find a way to stay alive when the operating system restarts. Most malware will simply change the registry or change some ini files to stay alive. But these types of changes are easily detected by the average computer user.

Every now and then, we see malware using different techniques to restart and remain undetected by users. Instead of changing the registry, these programs read it, locate applications starting at boot time, and then infect those applications by appending code to them.

The appended code usually doesn't do anything malicious; it simply restarts the real malware component. Last year, we wrote a blog entry about a sample using such techniques. You can read it here

The sample described below not only uses those techniques, but also uses some tricks to prevent emulation and slow down analysis.

Custom UPX scrambler: Static Analysis

This malware was first packed with UPX and then scrambled by a custom tool to prevent automatic unpacking with upx -d. (For readers who aren't familiar with UPX, it's an open source executable compressor, and its "-d" command-line option unpacks UPXed files to produce the original executable.)

The scrambler used is probably shared on some underground forums. Like most of the scramblers we see, this one is weak and poorly written. The encryption (it burns my fingers to call such a poor algorithm encryption) is a simple XOR with a constant, with a minor twist. There is no obfuscation (except for a few opcodes used here and there as junk instructions), and no anti-debugging code.

Here is the entry point of the scrambled file:

We take a quick look into the disassembly and notice an interesting call:

If you compute this simple addition, you get 0x40A2D0. This is the content of EBP-10h, and the location to be called. At this point, you are at the start of a standard UPX stub. We could put a hardware breakpoint at this address to trace UPX and unpack it.

Rather than doing that, we are going to reverse the decrypting stub without even running our sample. To make our analysis easier, we first rename the stack variables, giving them meaningful names that reflect the values they hold, as you can see in the screenshot below:

It XORs bytes one by one, from 0x408000 to 0x40A426, with a key that gets modified using the loop index.

The whole decryption algorithm can be translated to IDC (IDA's scripting language) and run inside IDA, so the sample never has to be executed, as you can see below:

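#include <idc.idc>

// Statically descramble the UPX-packed data inside the IDB.
static main()
{
  auto AddressToDecrypt, MyByte, i, Size;

  AddressToDecrypt = 0x408000;  // first scrambled byte
  Size = 0x2426;                // last scrambled byte is 0x40A426 (inclusive)

  for (i = 0; i <= Size; i++)
  {
    // The key starts at 0xA543 and grows with the loop index;
    // only its low byte can affect the patched byte, so mask the result.
    MyByte = (Byte(AddressToDecrypt + i) ^ (0xA543 + i)) & 0xFF;
    PatchByte(AddressToDecrypt + i, MyByte);
  }
}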
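For readers without IDA at hand, the same transformation can be sketched in Python on a raw byte buffer. This is only an illustration of the loop described above, not the author's tool: the function name `descramble` is hypothetical, and it assumes that only the low 8 bits of the growing key reach each byte, which is what patching a single byte implies.

```python
def descramble(data: bytes, key: int = 0xA543) -> bytes:
    """XOR every byte with the low 8 bits of (key + index).

    Assumption: the key grows by one per byte, as in the IDC script,
    and only its low byte is applied to the data.
    """
    return bytes(b ^ ((key + i) & 0xFF) for i, b in enumerate(data))


if __name__ == "__main__":
    # XOR is its own inverse, so applying the routine twice restores the input.
    original = b"UPX!"
    scrambled = descramble(original)
    print(scrambled.hex(), descramble(scrambled))
```

Because XOR is symmetric, the same function both scrambles and descrambles, which is one reason this kind of protection is so weak.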

This script could be simplified, but it doesn't really matter, as long as it does the job for us. A quick way to get an unobfuscated UPX file is to use the pe_scripts from the IDA Web site.

http://www.hex-rays.com/idapro/freefiles/pe_scripts.zip

Open the scrambled file in IDA, run pe_sections.idc from the zip above to load every section into the IDB, and then execute the descrambling script from this blog entry. At this point, the whole file is descrambled in your IDB. You need only run pe_write.idc from the archive to save the descrambled executable to disk.

All that is left to do now is to change the entry point to what we calculated earlier, which is 0xA2D0. (It's an RVA in the PE Header.)
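The entry point fix can also be scripted. The sketch below is a minimal illustration that assumes an image base of 0x400000 (inferred from the two addresses quoted in this entry, 0x40A2D0 and 0xA2D0) and uses the standard PE layout, where AddressOfEntryPoint sits 16 bytes into the optional header. The helper names `va_to_rva` and `set_entry_point` are hypothetical.

```python
import struct

ASSUMED_IMAGE_BASE = 0x400000  # assumption: inferred from 0x40A2D0 -> 0xA2D0


def va_to_rva(va: int, image_base: int = ASSUMED_IMAGE_BASE) -> int:
    """Convert a virtual address to a relative virtual address."""
    return va - image_base


def set_entry_point(image: bytearray, rva: int) -> int:
    """Overwrite AddressOfEntryPoint in place and return the old value."""
    (e_lfanew,) = struct.unpack_from("<I", image, 0x3C)
    if image[e_lfanew:e_lfanew + 4] != b"PE\x00\x00":
        raise ValueError("not a PE image")
    # 4-byte signature + 20-byte file header, then 16 bytes into the optional header.
    field = e_lfanew + 24 + 16
    (old,) = struct.unpack_from("<I", image, field)
    struct.pack_into("<I", image, field, rva)
    return old
```

A typical use would be to read the descrambled file into a `bytearray`, call `set_entry_point(image, va_to_rva(0x40A2D0))`, and write the buffer back out before running upx -d.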

Now you can use UPX itself to unpack the file (upx -d), and you get a file that is 100% identical to the one the malware author had before he or she started packing and scrambling it.

Now, let's continue the analysis.

Analyzing our unpacked sample

Upon first run, the sample checks whether it has the command line parameter "1". If this isn't the case, it restarts itself, using "ShellExecuteA" with the correct parameter, and then exits. This is probably a trick to defeat emulators and sandboxes, although some emulators detect this sort of trickery and have no problem monitoring the second process.

The second instance of our sample then does another anti-emulator trick:

First, the code gets the kernel32 image base and passes it to GetProcAddress, asking for a function that isn't exported. On a real system, GetProcAddress therefore returns 0. The problem is that some emulators (more than one might think) return successful values (non-zero in this case) when emulating API functions (or pretending to), even if the parameters are bogus.

On the other hand, chances are great that such an important function is emulated properly, and very few emulators should be affected.

If the return value is non-zero (which means a bad emulator is running through the code), the malware simply exits; otherwise, the malware dynamically loads a lot of imports to resolve the functions it needs. This behavior is becoming very common, and the import table doesn't show much information anymore. This might slow down inexperienced analysts, because the API function calls don't show up nicely in the disassembly.
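The logic of this check can be modeled in a few lines of Python. This is a toy simulation, not code from the sample: the export table, its addresses, and the resolver names are all made up for illustration, and the "sloppy" resolver stands in for an emulator that hands back a fake pointer for any name.

```python
from typing import Callable, Dict

# Hypothetical export table; names are real APIs, addresses are invented.
FAKE_EXPORTS: Dict[str, int] = {"LoadLibraryA": 0x1000, "GetProcAddress": 0x2000}


def faithful_get_proc_address(name: str) -> int:
    """Behaves like the real API: returns 0 (NULL) for an unknown export."""
    return FAKE_EXPORTS.get(name, 0)


def sloppy_get_proc_address(name: str) -> int:
    """Models a careless emulator: always returns some non-zero pointer."""
    return FAKE_EXPORTS.get(name, 0xDEAD0000)


def looks_emulated(resolver: Callable[[str], int]) -> bool:
    """The sample's test: a non-existent export must resolve to 0."""
    return resolver("ThisFunctionIsNotExported") != 0
```

Here `looks_emulated(faithful_get_proc_address)` is false and `looks_emulated(sloppy_get_proc_address)` is true, which is the condition under which the malware exits.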

In the past, most malware would have the import table filled with the needed functions, and only the very suspect ones were loaded dynamically.

The malware then performs several different tasks, such as making a copy of itself in the system32 directory under a randomly generated name, or checking for the Scheduler service (and starting it if necessary). It also sets the service startup to AUTO, in order to survive reboots. It then creates 24 jobs (one per hour), and every one of these jobs executes the copy of the malicious file that was created in the system32 directory.

Among other tasks, our malware is also going to read the registry and look for applications listed in Software\Microsoft\Windows\CurrentVersion\Run, and append code to them. Here is what it does to infect/backdoor genuine applications, to get them to restart the main malware file without any registry modification. Even if the jobs are deleted, the malware can survive a reboot.

The malicious program reads the registry to locate a file to infect, and then makes a backup copy of it. The file name is kept; only the last character of the file extension is replaced by an underscore. For example: the backup for myfile.exe is myfile.ex_
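This naming rule is simple enough to turn into a quick triage helper. The sketch below assumes the rule exactly as described (last character of the name replaced by an underscore); the function names are hypothetical, and the comparison is case-sensitive for simplicity, whereas Windows file names are not.

```python
from typing import Iterable, List, Tuple


def backup_name(path: str) -> str:
    """Return the backup name: the last character becomes an underscore."""
    if not path:
        raise ValueError("empty path")
    return path[:-1] + "_"


def find_backup_pairs(names: Iterable[str]) -> List[Tuple[str, str]]:
    """List (file, backup) pairs where both names are present."""
    names = list(names)
    present = set(names)
    return [
        (name, backup_name(name))
        for name in names
        if not name.endswith("_") and backup_name(name) in present
    ]
```

Run over a directory listing, `find_backup_pairs` points at executables that have a sibling backup, which makes them good candidates for a closer look.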

The original file is then opened with file mapping functions, and the malware makes a few checks. Infected files have a special marker "!" (exclamation point) at the very end of the file (last byte). If the marker is present, the file is seen as already infected and is skipped. If the marker isn't present, two very classic checks are done to make sure the file is a Win32 PE file. (It checks whether the first two bytes are MZ, and whether the PE header starts with PE\0\0.)
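The same three checks are easy to reproduce when triaging suspect files. The following is a minimal sketch under the assumptions stated in the text (marker in the last byte, MZ at offset 0, PE signature at the offset stored at 0x3C); the function names are hypothetical.

```python
import struct

INFECTION_MARKER = b"!"  # last byte of an infected file, per the analysis


def is_already_infected(data: bytes) -> bool:
    """True if the file ends with the marker byte."""
    return data.endswith(INFECTION_MARKER)


def looks_like_pe(data: bytes) -> bool:
    """Check the MZ magic and the PE signature that e_lfanew points to."""
    if len(data) < 0x40 or data[:2] != b"MZ":
        return False
    (e_lfanew,) = struct.unpack_from("<I", data, 0x3C)
    # An out-of-range offset yields an empty slice, so the comparison fails.
    return data[e_lfanew:e_lfanew + 4] == b"PE\x00\x00"
```

A single trailing "!" is a weak indicator on its own, since a clean file could end with that byte by coincidence, so it is best combined with other signs such as the backup copy.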

If all checks pass, it then parses the PE header and looks for the last IMAGE_SECTION_HEADER structure, because it needs this information in order to alter it. It then starts altering it, as you can see below (all the structures were obviously added by hand, in order to make the code more readable):
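Locating that last section header follows directly from the PE layout. The sketch below only reads the structure so an analyst can compare an infected file with its backup; it does not reproduce the sample's modifications, and the function name and returned dictionary keys are hypothetical.

```python
import struct

SECTION_HEADER_SIZE = 40  # sizeof(IMAGE_SECTION_HEADER)


def last_section_header(data: bytes) -> dict:
    """Parse the last IMAGE_SECTION_HEADER of a PE image."""
    (e_lfanew,) = struct.unpack_from("<I", data, 0x3C)
    # File header fields, relative to the "PE\0\0" signature.
    (number_of_sections,) = struct.unpack_from("<H", data, e_lfanew + 6)
    (size_of_optional_header,) = struct.unpack_from("<H", data, e_lfanew + 20)
    if number_of_sections == 0:
        raise ValueError("image has no sections")
    # The section table starts right after the optional header.
    first = e_lfanew + 24 + size_of_optional_header
    offset = first + (number_of_sections - 1) * SECTION_HEADER_SIZE
    name, virtual_size, virtual_address, raw_size, raw_pointer = struct.unpack_from(
        "<8sIIII", data, offset
    )
    (characteristics,) = struct.unpack_from("<I", data, offset + 36)
    return {
        "name": name.rstrip(b"\x00").decode("ascii", "replace"),
        "virtual_size": virtual_size,
        "virtual_address": virtual_address,
        "size_of_raw_data": raw_size,
        "pointer_to_raw_data": raw_pointer,
        "characteristics": characteristics,
    }
```

Comparing the output for a file and its backup copy shows at a glance which fields of the last section were changed.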

What happens now is that the malicious code first stores the addresses of LoadLibraryA and GetProcAddress into a little asm stub inside the malware, and then starts copying it into the host file. Eventually, this process ends with the header modifications. Yes, the function addresses are hard-coded into the infected application, and are most likely to fail if those files are ever executed on a different OS, with different DLLs. (I guess it doesn't really matter to the malware author, because the main goal here is to execute the main component when the OS boots, and the infected files are not likely to be moved around.)

The malicious software also updates the host entry point so that it points to the newly appended stub, in order to take over at execution time. Once the main malware component has been executed, the host program starts, thanks to a jmp. Funnily enough, the placeholder for the pointer that gets patched at infection time is set to 0xDEADBEEF. (This alone gives some information about the sort of author behind the code, or at least, I like to believe so.)

Here is the stub before it gets modified and injected:

The malware author doesn't seem to be very familiar with position-independent code, judging by the mini injected stub (no delta offset is used to reference strings directly; instead, some dodgy additions are used here and there, and so forth). This might explain why the malware author didn't place any dynamic API resolution code into the stub either.

You might also have noticed that the path is going to be injected there, too. If the path is too long, it's going to overwrite the assembly code and corrupt it. (I don't remember any code checking the size of the path.) The space reserved for it (twice the alphabet + 1, or 53 characters for a 26-letter alphabet) is smaller than the maximum length of a Windows file path (MAX_PATH is 260 characters). On the other hand, an overwrite probably rarely happens, because most people don't rename their Windows directory using very long names.

Anyway, the injected stub does only a ShellExecuteA to run the malware file that was copied into the system32 folder under a random name. When the system starts, infected files execute the injected stub, and then execute host applications. The malware is therefore able to start without any obvious registry changes.

I hope you enjoyed this journey into a malicious application.

Security Researcher: Nicolas Brulez
