Most of the features from the old version, such as the IRC bot component, are still present. The code for that particular feature has barely changed, and much of the code from the very first variant is still in use. On the other hand, a few new features have appeared, including, but not limited to, memory residency, HTM, PHP, and ASP infection, and pseudo-polymorphism.
I) Comparing Old Code and New
1.1 - Getting the image base
In my previous blog (which you might want to keep open for comparison), I described how the malicious code gets the kernel32.dll image base. The stack was used to get hold of a kernel32 address, and the image base was calculated from there. As I recall, it was using this technique for every file. The newer variants either use the stack or get the address of an imported kernel32 function directly from the import address table (IAT).
First technique: Reading from the stack
Second technique: Reading from the IAT
I first assumed that the malicious code would only use the stack if the host file didn't import kernel32 functions, but some infected samples showed that I was wrong. Files with imported kernel32 functions were infected, and were using the stack technique.
In order to get the image base of the DLL, file infectors usually apply a mask (AND address, 0xFFFFF000, as you can see in red above), and then look for the MZ header. In my previous post, I mentioned how the checks were obfuscated, and didn't use direct references to the "MZ" letters. In the new variants, only the code flow is obfuscated, using garbage instructions, and the checks are very standard, as seen here:
There aren't many garbage instructions shown, because I removed parts to show the real code.
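To make the mask-and-scan idea concrete, here is a minimal Python sketch of it. It is not the virus code: the memory is simulated with a dictionary that maps page addresses to page contents, and the function name, the page-by-page step, and the max_pages limit are my own assumptions for illustration.

```python
from typing import Dict, Optional

PAGE_MASK = 0xFFFFF000  # the mask mentioned above: rounds down to a 4 KB page
PAGE_SIZE = 0x1000


def find_image_base(memory: Dict[int, bytes], address: int,
                    max_pages: int = 256) -> Optional[int]:
    """Walk backwards, one page at a time, until a page starts with 'MZ'.

    `memory` is a stand-in for the process address space (hypothetical):
    it maps a page-aligned address to the bytes stored at that page.
    """
    page = address & PAGE_MASK
    for _ in range(max_pages):
        if page < 0:
            break
        if memory.get(page, b"")[:2] == b"MZ":
            return page
        page -= PAGE_SIZE
    return None


# Illustrative addresses only: a fake module mapped at 0x7C800000.
fake_memory = {0x7C800000: b"MZ\x90\x00", 0x7C80A000: b"\x8b\xff\x55\x8b"}
base = find_image_base(fake_memory, 0x7C80ABCD)
```

Any address inside the module works as a starting point, which is why either a return address on the stack or an IAT entry is good enough for the virus.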
1.2 - The encryption
In the first variant of the virus, the actual decryption loop was hard coded and free of garbage instructions. A simple XOR with a one-byte key was used to decrypt the virus body, as you can see in the figure below:
Apart from the polymorphism, the new decryptor isn't much more advanced. This time, the decryption loop uses the ADD operator and a 32-bit key, which is still very weak. The decryptor code uses random garbage instructions and varies its registers and program flow. None of the techniques are really fancy, and all have been used in many viruses and packers for years. The only benefit for the virus is that every infected file has a different decryptor, which makes it a little harder to detect. One of the major differences between this newer code and its earlier version is that the virus body isn't fully decrypted after the first decryption loop. Parts of it are decrypted later.
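To show why an ADD with a 32-bit key is weak, here is a small Python sketch. It assumes the key stays constant for the whole loop and is applied to little-endian dwords; the post above doesn't confirm either detail, and the function names are mine.

```python
import struct

MASK32 = 0xFFFFFFFF


def add_decrypt(data: bytes, key: int) -> bytes:
    """Add `key` to every little-endian dword of `data`, modulo 2**32."""
    if len(data) % 4:
        raise ValueError("length must be a multiple of 4")
    count = len(data) // 4
    words = struct.unpack(f"<{count}I", data)
    return struct.pack(f"<{count}I", *((w + key) & MASK32 for w in words))


def recover_key(cipher_dword: int, plain_dword: int) -> int:
    """One known plaintext dword is enough: key = plain - cipher (mod 2**32)."""
    return (plain_dword - cipher_dword) & MASK32


# Round trip with a made-up key: "encrypting" is just adding the negated key.
key = 0x12345678
plain = b"MZ\x90\x00ABCD"
cipher = add_decrypt(plain, -key & MASK32)
first_cipher = struct.unpack_from("<I", cipher)[0]
first_plain = struct.unpack_from("<I", plain)[0]
```

Under those assumptions, anyone who can guess four bytes of the decrypted body gets the key with a single subtraction, so the encryption only hides the body from plain pattern matching.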
1.3 - The IRC component
The IRC component code hasn't changed much at all. Parts of it were slightly modified (new PRNG). The next two figures illustrate the changes:
OLD variant - the !get command
Newer variant - the !get command
The only thing that changed here is the way the virus generates a pseudo-random number. Before, it used the RDTSC instruction, whereas now it uses a specific function. For the sake of clarity, here is the new PRNG code:
- We can see a lot of cross references (green text) to the PRNG function. This is because the virus uses a polymorphic engine, where some pseudo-randomness is necessary. This function is used when generating various parts of the code.
- The Download string that you can see in both of the previous screenshots is the actual "UserAgent" that the virus uses when it grabs the files. There is no UserAgent restriction on the malicious sites, however.
1.4 - The file infection
This is a file infector, so what changed on the infection side? To be honest, the PE infection code is pretty much unchanged. It's very similar to the first variant, with new additions for the polymorphism. The infection marker has also changed, but it's placed at the same location (MZ+20h).
Compare the code from the newer variant (shown below) with the code analyzed previously (in the older blog):
I didn't bother with the PE structures in IDA, but you can clearly see it checks for the same things: MZ (MZ header), PE (PE header), 2000h (Characteristics), and 2 (Subsystem). If a file doesn't match those mandatory characteristics, it won't be infected.
Here is another snippet, where you see the code responsible for adding the infection marker to the infected file, updating the size of image, and a few other fields. Right after that, you see the code that determines how the virus takes over the host (PE header modification or jump inserted at host entry point):
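For readers who want to look at the same four fields in their own samples, here is a rough Python sketch that reads them from a file image. The offsets come from the PE format itself (e_lfanew at 3Ch, Characteristics 22 bytes after the PE signature, Subsystem 68 bytes into the optional header). The function name is hypothetical, and the sketch only reports the values; exactly how the virus combines them is visible in the disassembly, not here.

```python
import struct
from typing import Optional

IMAGE_FILE_DLL = 0x2000           # the 2000h bit in Characteristics
IMAGE_SUBSYSTEM_WINDOWS_GUI = 2   # the value 2 in Subsystem


def read_checked_fields(image: bytes) -> Optional[dict]:
    """Return the fields the infection routine looks at, or None if not a PE."""
    if len(image) < 0x40 or image[:2] != b"MZ":
        return None
    (e_lfanew,) = struct.unpack_from("<I", image, 0x3C)
    if e_lfanew + 24 + 70 > len(image):
        return None
    if image[e_lfanew:e_lfanew + 4] != b"PE\x00\x00":
        return None
    (characteristics,) = struct.unpack_from("<H", image, e_lfanew + 22)
    (subsystem,) = struct.unpack_from("<H", image, e_lfanew + 24 + 68)
    return {
        "dll_bit": bool(characteristics & IMAGE_FILE_DLL),
        "subsystem": subsystem,
    }


# A synthetic, mostly empty image, just large enough to hold the headers.
sample = bytearray(0x200)
sample[:2] = b"MZ"
struct.pack_into("<I", sample, 0x3C, 0x80)
sample[0x80:0x84] = b"PE\x00\x00"
struct.pack_into("<H", sample, 0x80 + 22, 0x0102)
struct.pack_into("<H", sample, 0x80 + 24 + 68, IMAGE_SUBSYSTEM_WINDOWS_GUI)
fields = read_checked_fields(bytes(sample))
```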
You may have noticed in the first figure in this section that the code checks for a flag. A flag is set if the file is HTML. This leads to the second part of this blog.
II) New Features (It's not waterproof yet :-)
2.1 - Web page infection
Among the new features is the ability to infect Web pages on the local machine. Whenever the file infector has access to a file on the hard drive, it checks whether the file is EXE, SCR, HTM, PHP, or ASP, and then acts accordingly. For the PE files, the code discussed above is used for the infection. For HTML pages, the virus actually injects an iframe at the very end of the page:
NOTE: Just before the actual iframe code, we can see a string used in the virus. This isn't added to Web pages, but to the hosts file. Since the machine is already infected, the virus author doesn't want the machine to be infected again, and therefore blocks access to the malicious page with the hosts file modification.
Here is the code responsible for the extension checks:
Here is the code responsible for the Web page infection:
The first thing the virus does is scan the Web page for the malicious domain, in order to know whether the file has already been infected. If the page isn't already infected, the virus looks for the </BODY> tag. The malicious iframe is inserted in the document just before that tag. Here is an example of an infected page:
The decrypted script is a page full of exploits. Here is a snippet of decrypted content:
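If you want to check your own pages for this kind of tampering, the logic described above can be turned around into a small detector. The Python sketch below is my own illustration, not the virus code: it sorts files by the extensions listed earlier (I assume a case-insensitive match) and reports the source of any iframe that sits immediately before the closing body tag. The regular expression is deliberately simple and will miss unusual markup.

```python
import re
from pathlib import PurePath
from typing import Optional

PE_EXTENSIONS = {".exe", ".scr"}
WEB_EXTENSIONS = {".htm", ".php", ".asp"}

# An <iframe ... src=...> whose next tag (after an optional </iframe>) is </body>.
IFRAME_BEFORE_BODY_END = re.compile(
    r"<iframe\b[^>]*\bsrc\s*=\s*[\"']?([^\"'\s>]+)[^>]*>"
    r"\s*(?:</iframe>)?\s*</body>",
    re.IGNORECASE,
)


def classify(path: str) -> str:
    """Return 'pe', 'web' or 'other' based on the file extension only."""
    ext = PurePath(path).suffix.lower()
    if ext in PE_EXTENSIONS:
        return "pe"
    if ext in WEB_EXTENSIONS:
        return "web"
    return "other"


def iframe_before_body_end(html: str) -> Optional[str]:
    """Return the src of an iframe placed right before </body>, if any."""
    match = IFRAME_BEFORE_BODY_END.search(html)
    return match.group(1) if match else None


# example.invalid is a placeholder, not the domain used by the virus.
suspicious = ('<html><body><p>hi</p><iframe src="http://example.invalid/x" '
              'width=1 height=1></iframe></body></html>')
clean = "<html><body><p>hi</p></body></html>"
```

A hit is only a hint, since legitimate pages can also end with an iframe; compare the reported source against the domains you actually expect on your site.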
2.2 - Memory residence
In the first version of Virut, the infection was performed by infected files. As you can see in the code snippet from my previous analysis, the infection code was part of the virus body and was executed. With the newer variants, the infection code is injected into almost all of the running processes (it skips the first four). The virus injects itself and hooks several of their functions: ZwCreateFile, ZwOpenFile, ZwCreateProcess, ZwCreateProcessEx, and ZwQueryInformationProcess.
Here is the code responsible for the hooking:
Before hooking and injecting code, the virus calls LookupPrivilegeValueA and then ZwAdjustPrivilegesToken. Right after that, the CreateToolhelp32Snapshot function is called to get the running processes, along with Process32First and Process32Next. The first four processes are skipped, and then all the other processes are injected (including Winlogon and explorer.exe).
In order to do this, the OpenProcess function is called on the Process ID, followed by calls to ZwProtectVirtualMemory and ZwWriteVirtualMemory. Eventually, a call to CreateRemoteThread is made to execute the malicious payload in the infected process.
This is what a Zw* function looks like once it has been hooked (in comparison with a non-hooked function):
And this is the hook function:
When a mov eax, xxx is overwritten by a CALL, the original instruction can be found in the hook function, which is specific to the functions it hooks, and performs different actions. Some of them lead to the infection code (such as the code that checks for file extensions), and this is how the infection occurs.
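The difference between a clean and a hooked stub is easy to test for once you have the first bytes of the function. Here is a hedged Python sketch of that check; it assumes 32-bit stubs, where a clean function starts with opcode B8h (mov eax, imm32) and a hooked one starts with E8h (call rel32). The function name and the addresses in the example are made up for illustration.

```python
import struct
from typing import Optional, Tuple


def inspect_stub(stub: bytes, address: int) -> Tuple[str, Optional[int]]:
    """Classify the first instruction of a Zw* stub located at `address`.

    Returns ("clean", service_number), ("hooked", call_target)
    or ("unknown", None).
    """
    if len(stub) < 5:
        raise ValueError("need at least 5 bytes")
    opcode = stub[0]
    if opcode == 0xB8:  # mov eax, imm32
        (service,) = struct.unpack_from("<I", stub, 1)
        return ("clean", service)
    if opcode == 0xE8:  # call rel32, relative to the next instruction
        (rel,) = struct.unpack_from("<i", stub, 1)
        return ("hooked", (address + 5 + rel) & 0xFFFFFFFF)
    return ("unknown", None)


# Illustrative bytes and addresses, not taken from a real sample.
clean_stub = bytes.fromhex("b825000000ba0003fe7fff12c22c00")
hooked_stub = b"\xe8" + struct.pack("<i", -0x95) + clean_stub[5:]
clean_result = inspect_stub(clean_stub, 0x7C90D090)
hooked_result = inspect_stub(hooked_stub, 0x7C90D090)
```

Both instructions are five bytes long, which is why the hook can replace the mov in place without disturbing the rest of the stub.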
III) Other Miscellaneous Information
3.1 - Network traffic
When an infected program is executed, it starts injecting its code into the running processes. One of them then connects to an IRC server to get the files to download and execute on the machine. Interestingly, the same domain has been used since the very first version of the virus, more than a year ago:
Here you can see how commands received over IRC lead to the first connection, and then two other connections to different sites:
3.2 - Return to host
As I said earlier, the actual infection isn't done by an infected executable, but by the code it injects in remote processes. This means that once the code has been injected, the virus will eventually jump back to the host code in order to execute it. That's how real viruses work.
There are two ways for the virus to achieve this, depending on how the file got infected. In some cases, the entry point information in the PE header is modified to point to the last section, where the virus resides. In others, the PE header isn't modified, but the application code at entry point is overwritten by a jump to the virus code (still in the last section). In the latter case, just before returning to the host entry point, the virus patches back the old code and jumps to the freshly-patched code. This way, the host application still executes normally. Obviously, the code was only restored in memory, and therefore whenever the application is started again, the virus code takes over before the infected application.
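Both takeover methods leave a trace that a simple heuristic can look for. The Python sketch below is only an illustration under my own assumptions: the section table has already been parsed into (name, rva, size) entries, the inserted jump is a five-byte E9h near jump (the post above doesn't say which encoding is used), and all names are hypothetical. Many legitimate packers also start in the last section, so treat a hit as a reason to look closer, not as a verdict.

```python
import struct
from typing import List, NamedTuple, Optional


class Section(NamedTuple):
    name: str
    rva: int
    size: int


def in_last_section(rva: int, sections: List[Section]) -> bool:
    """True if `rva` falls inside the section with the highest RVA."""
    last = max(sections, key=lambda s: s.rva)
    return last.rva <= rva < last.rva + last.size


def takeover_kind(entry_rva: int, entry_bytes: bytes,
                  sections: List[Section]) -> Optional[str]:
    """Return 'header' if the entry point itself is in the last section,
    'jump' if the entry point starts with a jump into it, else None."""
    if in_last_section(entry_rva, sections):
        return "header"
    if len(entry_bytes) >= 5 and entry_bytes[0] == 0xE9:  # jmp rel32 (assumed)
        (rel,) = struct.unpack_from("<i", entry_bytes, 1)
        if in_last_section(entry_rva + 5 + rel, sections):
            return "jump"
    return None


# Made-up layout: code at 1000h, last section at 6000h.
layout = [Section(".text", 0x1000, 0x5000), Section(".rsrc", 0x6000, 0x3000)]
jump_bytes = b"\xe9" + struct.pack("<i", 0x7000 - (0x1000 + 5))
```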
This is how the restoration is done:
Many aspects of the Virut virus have changed, making newer variants much more effective. The fact that it infects running processes makes it very virulent. If you move a file that matches the requirements in the infection code onto an infected machine, it is instantly infected. The virus also uses the SFC functions to make sure Windows won't pop up an error message if a Windows file is infected. The fact that it infects Web pages makes it even more virulent, as Webmasters could, and probably do, upload infected htm/asp/php pages, leading to various exploits that target their visitors.
Researcher: Nicolas Brulez, Websense Security Labs