Gumblar - An Analysis and History

05.21.2009 - 11:00 AM

Websense Security Labs™ has been tracking the Gumblar (Gumblar.cn/Martuz.cn, Troj/JSRedir-R) threat and its rise to infamy for some time now. Several facts about this threat make it unusual and newsworthy. The main headline grabber is that it is one of the top threats being seen today, and that it has become very widespread in a relatively short amount of time.

A Little History

Before there was gumblar.cn or martuz.cn, there were 78.110.175.249 and 94.247.2.195, which we have classified as Malicious since 2009-03-08 and 2009-04-02 respectively. These IPs were (and still are) involved in a mass compromise attack pointing to IP/jquery.js, which peaked on 2009-04-25 at approximately 17,000 compromised sites.

Why is this history relevant to the Gumblar attack? Looking at the injected JavaScript, you can see clear similarities between the Gumblar attacks and these older, less sophisticated ones. The main similarity is that both use some form of URI-escaped values mixed in with random characters, which are removed using the JavaScript replace() function. That alone wasn't enough to link the two attacks, however, until we saw the 94.247.2.195/jquery.js pages starting to point to 94.247.2.195/news/?id=. This URL is the predecessor to the gumblar.cn/rss/? pages, and the similarities are evident in its structure, style, payload, and techniques.
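To make the shared technique concrete, here is a minimal sketch of undoing that kind of obfuscation. The junk character ('!') and the encoded string are made up for illustration and are not taken from either injection. It also uses decodeURIComponent() in place of the older unescape() call; for plain ASCII escapes like these the two give the same result.

```javascript
// Hypothetical sample: the word "document" percent-encoded, with '!'
// sprinkled in as filler. Neither value comes from the real injected code.
const JUNK = /!/g;
const obfuscated = "%6!4%6F!%63%7!5%6D%!65%6E%74";

function stripAndUnescape(text, junkPattern) {
  // Step 1: remove the filler characters with replace().
  // Step 2: decode the remaining %XX escape sequences.
  return decodeURIComponent(text.replace(junkPattern, ""));
}

console.log(stripAndUnescape(obfuscated, JUNK)); // prints "document"
```

The decoded text is only printed here. The real injection passes the result on to be executed, which an analyst would want to avoid.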

Screenshot of 94.247.2.195/news/?id=100:

The JavaScript Injection

The Gumblar attacks take the injected JavaScript code a little further than their predecessor: the injection also includes some code to determine the browser type. The number of variations of the injected code also seems to have increased, which is an obvious step to evade detection by security solutions.

Screenshot of Injected Code:

Screenshot of Deobfuscated Injected Code:

The Destination Page

The destination page that you are redirected to serves up different versions of the malicious content. It's not clear whether this is because the attackers are constantly changing the pages by hand, or because they have a randomizer built into their server-side code that intentionally serves a different version on every request. There are three main parts to the page:

  1. Large Array named 'a' with numeric values
  2. An eval(unescape().replace()) piece of code
  3. A function named 'ttt()' which just returns a large string

Screenshot of Array gumblar.cn/rss/?id=568820:

The above screenshot is of the beginning of the source code for one of the pages served by gumblar.cn. You can see that it's very similar to the source of the 94.247.2.195 page shown earlier. These numeric values range from single digit values up to three digit values in the 100s. This is where the page stores its obfuscated contents.

Screenshot of eval(unescape().replace()) piece of code from gumblar.cn/rss/?id=568820:

Deobfuscated and formatted output of above before eval:

    var s="", k=0, u="nOAkxkR7orNyD5dSZ4pst";
    for (i=0; i<a.length; i++) {
        s += String.fromCharCode(a[i] ^ u.charCodeAt(k));
        if ((++k) >= u.length) k = 0;
    }
    eval(s);

As you can see, the purpose of the eval(unescape().replace()) piece of code is to deobfuscate itself: replace() strips out the obfuscation characters that were inserted, and unescape() then decodes the escaped character sequences. Once that is done, the outer eval is called to run the resulting code. In this deobfuscation routine the variable 'u' stores a string of characters that are XORed with the values in the 'a' Array. Each character of 'u' is converted to its numeric character code with charCodeAt(), and the key starts over from its first character once its last character has been used. For example, the first character in 'u' is 'n', and running 'n'.charCodeAt(0) in a JavaScript interpreter returns 110. XORing this with the first value in the 'a' Array, which is 10 (String.fromCharCode(10 ^ 110)), gives 'd'.
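The XOR step can be tried safely outside the browser. The sketch below reuses the key string from the routine above, but returns the decoded text instead of passing it to eval. Only the first value of 'a' (10) is given in this post, so the longer array in the demo is a made-up stand-in built from a harmless string. The function names are mine.

```javascript
// Key string copied from the deobfuscated routine shown in the post.
const KEY = "nOAkxkR7orNyD5dSZ4pst";

// XOR each number with the key character at the same position,
// wrapping back to the start of the key when it runs out.
function xorWithKey(values, key) {
  let out = "";
  for (let i = 0; i < values.length; i++) {
    out += String.fromCharCode(values[i] ^ key.charCodeAt(i % key.length));
  }
  return out;
}

// XOR is its own inverse, so the same arithmetic turns text into an array.
function toXorArray(text, key) {
  return Array.from(text, (ch, i) => ch.charCodeAt(0) ^ key.charCodeAt(i % key.length));
}

console.log(xorWithKey([10], KEY)); // prints "d", the worked example above

// Hypothetical stand-in for the page's 'a' Array (not real Gumblar data).
const demo = toXorArray("harmless demo text", KEY);
console.log(xorWithKey(demo, KEY)); // prints "harmless demo text"
```

Using i % key.length is equivalent to the original's separate counter 'k' that is reset to 0 when it reaches the key length.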

Screenshot of partial deobfuscated and formatted page:

The function 'ttt()' only gets called if an object can be created for either "ADODB.Stream" or "Scripting.FileSystemObject". When it is called, the call takes the form 'var d = b6(ttt());'. The 'b6()' function is shown in the deobfuscated screenshot above, and anyone familiar with Base64 encoding/decoding will recognize the code right away. In other words, 'ttt()' returns a Base64 string, which is then decoded by 'b6()'.

Deeper Look at 'b6()' and 'ttt()'

When looking at the 'b6()' function and its code, something jumps out: the Base64 key is scrambled. Usually the key looks like 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/=', but in 'b6()' the authors scrambled it. I ported the code in 'b6()' to Perl, ran it on the same contents that were on the page, and wrote the Base64-decoded output to a file. I have included the Perl code here (rename to gumblar.pl). It takes two command-line arguments: the location of a file that contains the Base64 contents of 'ttt()', and the key to use. Example:

  ./gumblar.pl gumblar.txt DkexZd6UAw7j2PWB3LJKm8vQtbENY/XfusSG9VITCi1FRoa=ygM4pq5hl+rz0OcnH
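For readers who would rather stay in JavaScript than Perl, here is a rough sketch of decoding Base64 with a caller-supplied key. It is not the Perl port and not the page's 'b6()' code. It assumes the common layout in which position 64 of the key is the padding character, and it assumes Node.js for Buffer. The function names and the harmless demo string are mine.

```javascript
const STANDARD = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/=";
// Key from the example command line in the post (65 characters).
const SCRAMBLED = "DkexZd6UAw7j2PWB3LJKm8vQtbENY/XfusSG9VITCi1FRoa=ygM4pq5hl+rz0OcnH";

// Decode text whose Base64 alphabet is `key`; key[64] is treated as padding.
// Characters that are not in the key are not handled in this sketch.
function decodeWithKey(text, key) {
  const bytes = [];
  for (let i = 0; i + 3 < text.length; i += 4) {
    const [e1, e2, e3, e4] = [0, 1, 2, 3].map((n) => key.indexOf(text[i + n]));
    bytes.push((e1 << 2) | (e2 >> 4));
    if (e3 !== 64) bytes.push(((e2 & 15) << 4) | (e3 >> 2));
    if (e4 !== 64) bytes.push(((e3 & 3) << 6) | e4);
  }
  return Buffer.from(bytes);
}

// Re-map a Base64 string from one alphabet to another (demo helper only).
function remap(text, from, to) {
  return Array.from(text, (ch) => to[from.indexOf(ch)]).join("");
}

// Harmless stand-in for the contents of 'ttt()'.
const standardText = Buffer.from("harmless demo").toString("base64");
const sample = remap(standardText, STANDARD, SCRAMBLED);
console.log(decodeWithKey(sample, SCRAMBLED).toString()); // prints "harmless demo"
```

Because the scrambled key is a reordering of the same 65 characters, a character such as '=' can be ordinary data in the scrambled alphabet. A stock Base64 decoder would therefore reject or misread the text.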

Output of running the *nix file command on the file:

gumblar_file: MS-DOS executable PE for MS Windows (GUI) Intel 80386 32-bit, UPX compressed

So 'ttt()' stores a complete piece of malware, Base64 encoded. Although this is not the first time we have seen malicious pages do this, it is definitely not common. By doing this the malware authors can hide their binary as text while it travels across the wire, and then have it decoded in the client's browser.
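If the file command is not at hand, a quick check for the executable header gives a similar first answer. The sketch below assumes Node.js and only checks the two standard signatures of a Windows PE file: "MZ" at offset 0, and "PE" followed by two zero bytes at the offset stored at 0x3C. It does not detect UPX packing, and the test buffer is synthetic rather than the real sample.

```javascript
// Return true when `bytes` (a Buffer) starts with a DOS "MZ" header whose
// e_lfanew field (offset 0x3C) points at a "PE\0\0" signature.
function looksLikePe(bytes) {
  if (bytes.length < 0x40 || bytes.toString("latin1", 0, 2) !== "MZ") return false;
  const peOffset = bytes.readUInt32LE(0x3c);
  if (peOffset + 4 > bytes.length) return false;
  return bytes.toString("latin1", peOffset, peOffset + 4) === "PE\u0000\u0000";
}

// Synthetic 0x80-byte stand-in with just the two signatures filled in.
const fake = Buffer.alloc(0x80);
fake.write("MZ", 0, "latin1");
fake.writeUInt32LE(0x40, 0x3c);
fake.write("PE\u0000\u0000", 0x40, "latin1");

console.log(looksLikePe(fake)); // prints true
console.log(looksLikePe(Buffer.from("plain text"))); // prints false
```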

Chart of Older Injection and Matches:

Chart of Gumblar Injection and Matches:

Chart of Gumblar vs Older Injection and Matches:

As you can see from the charts above, the number of compromised sites is very high and is growing at an alarming rate. It seems that the attackers behind this campaign planned the injection very well, so that they could maximize the number of infected sites in a short amount of time. ScanSafe has mentioned that the injections are mostly a result of FTP credential sniffing by malware, but server-side exploits must have been used as well. As we continue to monitor this attack, we will publish any interesting findings.

Security Researcher: Ali Mesdaq