MOTW: HTML/JS Obfuscation Part II
12.08.2006 - 1:25 PM
Spiders vs. Monkeys
The next step in the de-obfuscation process is to examine how we can safely analyze JavaScript that is difficult to handle using the previously discussed strategies. Our goal is to be able to run the script unchanged, which means that we cannot perform the cut-and-paste cleanup step outlined in Part I. There are several reasons why we might encounter this limitation:
- The script is too long to reformat or clean up by hand
- The script uses advanced obfuscation techniques. For example, the de-obfuscation routine may fold information about the context in which it is evaluated into the decoding algorithm. If the routine is reformatted or otherwise changed, the decoding no longer works.
- The script makes heavy use of browser-specific features, such as DOM objects, window/document state, and so forth.
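To make the second point concrete, here is a minimal sketch of a context-dependent decoder. It is an illustration only, not taken from any real sample: the names decode, encode, and cleaned are hypothetical, and the key derivation (the length of the function's own source text) is just one assumed way a script could tie decoding to its context.

```javascript
// Output helper: print() in the SpiderMonkey shell, console.log elsewhere.
var log = (typeof print === "function") ? print : function (s) { console.log(s); };

// Hypothetical self-keyed decoder: the XOR key is the length of the
// function's own source text, so editing the function changes the key.
function decode(codes) {
  var key = decode.toString().length % 256;
  var out = "";
  for (var i = 0; i < codes.length; i++) {
    out += String.fromCharCode(codes[i] ^ key);
  }
  return out;
}

// Encoder used only to build the demo payload.
function encode(text, key) {
  var codes = [];
  for (var i = 0; i < text.length; i++) {
    codes.push(text.charCodeAt(i) ^ key);
  }
  return codes;
}

var payload = encode("document.write", decode.toString().length % 256);

// Simulate an analyst's cleanup: rename the local variable "out" to "result".
// The logic is unchanged, but the source text (and so the key) is not.
var cleanedSource = decode.toString().replace(/\bout\b/g, "result");
var cleaned = eval("(" + cleanedSource + ")");

log("original decoder recovers payload: " + (decode(payload) === "document.write"));
log("cleaned decoder recovers payload:  " + (cleaned(payload) === "document.write"));
```

The untouched routine recovers the payload, while the harmlessly renamed copy does not. That is the reason for running the script exactly as delivered.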
Once again, we will use Mozilla's SpiderMonkey JavaScript engine (http://www.mozilla.org/js/spidermonkey/) as our command-line interpreter.
Hand Over the Document and Nobody Gets Hurt
In order to run a potentially malicious script unchanged, we will have to supply definitions for the objects and variables that it uses to perform its function. This means that, if it uses browser-specific function calls, we have to supply a default implementation of that function that does something useful. If we control the implementation, we can also trap attempts to use the object in a malicious or deceptive way, and these trapped attempts can supply information to the researcher about the potential risk of a given script.
In our example, we supply a default implementation of the function call document.write. This consists of two parts:
- an object state for document.
- implemented methods and properties inside the document object
In JavaScript nearly everything is an object. Functions are objects too, so an object declaration is basically a function that contains other functions (methods) and members (properties). In the simplest case we have the following, which contains one method and one property.
// declare the document class
function my_document()
{
    // a property (initialized to string)
    this.m_property="";
    // a method
    this.write=function(string)
    {
        print("my_document::write");
        print(string);
    }
};
// declare a globally-accessible document object
var document=new my_document();
This doesn't do a whole lot, yet. A couple of things are worth noting because they differ from common OO syntax. The body (block) of my_document is the constructor. Statements in the constructor are either property or method declarations, and can be in any order. JavaScript objects have no explicit destructor. Because we create the document object before the target script runs, every call the script makes to document goes through our own object.
A Series of Unfortunate Events
Let's cut to the chase. Our example document, as delivered to us straight from the web page, is a giant block of obfuscated JavaScript inside a script tag. Not exactly well-formed, but browsers will load it anyway. Here is how the document starts:
<script>ZTGKKSSG=unescape;JCCBDOYL=ZTGKKSSG((((""+""+""+"%3c%68%74%6d%6c%3e%0d
%0a%3c%68%65%61%64%3e%0d%0a%3c%73%63%72%69%70%74%20%6c%61%6e%67%75%61%67%65%3d
%22%4a%61%76%61%73%63%72%69%70%74%22%3e%0d%0a%76%61%72%20%78%20%3d%20%30%0d%0a
%76%61%72%20%73%70%65%65%64%20%3d%20%31%30%30%0d%0a%76%61%72%20%74%65%78%74%20
%3d%20%22%u6b61%u8fce%u5149%u81e8%uff3f%u96c4%u4e4b%u7db2"+""))));
document.write(JCCBDOYL);
It goes on for some length in a similar fashion. The original document is split up into random-sized chunks. Each chunk is decoded with an aliased call to unescape(), and the resulting piece is then emitted with document.write. Our goal is to intercept all the calls to document.write and get our own copy of the decoded document without (and this is the important part) changing, reformatting, or editing the script in any way. As an added bonus, the document is Big5-encoded (Chinese), so there are a few decoding issues we have to worry about if we want to view results at the terminal.
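The chunk-by-chunk pattern can be reproduced in miniature. The sketch below is an assumption-laden toy: the stub object, the chunks array, and the names alias and piece are hypothetical, and only the two escaped strings are borrowed from the start of the sample above.

```javascript
// Output helper: print() in the SpiderMonkey shell, console.log elsewhere.
var log = (typeof print === "function") ? print : function (s) { console.log(s); };

// Minimal stand-in for the browser's document object (hypothetical stub).
var chunks = [];
var document = {
  write: function (s) { chunks.push(s); }
};

// Two chunks built the same way as the sample: alias unescape(),
// decode one piece, hand it to document.write, repeat.
var alias = unescape;
var piece = alias("" + "" + "%3c%68%74%6d%6c%3e%0d%0a");
document.write(piece);
piece = alias("" + "%3c%68%65%61%64%3e");
document.write(piece);

log("calls to document.write: " + chunks.length);
log(chunks.join(""));
```

Joining the captured chunks yields the opening html and head tags of the decoded page, without the script itself having been edited.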
The my_document object therefore needs the following things:
- it must be able to track stats for the write method: how many times it was called, and how many bytes it wrote.
- instead of printing the results right away, our write method should accumulate the results to a string. The string is printed out to the terminal from another method after the script runs.
- when decoding the result, be careful to escape only the Unicode/high characters (character codes above 0x7f), so we can view it in the terminal without screwing up the screen.
Given these requirements, here is the document object definition we end up with:
// declare our own document object
function my_document()
{
    print("debug: my_document constructed");
    // accumulated output of every write call
    this.m_data="";
    // counters to keep track of our write implementation
    this.m_write_called=0;
    this.m_write_bytes=0;
    this.write=function (string)
    {
        print("debug: called my_document::write ("
            +string.length+" bytes)");
        // only escape high chars (>7bit). all this does really is ensure that
        // the string remains readable at the terminal if it contains unicode
        // or raw octets.
        for (var i=0;i<string.length;i++)
        {
            // charAt() instead of string[i]: works in every engine
            if (string.charCodeAt(i)>0x7f)
            {
                this.m_data+=escape(string.charAt(i));
            }
            else
            {
                this.m_data+=string.charAt(i);
            }
        }
        // string.length counts characters; used here as the byte count
        this.m_write_bytes+=string.length;
        this.m_write_called++;
    }
    // list every property and method currently on the object
    this.myd_dumpstate=function()
    {
        var state="";
        for (var p in this)
        {
            // direct property lookup; no eval() needed
            var p_e=this[p];
            state+=" "+p+" [Type:"+typeof(p_e)+"]\n";
            if (typeof(p_e)=="string")
            {
                state+=" -Value: \""+unescape(p_e)+"\"\n";
            }
        }
        print("debug: my_document::myd_dumpstate:");
        print(state);
    }
    // print the stats and the accumulated (decoded) document
    this.myd_data=function ()
    {
        print("***");
        print("Decoded document");
        print("Number of calls to document.write: "+this.m_write_called);
        print("Total bytes written to document: "+this.m_write_bytes);
        print("***");
        print(this.m_data);
    }
    // discard the accumulated document
    this.myd_reset=function ()
    {
        this.m_data="";
    }
    this.myd_dumpstate();
};
var document=new my_document();
This is saved to the file mystubs.js.
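The high-character escaping inside write can also be tried in isolation. This is a small sketch under the assumption that the same loop is pulled out into a standalone helper; the name escapeHigh and the sample string are made up for the illustration.

```javascript
// Output helper: print() in the SpiderMonkey shell, console.log elsewhere.
var log = (typeof print === "function") ? print : function (s) { console.log(s); };

// Escape only characters above 0x7f; leave 7-bit characters untouched.
function escapeHigh(s) {
  var out = "";
  for (var i = 0; i < s.length; i++) {
    var ch = s.charAt(i);
    out += (s.charCodeAt(i) > 0x7f) ? escape(ch) : ch;
  }
  return out;
}

// Two Chinese characters surrounded by plain 7-bit text and markup.
var sample = "text = \"\u6b61\u8fce\" <b>";
var safe = escapeHigh(sample);

log(safe);                              // terminal-safe form
log(unescape(safe) === sample);         // the original text is recoverable
```

The markup and quotes stay readable while the two high characters come out as %u6B61 and %u8FCE, which is the same form seen in the decoded output further down. Note that the round trip through unescape is only exact when the original text contains no literal percent sequences of its own.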
After Dinner Cocktails
Two parts of the solution are in place: the document object to intercept the data from the script, and the script itself. Now, how do we access the results?
Make a new file called mypost.js. This is loaded after the first two parts into the command-line interpreter.
// post processing, run this file after the main js file is interpreted.
// print out results
document.myd_data();
So, here is the final order of loading and execution:
- mystubs.js. Contains the definitions for the objects and methods we want to intercept.
- original script. This is the bad stuff you want to evaluate.
- mypost.js. Has post-processing stuff, such as printing the result or the state of objects of interest.
Note that the function my_document::myd_dumpstate will tell us exactly what is inside the object when it is called. This is useful because the script can potentially set properties of a given object or attempt to override sensible defaults. This can be used for risk assessment (example: the script tried to redirect or fool the user by loading into location.href directly, or through some other obfuscated means).
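The same idea extends to other browser objects. As a rough sketch of trapping a redirect attempt, the stub below records every value assigned to location.href. Everything here is an assumption for illustration: the location stub, the redirects array, and the placeholder URLs are not part of the stubs file above, and Object.defineProperty requires a reasonably modern engine (older SpiderMonkey builds would need the __defineSetter__ extension instead).

```javascript
// Output helper: print() in the SpiderMonkey shell, console.log elsewhere.
var log = (typeof print === "function") ? print : function (s) { console.log(s); };

// Hypothetical location stub: records every value assigned to href.
var redirects = [];
var location = {};
Object.defineProperty(location, "href", {
  get: function () { return "about:blank"; },
  set: function (url) { redirects.push(String(url)); },
  enumerable: true
});

// What a hostile script might do (placeholder URLs, not from the sample).
location.href = "http://example.invalid/landing";
location["hr" + "ef"] = "http://example.invalid/second";

log("redirect attempts: " + redirects.length);
for (var i = 0; i < redirects.length; i++) {
  log("  -> " + redirects[i]);
}
```

Both the plain and the string-concatenated assignment land in the setter. A script that assigns to the location variable itself, rather than to its href property, would replace the stub instead of triggering it, so a fuller version would have to account for that case as well.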
Nukeular Strategery
Time to press the big red button. The command-line invocation to bring it all together, as outlined above:
$ js -f mystubs.js -f badscript.js -f mypost.js
After destruction of the surrounding countryside, here's what we end up with (content changed to protect the perhaps not-so-innocent):
debug: my_document constructed
debug: my_document::myd_dumpstate:
m_data [Type:string]
-Value: ""
m_write_called [Type:number]
m_write_bytes [Type:number]
write [Type:function]
myd_dumpstate [Type:function]
myd_data [Type:function]
myd_reset [Type:function]
debug: called my_document::write (96 bytes)
debug: called my_document::write (86 bytes)
debug: called my_document::write (20 bytes)
debug: called my_document::write (67 bytes)
debug: called my_document::write (86 bytes)
debug: called my_document::write (65 bytes)
debug: called my_document::write (13 bytes)
debug: called my_document::write (100 bytes)
debug: called my_document::write (21 bytes)
debug: called my_document::write (2 bytes)
debug: called my_document::write (9 bytes)
debug: called my_document::write (1 bytes)
debug: called my_document::write (11 bytes)
debug: called my_document::write (88 bytes)
debug: called my_document::write (85 bytes)
debug: called my_document::write (92 bytes)
debug: called my_document::write (94 bytes)
debug: called my_document::write (35 bytes)
debug: called my_document::write (26 bytes)
debug: called my_document::write (23 bytes)
debug: called my_document::write (44 bytes)
debug: called my_document::write (15 bytes)
debug: called my_document::write (5 bytes)
debug: called my_document::write (50 bytes)
debug: called my_document::write (35 bytes)
debug: called my_document::write (1 bytes)
debug: called my_document::write (62 bytes)
debug: called my_document::write (33 bytes)
***
Decoded document
Number of calls to document.write: 28
Total bytes written to document: 1265
***
<html>
<head>
<script language="Javascript">
var x = 0
var speed = 100
var text = "%u6B61%u8FCE%u5149%u81E8%uFF3F%u96C4%u4E4B%u7DB2%u9801
%uFF3F%u5225%u554F%u6211%u662F%u8AB0%uFF3Fhttp://-----.net%uFF3F%u6B61
%u8FCE%u5149%u81E8%uFF3F%u96C4%u4E4B%u7DB2%u9801%uFF3F%u5225%u554F
%u6211%u662F%u8AB0%uFF3Fhttp://-----.net"
var course = 10
var text2 = text
function Scrollforward() {
window.status = text2.substring(0, text2.length)
if (course < text2.length) {
setTimeout("Scrollback()", speed)
}else {text2 = " " + text2
setTimeout("Scrollforward()", speed);
}}function Scrollback() {
window.status = text2.substring(x, text2.length)
if (text2.length - x == text.length) {text2 = text
x = 0
setTimeout("Scrollforward()", speed);}else {x++
setTimeout("Scrollback()", speed);}}Scrollforward()
</script>
<script LANGUAGE=javascript>
<!--
if (top.location != self.location)self.location=window.open('http://-----.net/');
// -->
</script>
<meta name="Keywords" content="%u96C4%u4E4B%u7DB2%u9801%uFF3F%u5225%u554F
%u6211%u662F%u8AB0%uFF3F%u672C%u7AD9%u91DD%u5C0D%u7559%u8A00%u677F%u800C%u8A2D
%uFF3F%u63D0%u4F9B%uFF1A%u7559%u8A00%u677F%u8A9E%u6CD5%u3001%u6559%u5B78%u53CA
%u7D20%u6750%u3001%u5C11%u91CF%u7533%u8ACB%u3001%u5C11%u91CF%u4E0B%u8F09%u3002
%u7B49%u7B49%uFF0E%uFF0E%uFF0E">
<META HTTP-EQUIV="content-type" CONTENT="text/html; charset=big5">
<link rel="SHORTCUT ICON" href="-----.ico">
<link rel="Bookmark" href="-----.ico">
<title>%u3000%u3000%u96C4%u4E4B%u7DB2%u9801%u3000%u3000%u5225%u554F%u6211
%u662F%u8AB0%u3000%u3000%u672C%u7AD9%u53EA%u91DD%u5C0D%u7559%u8A00%u677F%u800C
%u8A2D%u3000%u3000%u3000%u3000</title>
</head>
<FRAMESET cols="*,88" frameborder="no" border="0">
<frame name="pagetwo" src="ABC.html">
<frame name="pageone" src="CBA.html" SCROLLING="no" noresize>
</frameset>
</html>
Conclusions and Gifts of Money
The methodology as outlined here is very basic, but a more complete version of it could be used for a degree of automated security analysis of JavaScript code. By hooking directly into the object model, as opposed to running trivial substring searches, we can gain far deeper insight into the behavior of a given piece of JavaScript, however heavily it is obfuscated. Much more advanced strategies, such as hooking into the JavaScript interpreter directly, could be used for even greater effect. This is a rich area for further research.
And finally, because this approach lets us evaluate the obfuscated code without any reversing or cut-and-paste cleanup, it drastically reduces the time to results. When you are a security researcher, time is definitely not on your side, so let's score one pint/point for the good guys.
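A toy comparison shows the difference between the two approaches. The one-line sample, the written array, and the stub are all hypothetical, and eval stands in here for loading the sample into the interpreter.

```javascript
// Output helper: print() in the SpiderMonkey shell, console.log elsewhere.
var log = (typeof print === "function") ? print : function (s) { console.log(s); };

// Hypothetical one-line sample that hides the method name from text searches.
var source = 'var d = document; d["wr" + "ite"](unescape("%3Cb%3Ehi%3C/b%3E"));';

// Approach 1: substring search over the source text.
var foundBySearch = source.indexOf("document.write") !== -1;

// Approach 2: run the sample against a stub and record what it writes.
var written = [];
var document = { write: function (s) { written.push(s); } };
eval(source);

log("substring search hit: " + foundBySearch);
log("captured by stub: " + written.join(""));
```

The text search misses the call because the string "document.write" never appears in the source, while the stub still receives the decoded markup.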
Researcher: NJ Verenini