Saturday, May 18, 2013

De-obfuscating some JavaScript malware

Is Antivirus (AV) snake oil? Well, I won't go deep into this right now, but I need to provide some quick background information. For about ten years I used Avira Free. It worked quite well and blocked some threats on family members' computers, but never anything on mine. About a year ago I read somewhere "If your AV didn't block anything in the last year, you probably don't need it." That convinced me, because in the roughly ten years I had it installed, it had only ever flagged false positives on my machine. I'm actually not afraid of getting malware through social engineering or by opening something bad, but I am afraid of getting malware through some 0-day or a targeted attack. So about a year ago, I didn't install any AV when reinstalling my computer. I did install EMET though, and Windows has Windows Defender built in. That's sufficient for me. I'm not talking about normal consumers here.

About three weeks ago, when looking for a hotel, I was surprised to see Windows Defender take action just because I opened the start page of the hotel:

Windows Defender in action
We all know the drive-by exploits and stuff, but with me running Internet Explorer 10, with Enhanced Protected Mode enabled (it's off by default) and the system fully patched, why would there be a dangerous threat that Defender should block? That's interesting and I started to dig into this. So where do we start? View the HTML page source of course.

First disappointment. What I saw was this:

View Source - lots of includes
So it's full of includes. Should I look through each of them and examine everything? Well, scrolling a little down, I quickly saw something that caught my attention:

suspicious code
IE's source code viewer made this into a nice multi-line view, but actually this is just one long line of code.

Ok, so we have some obfuscated JavaScript here. While there are automated de-obfuscators on the Internet readily available (like this one), let's do this manually. It's more fun anyway. I'm not afraid of any JavaScript running in my well sandboxed browser, even if it's IE, so why should this be something to block? I want to know the details. So let's start.

In order to do something with this, we have to disable Windows Defender. Even if we save this in a Notepad text file, Defender kicks in and deletes the file. In Windows 8, Defender is no longer in Control Panel Category View, it's hidden. You have to switch to the old Icon View in order to see it. Ok, let's disable Defender and let's start.

The first obvious thing here is that there's a lot of white space after the initial script tag. That's probably there to hide the code, because in viewers that show this source on one line, the rest is invisible, scrolled off to the right. Ok, let's remove these spaces. Then there is a big string with some strange codes in it. To understand what the script does, I just removed the content of the string so I could see the rest of the code. So it looks like this:

obfuscated code
So finally we have something to de-obfuscate. First of all, let's add some line breaks and indentation. Then it looks like this:

nicely formatted obfuscated code
So let's remove the obfuscation. We'll do the following:
  • zz=3; ... if(zz)... → remove this, because it's always true
  • dbshre=53; if(dbshre)... → remove this, because it's always true
  • ss=(123)?String.fromCharCode:0; → replace this with ss=String.fromCharCode;
  • asgq variable → rename it to code_1
  • p=parseInt; ... p(...) → use directly parseInt(...)
  • ss=String.fromCharCode; ... ss(...) → use directly String.fromCharCode(...)
  • gdsgsdg variable → rename it to exc1, as it is an exception
  • agdsg variable → rename it to exc2, as it is an exception
  • vfvwe variable → rename it to flag, as it is set to 0 or 1
After all this, it already looks a lot cleaner:
after first step of de-obfuscation
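To see why those guards and aliases can be dropped safely, here is a minimal sketch. The names zz, dbshre, ss and p come from the list above; the surrounding statements are a simplified assumption and not the exact line from the infected page.

```javascript
// Simplified reconstruction (assumption): the real line is denser, but the
// patterns are the ones listed above.
var zz = 3;                               // non-zero number, so if (zz) always runs
var dbshre = 53;                          // same trick under another name
var ss = (123) ? String.fromCharCode : 0; // 123 is truthy, so ss is String.fromCharCode
var p = parseInt;                         // plain alias for parseInt

var guardsAlwaysTrue = Boolean(zz) && Boolean(dbshre);
var sameFunction = ss === String.fromCharCode;
var letter = ss(p("41", 16));             // hex 41 is decimal 65, the letter "A"
```

Since the conditions never change and the aliases point at the built-in functions, replacing them with the plain calls does not alter what the script does.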
But there's still some cleanup left. Let's continue:
  • try{document.body} → the document.body is an HTMLBodyElement object. Applying the bitwise AND operator with a number causes an invalid argument exception, which gets caught by the catch(exc1){...} part. So leave out the entire first part. I assume this was written to confuse automated de-obfuscating tools.
  • if(window.document) → remove this, because it's always true (twice in the code)
  • try{document;}catch(exc2){flag=1;} → This is not doing anything. Not even throwing an exception. Remove the entire code part.
  • flag=0; ... if(!flag)... → remove this, because it is always true
  • e=eval; ... e(s); → use directly eval(s);
Ok, so we have the code de-obfuscated now:

de-obfuscated JavaScript code
Actually we could leave out the { } for the for-loop too. Anyway, looking at this, it seems clear now what this code does. First, it replaces the at-character in the string with the digit 9 and then splits the string at the exclamation marks. The hex values between the exclamation marks are converted to numbers with parseInt and then to characters, so these are all ASCII codes. Finally they get concatenated into a string, which is then executed by the final eval command. So the question is, what is in this string that gets executed?
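The decoding loop described above can be sketched as follows. The function name decodeStage and the sample payload are made up for illustration; the real string is far longer, and the original loop bounds are not reproduced here. Unlike the malware, the sketch returns the decoded text rather than passing it to eval.

```javascript
// Hypothetical helper: decodes a "!"-separated list of hex codes in which
// every digit 9 was swapped for "@". It returns the text and never evals it.
function decodeStage(encoded) {
  var parts = encoded.replace(/@/g, "9").split("!");
  var s = "";
  for (var i = 0; i < parts.length; i++) {
    if (parts[i] === "") continue;   // skip empty pieces from a leading or trailing "!"
    s += String.fromCharCode(parseInt(parts[i], 16));
  }
  return s;
}

// Made-up sample: 69 is "i", 66 is "f", 28 is "(", 29 is ")", a is a line feed.
var sample = "6@!66!28!2@!a";
var decoded = decodeStage(sample);
```

Returning the string keeps the second stage inert, so it can be written to a file or logged for inspection.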

We could simply replace the final eval(s); with an alert(s);, but the output wouldn't be nicely formatted or easy to copy for further examination. So let's take the string and do it manually.

I opened Notepad and replaced
  • @ → 9
  • !a! → !0a!
  • !d! → !0d!
  • ! → space character
So we get a list of hex codes:

hex data for second stage code
So to continue, we could convert each character manually with an ASCII table or write a program to do it. I used one of the online converters for that. After that conversion to text we get this:
second stage code
So this second stage code is not obfuscated, but badly formatted. So after formatting and renaming the main function from zzzfff to mainfunc we get this:

second stage code, formatted

So finally we can start analyzing it. So we have three functions and a small code block. This code block runs right away and executes GetCookie. From the nice name they left in there we can assume it reads a cookie (which it does if you look at the code of GetCookie) and if the cookie is found, nothing else is done. If the cookie is not there, then the other function with the nice name SetCookie will be called. This simply stores a cookie. We don't need to go into the details of these two functions, only that SetCookie sets an expiration time of today.getTime()+1day (the parameter to setTime is in milliseconds since a fixed date). This is to execute the mainfunc() only once per day. After SetCookie, we get to the main function, the core of this "malware".
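As a rough illustration of the once-per-day gate, the sketch below separates the cookie logic from the browser so it can be read on its own. The helper names, the cookie name visited and the exact cookie attributes are assumptions; only the GetCookie/SetCookie behaviour and the one-day expiry come from the analysed code.

```javascript
var ONE_DAY_MS = 24 * 60 * 60 * 1000;

// Hypothetical stand-in for SetCookie: builds the string that would be
// assigned to document.cookie, expiring one day after `now`.
function buildCookie(name, value, now) {
  var expires = new Date(now.getTime() + ONE_DAY_MS);
  return name + "=" + encodeURIComponent(value) +
         "; expires=" + expires.toUTCString() + "; path=/";
}

// Hypothetical stand-in for GetCookie: looks a name up in a
// document.cookie-style string and returns null when it is absent.
function readCookie(cookieString, name) {
  var pairs = cookieString.split(";");
  for (var i = 0; i < pairs.length; i++) {
    var pair = pairs[i].trim();
    if (pair.indexOf(name + "=") === 0) {
      return decodeURIComponent(pair.substring(name.length + 1));
    }
  }
  return null;
}

// The gate: act only when the marker cookie is missing.
function shouldRun(cookieString, name) {
  return readCookie(cookieString, name) === null;
}

var cookie = buildCookie("visited", "1", new Date(Date.UTC(2013, 4, 18)));
```

On a first visit shouldRun is true and the marker cookie gets stored; on later visits within the same day the cookie is found and nothing else happens.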

So what does mainfunc() do? First it creates an object jn, which is an iframe with 1 pixel size and without a border and pointing to some external URL. Then the code checks if the page already has an HTML element with the id 'jn'. If not, it writes (document.write) at the current position a div tag with the id 'jn' and adds the iframe object into it.

This means that all this malware does is to inject a div tag with an iframe object that loads in the iframe the content of another site, presumably with malicious code in there exploiting some unpatched browser or plugin vulnerability. So per se this code is nothing malicious at all. It would be the same as having an iframe tag on the page itself. It was just hidden in some obfuscated code.
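In other words, the net effect is roughly the same as if the markup produced below were part of the page. This is a sketch under assumptions: the helper name is invented, the attribute set is a guess based on the description (1 pixel, no border, a wrapper with the id jn), and the URL is a placeholder on the reserved .invalid domain.

```javascript
// Hypothetical helper: returns static markup equivalent to what the
// injected script effectively adds to the page.
function equivalentMarkup(url) {
  return '<div id="jn">' +
         '<iframe src="' + url + '" width="1" height="1" frameborder="0"></iframe>' +
         '</div>';
}

var markup = equivalentMarkup("http://example.invalid/placeholder.php");
```

Anything dangerous would have to come from whatever the iframe's source URL serves, not from this markup.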

The mentioned URL doesn't work anymore; it results in a 404 (not found). At the time of testing it still worked, but I couldn't get it to serve me anything. It just returned an "ok" text. I thought it might do some browser fingerprinting by checking the User-Agent, so I tried various strings there, even those of old browsers, but I never received anything. So maybe it serves the malware only to certain IP ranges (country specific). Anyway, the PHP script serving the malware and anything it returns would be the subject of a further blog post. This write-up was just about the injection JavaScript.

Some general thoughts: I thought the variable names in the obfuscated code were there to distract AV detectors and that they were always different. It seems that this is not the case and they are always the same in this variant of the "malware". Only the URL in the contained second stage code varies, along with the length values (in the for loop) and some of the values assigned to the useless variables. The second stage code also seems to be neither minified nor obfuscated in any way. It even contains useful function and variable names. Formatting could've been better, but for unknown reasons the programmers didn't care about shortening the code. To me all this looks like they were not very professional, more like some script kiddies or someone using tools.

Additionally the same injected code seems to appear on websites all over the world. The funny thing is that obviously this was an automated attack on these sites, probably exploiting some common bug, as for some, the injected code doesn't even work and the JavaScript is at a place where it gets displayed instead of executed, like here:
injection at the wrong place
If you look at the source here:
source code of wrong place injection
You can see here that our JavaScript was added in the middle of the meta tag. That doesn't work of course. Why would some automated tool put it there? For me this looks more like someone being directed to do that and being told "put it right after the line with head and meta tags", not even understanding HTML. We can also see that in this example WordPress 3.5.1 was used.

Searching Google for some code parts of the initial obfuscated code results in "about 170,000 results", including a few discussions about this code. On only a few of these pages I got this nice warning from Google:

Google warning
This specific page had our JavaScript code 13 times on the same page at different positions. Plus at least one other one, so I'm not 100% sure which one Google refers to. Anyway, nice to get a warning - and there's no link to the page; you cannot simply click "continue anyway."

One interesting aspect of this is that Google or your AV might block this initial iframe injection, but the underlying iframe source is already down. This also counts in their statistics of "having successfully blocked malware from innocent users", which is wrong of course. Blocking an iframe prevents nothing by itself, but it's a good mitigation.

I looked through the first three pages of Google results (first 30 results). Some of these have slightly different code, probably some variations, and in many cases the GetCookie/SetCookie part is missing (so the iframe is always served). The one we examined is probably newer. Looking at the second stage code on these pages results in 18 unique links for the malicious iframe. These are the links I found there, including the ones mentioned above (http replaced with hxxp to avoid hotlinking). First the URLs that are down or not reachable:
  • hxxp:// - 404 down
  • hxxp:// - 404 down
  • hxxp:// - 404 down
  • hxxp:// - 404 down
  • hxxp://clutte[p..z]... (URL was cut off there)
  • hxxp:// - 404 down
  • hxxp:// - 404 down
  • hxxp:// - 404 down
  • hxxp:// - 403 forbidden
  • hxxp:// - 404 down
  • hxxp:// - 404 down
  • hxxp:// - DNS error
  • hxxp:// - 404 down
  • hxxp:// - 404 down
  • hxxp:// - 404 down
And these four are still working at the time of this writing:
  • hxxp:// - up, served by IIS6
  • hxxp:// - 301 redirect to www site, still up, served by Apache
  • hxxp:// - up, served by Apache/2.2
  • hxxp:// - up, served by Apache/2.2
So these sites are probably all hacked and serve this malicious iframe. If you don't know what to do with your time, you could write some script to query Google for some indicators of this initial JavaScript code, get the results into some database, query all these pages, extract this JavaScript, automatically de-obfuscate it to get the URL from the second stage code into a second database table, then query all these URLs and list those that are still up and running. As an AV vendor I would do that.
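One step of such a pipeline, pulling the iframe URL out of already decoded second stage code and defanging it, might look like this. The function names and the sample string are hypothetical, and the regular expression is a heuristic that assumes the URL appears as a quoted http or https literal.

```javascript
// Hypothetical helper: collects quoted http(s) URLs from decoded script text.
function extractUrls(text) {
  var re = /["'](https?:\/\/[^"'\s]+)["']/g;
  var urls = [];
  var m;
  while ((m = re.exec(text)) !== null) {
    if (urls.indexOf(m[1]) === -1) urls.push(m[1]);   // keep unique entries only
  }
  return urls;
}

// Replaces the scheme so the URL is no longer clickable (http becomes hxxp).
function defang(url) {
  return url.replace(/^http/i, "hxxp");
}

// Made-up sample with the same placeholder URL appearing twice.
var sampleStage = 'jn.src = "http://example.invalid/dir/file.php"; x = "http://example.invalid/dir/file.php";';
var found = extractUrls(sampleStage).map(defang);
```

Storing only the defanged form in the second table would make it harder to open one of these URLs by accident.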

For the four that are still running, the result looks like this when querying:

HTTP stream connecting to the malicious URLs
Here I'm connecting with IE10-64 directly, but I tried other User-Agents as well. I always get one of the two following results (depending on URL):
  • 2_ok_0 (with "_" standing for a line break)
  • ok
Having the PHP source code of that page would help a lot of course, but I won't hack into those servers just for this reason. I might get associated with someone exploiting their server if I tried that.

When googling for this, I found some Intrusion Detection System (IDS) log file saying that it "detected malicious iframe injection" and also "detected BlackHole v2.0 exploit kit URL pattern", so they might be related.

Although we didn't get any deeper yet, I still hope you liked this write up.
