Tuesday, March 29, 2011

Razorback - What's going on?

It's been almost 3 weeks since I joined the VRT and started working on Razorback. Over that time we have made some good progress with the project, and I wanted to share what we have done and what we are going to be working on over the next few weeks.
What we have completed so far:
  1. Subversion repository restructure:
    We have restructured the Subversion repository in a way that gives us the following:
    • The ability to build components separately with minimal cross-project dependencies.
    • The ability to release individual components of the system in separate tarballs; this is geared towards binary package maintainers.
    • The ability to release a jumbo tarball with all of the Razorback components in it for rapid deployment.
    More information on the new repository structure can be found in the Developers Guide here: https://sourceforge.net/apps/trac/razorbacktm/wiki/Developers/Repository Layout
  2. Integration of all nuggets from the nugget farm project into the main project:
    All of the nuggets that were in the nuggetfarm project on SourceForge have been pulled into the main project. The aim of this is to make it simpler to maintain the official nuggets. These nuggets are now available in the full release tarball or as individual components.
  3. API Project Improvements:
    • The API has been split out of the dispatcher project to make it easier to maintain.
    • API library symbol visibility - lots of the other components (dispatcher and nuggets) required an un-installed build of the API to be available to them so that they could statically link in a sub library that was not installed: the utils library. This should allow people to build components much more easily if they have installed the system from packages or from the per-component release tarballs.
  4. New/Improved configuration API.
    • We have replaced the hand-rolled parser with libconfig (http://www.hyperrealm.com/libconfig/), which has drastically reduced the time that it takes to add configuration items to components.
    • We have also added routines that let components use the configuration API to load configuration files whose structure they define themselves, simply and in a standard fashion. This has allowed us to remove all hard-coded configuration items from nuggets and put them into configuration files.
    • The configuration API now looks for configuration files in the configured sysconfdir by default; the API calls allow you to pass in custom search locations if required. This means that you no longer have to run every command with --conf=... which may be a relief to many of you.
    You can read up on the new configuration API here: http://razorbacktm.sourceforge.net/docs/api/trunk/
  5. Doxygen API Documentation:
    We have started using doxygen to generate up-to-date API documentation and publish it to the project website. Documentation is generated and published every 4 hours for supported branches. Not all files have been fully documented yet, but you can see what has been documented so far here: http://razorbacktm.sourceforge.net/docs/api/trunk/
  6. Continuous integration testing.
    As of the 0.1.5 release we have defined the officially supported platforms to run Razorback on and the architectures that we support for those platforms. These are currently the following base OSes running on either i386 or amd64/x86_64 hardware:
    • Debian 6.0
    • FreeBSD 8.1
    • RedHat Enterprise Linux 6.0
    • Ubuntu 10.04 LTS
    In order to help maintain compatibility across these platforms and to reduce the amount of time developers spend testing on them, we have deployed BuildBot. BuildBot is a continuous integration system that runs a sequence of actions when an event triggers them. Currently we have it set up to build every component on every platform after 15 minutes of idle time in the repository following a commit. In addition, the system will trigger builds of the API if something that depends on it changes, or of everything that depends on the API if a change is made to it. You can read more about BuildBot here: http://trac.buildbot.net/
  7. System Manual and Developers Guide
    We have started writing better user and developer documentation for the system, with the aim of allowing more people to set up and use the system. This information is available on the project wiki:
    https://sourceforge.net/apps/trac/razorbacktm/wiki
  8. Nugget cleanup:
    We have cleaned up and packaged all of the nuggets so that they are easy to install and simple to configure. Where applicable we have integrated third-party libraries and components into the nuggets to make them faster to install.
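As a rough illustration of the default-location lookup described in item 4 above, here is a small Python sketch. It is not the Razorback C API: the function name find_config, its parameters, the file name nugget.conf, and the rule that custom locations replace the default sysconfdir are all assumptions made for the example.

```python
import os
import tempfile


def find_config(filename, sysconfdir, custom_dirs=()):
    """Return the path of the first matching configuration file, or None.

    Hypothetical helper: with no custom locations, only sysconfdir is searched.
    """
    search_dirs = list(custom_dirs) if custom_dirs else [sysconfdir]
    for directory in search_dirs:
        candidate = os.path.join(directory, filename)
        if os.path.isfile(candidate):
            return candidate
    return None


# Small demonstration using throwaway directories in place of a real sysconfdir.
with tempfile.TemporaryDirectory() as sysconfdir, tempfile.TemporaryDirectory() as custom:
    for directory in (sysconfdir, custom):
        with open(os.path.join(directory, "nugget.conf"), "w") as handle:
            handle.write("# placeholder\n")

    default_hit = find_config("nugget.conf", sysconfdir)
    custom_hit = find_config("nugget.conf", sysconfdir, [custom])
    missing = find_config("absent.conf", sysconfdir)

    default_ok = default_hit == os.path.join(sysconfdir, "nugget.conf")
    custom_ok = custom_hit == os.path.join(custom, "nugget.conf")
```

The point of the sketch is only that a caller who passes nothing still gets a sensible default, which is what removes the need for --conf=... on every command.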
What's coming next? Here is a short list of the most exciting features being worked on (in no particular order):
  • Complete redesign of the dispatcher.
  • IPv6 support for inter-component communication.
  • Encryption support for data passing between components.
  • API improvements for non-real-time processing.
  • Database improvements.
  • Data block storage and transfer improvements.

Thursday, March 3, 2011

Attack Obfuscation - Not Just For JavaScript

Since his company purchased a Sourcefire IPS setup last summer, I've had a close working relationship with Mickey Lasky, the primary network security analyst at a company (which shall intentionally remain unnamed) that runs a number of public-facing web sites. He sends me PCAPs whenever he runs across something especially weird, and I help him with custom rules in return. Mickey also runs experimental rules for me from time to time, which is quite useful since the network he's protecting is especially busy, and if there's going to be a false positive, it'll show up there.

A couple of weeks ago, he sent me a particularly interesting set of PCAPs, saying that he'd collected them after discovering that a single, determined intruder was busy dropping malware on the web servers he's watching over by uploading PHP code to them via POST requests. By itself, that's not all that exciting; what I found interesting was the way the attacker had obfuscated the requests. In addition to lots of Base64-encoded data, there were large chunks of code that looked like this:
$wWfdGw['_HG3uWD_']=Array('ob'    .  '_en'.'d_flus'.  'h');      $kITFJjggfl=Array();
function    HG3uWD($ownentes83)
{
global  $kITFJjggfl;    $rdupmKoww  =    'c'."hr";
$aaSbVPTgxM   =  $rdupmKoww(98) .   $rdupmKoww(97) .'se'  . 
$rdupmKoww(54)."4_decode";$postimagistes    =  $rdupmKoww($aaSbVPTgxM('MTA=')).   $rdupmKoww(13)
.' '   .   $rdupmKoww($aaSbVPTgxM('MzM='))   .    $rdupmKoww(35)  .    '%'.    $rdupmKoww(38)
.$rdupmKoww($aaSbVPTgxM('NDA=')) .   ')'  .
...
Since the variable names changed from one POST to another - as did the way the code sliced up underlying strings like "chr" or, in other places, "base64_decode" - the question became, is there any generic characteristic across all of these attacks that could be used to write a rule, which would simultaneously not generate massive false positives on normal traffic?
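To make the slicing concrete, the fragment above can be unwound by hand. The following Python sketch mirrors only the visible part of the sample: php_chr and b64_int are helper names invented here to stand in for PHP's chr() and for base64_decode() followed by integer conversion, and nothing beyond the "..." is reconstructed.

```python
import base64


def php_chr(code):
    """Stand-in for PHP's chr(): one character from a byte value."""
    return chr(code % 256)


def b64_int(token):
    """Decode a Base64 token such as 'MTA=' and read the result as an integer."""
    return int(base64.b64decode(token).decode("ascii"))


# $aaSbVPTgxM: the function name rebuilt from chr() calls and string pieces.
decoder_name = php_chr(98) + php_chr(97) + "se" + php_chr(54) + "4_decode"

# $postimagistes: the visible start of the string assembled in the sample.
fragment = (
    php_chr(b64_int("MTA="))    # 'MTA=' decodes to "10", a line feed
    + php_chr(13)               # carriage return
    + " "
    + php_chr(b64_int("MzM="))  # 'MzM=' decodes to "33", which is '!'
    + php_chr(35)               # '#'
    + "%"
    + php_chr(38)               # '&'
    + php_chr(b64_int("NDA="))  # 'NDA=' decodes to "40", which is '('
    + ")"
)

print(decoder_name)    # base64_decode
print(repr(fragment))  # '\n\r !#%&()'
```

So the attacker never writes "base64_decode" or even the punctuation characters literally, which is why matching on those strings directly would not work.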

What immediately sprang to mind was the odd spacing surrounding the concatenation operators, or "."s. In normal PHP code, string concatenation generally looks like:
$longvar = $var1 . $var2;
...or:
$longvar = $var1.$var2;
There's no rational reason for a human to surround the "." with more than one space on either side, and certainly not a random number ranging up to five spaces on either side. Automated code generators wouldn't do spacing like that either. That led to an easy rule:
alert tcp $EXTERNAL_NET any -> $HOME_NET $HTTP_PORTS (msg:"WEB-PHP generic PHP code obfuscation attempt"; flow:established,to_server;
content:"|20 20 20 20 2E|"; content:"|20 20 20 20 2E|"; distance:0; classtype:trojan-activity;)
The problem with this rule, we quickly found, was that since some of the web sites being monitored allowed code uploads, CSS files ended up heading towards port 80 on the network being monitored. When those files used spaces instead of tabs for declarations, a la:
.calendar-date-switcher {
they matched the initial signature and caused a bunch of false positives, rendering the rule useless for blocking mode.
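The behaviour of that first rule can be approximated outside of Snort with a few lines of Python. This is a simplified sketch, not how Snort evaluates rules: it ignores flow state and ports and looks only at the two content matches. The payloads are made up for illustration, and the .calendar-title selector is hypothetical.

```python
# content:"|20 20 20 20 2E|" is four spaces followed by a period.
FOUR_SPACES_DOT = b"\x20\x20\x20\x20\x2e"


def first_rule_matches(payload):
    """Two occurrences of the pattern, the second starting after the first ends."""
    first = payload.find(FOUR_SPACES_DOT)
    if first == -1:
        return False
    # distance:0 means the second search begins right after the first match.
    return payload.find(FOUR_SPACES_DOT, first + len(FOUR_SPACES_DOT)) != -1


obfuscated_php = b"$a=Array('ob'    .  '_en'.'d_flus'.  'h');  $b = $c(98)    .   $c(97);"
normal_php = b"$longvar = $var1 . $var2;"
css_upload = (
    b"@media screen {\n"
    b"    .calendar-date-switcher {\n"
    b"        display: none;\n"
    b"    }\n"
    b"    .calendar-title {\n"
    b"        font-weight: bold;\n"
    b"    }\n"
    b"}\n"
)

print(first_rule_matches(obfuscated_php))  # True
print(first_rule_matches(normal_php))      # False
print(first_rule_matches(css_upload))      # True, the false positive
```

Any stylesheet with two space-indented class selectors is enough to trip it, which matches what showed up on the monitored network.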

Going back to the drawing board, I realized that some of the built-in PHP keywords were never obfuscated in these attacks - in particular, Array(). Since CSS doesn't declare arrays like that, the rule quickly became:
alert tcp $EXTERNAL_NET any -> $HOME_NET $HTTP_PORTS (msg:"WEB-PHP generic PHP code obfuscation attempt"; flow:established,to_server;
content:"Array|28|"; content:"|20 20 20 20 2E|"; within:200; classtype:trojan-activity;)
After 24 hours of testing, Mickey determined that the false positives had been eliminated, and that the rule was still catching the attacker's POST requests, so he turned it on in inline mode. Suddenly the attacks stopped succeeding, and the rule was lighting up his console like a hyperactive pinball machine.
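The revised rule can be approximated in the same way. Again, this is a simplified Python sketch rather than Snort's actual matching engine: it assumes that within:200 means the whole second pattern must fall inside the 200 bytes following the end of an "Array(" match, and it ignores flow state and ports. The payloads are illustrative.

```python
ARRAY_OPEN = b"Array\x28"                  # content:"Array|28|"
FOUR_SPACES_DOT = b"\x20\x20\x20\x20\x2e"  # content:"|20 20 20 20 2E|"
WITHIN = 200                               # within:200


def second_rule_matches(payload):
    """True if some 'Array(' is followed within 200 bytes by four spaces and a dot."""
    start = payload.find(ARRAY_OPEN)
    while start != -1:
        window_start = start + len(ARRAY_OPEN)
        window = payload[window_start:window_start + WITHIN]
        if FOUR_SPACES_DOT in window:
            return True
        # Try the next occurrence of "Array(" if this one did not pan out.
        start = payload.find(ARRAY_OPEN, start + 1)
    return False


obfuscated_php = b"$a=Array('ob'    .  '_en'.'d_flus'.  'h');  $b = $c(98)    .   $c(97);"
css_upload = b"@media screen {\n    .calendar-date-switcher {\n    }\n    .calendar-title {\n    }\n}\n"
far_apart = b"$x = Array(1, 2);" + b"a" * 300 + b"    ."

print(second_rule_matches(obfuscated_php))  # True
print(second_rule_matches(css_upload))      # False, no "Array(" at all
print(second_rule_matches(far_apart))       # False, spacing is outside the window
```

Anchoring on a keyword the attacker left untouched is what removes the CSS matches, while the 200-byte window keeps the odd spacing tied to the array declaration.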

While this same attacker has continued to look for other ways to drop his code on Mickey's systems, I've reached out to other contacts running large production networks, and found that the false positive rate of that rule is essentially zero. Armed with that knowledge, we've released it as SID 18493 in today's SEU. Though it's disabled by default, as are other similar obfuscation-detection rules, we would encourage you to give it a shot if you're interested. It may be that this particular technique is confined to this specific attacker, but since the rule is high-performance and apparently high-fidelity, the risk-to-reward ratio on it seems favorable to us, just in case.