Thursday, April 26, 2012

ClamAV vs. Content IQ Test, part 3

This is the third post in a series of blog posts about the Content IQ Test. Please see ClamAV vs. Content IQ Test, part 1 and ClamAV vs. Content IQ Test, part 2.

Today we look at how ClamAV handles detecting the target string when it is embedded in polymorphic files. If you were to compute the MD5 checksums of these test files, you'd see that no two are the same.

Test file 17 contains the target string in a text file contained in a polymorphic Zip file.

[email protected]:~/Downloads$ clamscan -d test.ndb Test_File_17_Polymorphic_Zip_File.zip
Test_File_17_Polymorphic_Zip_File.zip: TestSig1.UNOFFICIAL FOUND

----------- SCAN SUMMARY -----------
Known viruses: 1
Engine version: 0.97.4
Scanned directories: 0
Scanned files: 1
Infected files: 1
Data scanned: 0.00 MB
Data read: 0.00 MB (ratio 0.00:1)
Time: 0.011 sec (0 m 0 s)
[email protected]:~/Downloads$ clamscan -d test.ndb Test_File_17_Negative_Control
Test_File_17_Negative_Control: OK

----------- SCAN SUMMARY -----------
Known viruses: 1
Engine version: 0.97.4
Scanned directories: 0
Scanned files: 1
Infected files: 0
Data scanned: 0.00 MB
Data read: 0.00 MB (ratio 1.00:1)
Time: 0.009 sec (0 m 0 s)

Without any problems, ClamAV scans the zip archive, then extracts its contents and scans them. In this case, there were 3 files named Test_File_17_MYYndDNqBllL.txt, Test_File_17_TNBjFqvcNFee.txt and Test_File_17_tRmkMMCCDuuF.txt. ClamAV identified the target string in Test_File_17_MYYndDNqBllL.txt.


Test file 18 contains the target string in a text file buried within several levels of Zip archives.

[email protected]:~/Downloads$ cat test.ndb 
TestSig1:0:*:6576616c{-200}756e657363617065{-200}282725363525373625363925366325323825323927
[email protected]:~/Downloads$ clamscan -d test.ndb Test_File_18_Multilevel_Polymorphic_Zip_File
Test_File_18_Multilevel_Polymorphic_Zip_File: TestSig1.UNOFFICIAL FOUND

----------- SCAN SUMMARY -----------
Known viruses: 1
Engine version: 0.97.4
Scanned directories: 0
Scanned files: 1
Infected files: 1
Data scanned: 0.02 MB
Data read: 0.09 MB (ratio 0.17:1)
Time: 0.017 sec (0 m 0 s)
[email protected]:~/Downloads$ clamscan -d test.ndb Test_File_18_Negative_Control 
Test_File_18_Negative_Control: OK

----------- SCAN SUMMARY -----------
Known viruses: 1
Engine version: 0.97.4
Scanned directories: 0
Scanned files: 1
Infected files: 0
Data scanned: 0.42 MB
Data read: 0.12 MB (ratio 3.38:1)
Time: 0.062 sec (0 m 0 s)


With our signature, ClamAV detects the presence of the target string in a text file nested several Zip archives deep inside the top-level file. This is what I mean:

Test_File_18_Multilevel_Polymorphic_Zip_File
|____Test_File_18_imoYSCAxenHA.zip
|____Test_File_18_GKTpHFrEVhPB.zip
|____Test_File_18_okTnanDKaYNd.zip
|____Test_File_18_qKzdSzFAMafI.zip
     |____Test_File_18_INkwmLoZzlSr.zip
     |____Test_File_18_HAWdWhuUwIPe.zip
          |____Test_File_18_BCFeZtUZuyjM.zip
          |____Test_File_18_kpAusgeGkKba.zip
          |____Test_File_18_AgixVmtpbAAN.zip
          |____Test_File_18_AYXLMdygOPXV.zip
               |____Test_File_18_AAZcByKsqjWN.txt
               |____Test_File_18_UcLAagbhtANi.txt
               |____Test_File_18_gTKfBOaOdKSK.txt   <-- contains target string


Test file 19 contains the target string in a Flash (SWF) file embedded in a polymorphic Zip file. To detect this string, we use a feature of ClamAV that is currently undergoing testing and is not available in the latest stable release. You will need to download the development release and uncomment the following in libclamav/scanners.c before compiling:

case CL_TYPE_SWF:
            if(DCONF_DOC & DOC_CONF_SWF)
                ret = cli_scanswf(ctx);

            break;

We run clamav-devel/clamscan/clamscan with the option --leave-temps. ClamAV "sees":

[Image: what ClamAV "sees" for the test file]

We go ahead and scan Test file 19:

[email protected]:~/Downloads$ cat test.ndb 
TestSig1:0:*:6576616c{-200}756e657363617065{-200}282725363525373625363925366325323825323927
TestSig:4:*:64616e69656c627261766e6d
[email protected]:~/Downloads$ ~/Programs/clamav-devel/clamscan/clamscan -d test.ndb Test_File_19_Ts_in_Swf_in_Polymorphic_Zip.zip 
Test_File_19_Ts_in_Swf_in_Polymorphic_Zip.zip: TestSig1.UNOFFICIAL FOUND

----------- SCAN SUMMARY -----------
Known viruses: 2
Engine version: devel-clamav-0.97-434-gd510390
Scanned directories: 0
Scanned files: 1
Infected files: 1
Data scanned: 0.01 MB
Data read: 0.02 MB (ratio 0.33:1)
Time: 0.035 sec (0 m 0 s)
[email protected]:~/Downloads$ ~/Programs/clamav-devel/clamscan/clamscan -d test.ndb Test_File_19_Negative_Control.zip 
Test_File_19_Negative_Control.zip: OK

----------- SCAN SUMMARY -----------
Known viruses: 2
Engine version: devel-clamav-0.97-434-gd510390
Scanned directories: 0
Scanned files: 1
Infected files: 0
Data scanned: 0.07 MB
Data read: 0.03 MB (ratio 2.57:1)
Time: 0.046 sec (0 m 0 s)

We successfully detect the target string. This is how Test file 19 was organized:


Test_File_19_Ts_in_Swf_in_Polymorphic_Zip.zip
|____Test_File_19_danIsVGoyReO.zip
|____Test_File_19_aSueIXzLOWMg.zip
|____Test_File_19_eGqvUDOwatPF.zip
     |____Test_File_19_pCvytqMyyBQy.zip
     |____Test_File_19_EkxSSRNNmJnq.zip
          |____Test_File_19_cwMLACFrAhxm.bin
          |____Test_File_19_FIpZyWddMazx.bin
          |____Test_File_19_MKPqAHCkwZUY.bin <-- SWF containing target string

Test file 20 contains the target string in a recently compiled executable file.

[email protected]:~/Downloads$ cat test.ndb 
TestSig1:0:*:6576616c{-200}756e657363617065{-200}282725363525373625363925366325323825323927
[email protected]:~/Downloads$ clamscan -d test.ndb Test_File_20_Recently_Compiled_Executable.exe 
Test_File_20_Recently_Compiled_Executable.exe: OK

----------- SCAN SUMMARY -----------
Known viruses: 1
Engine version: 0.97.4
Scanned directories: 0
Scanned files: 1
Infected files: 0
Data scanned: 0.02 MB
Data read: 0.02 MB (ratio 1.00:1)
Time: 0.030 sec (0 m 0 s)

Using the signature we wrote, ClamAV was not able to alert on the presence of the target string. Let's look at a hex dump of a portion of the executable:

000025d0  6e 00 00 19 47 00 72 00  65 00 65 00 74 00 69 00  |n...G.r.e.e.t.i.|
000025e0  6e 00 67 00 73 00 21 00  0a 00 0a 00 00 4d 65 00  |n.g.s.!......Me.|
000025f0  76 00 61 00 6c 00 28 00  75 00 6e 00 65 00 73 00  |v.a.l.(.u.n.e.s.|
00002600  63 00 61 00 70 00 65 00  28 00 27 00 25 00 36 00  |c.a.p.e.(.'.%.6.|
00002610  35 00 25 00 37 00 36 00  25 00 36 00 39 00 25 00  |5.%.7.6.%.6.9.%.|
00002620  36 00 63 00 25 00 32 00  38 00 25 00 32 00 39 00  |6.c.%.2.8.%.2.9.|
00002630  27 00 29 00 29 00 0a 00  0a 00 01 5f 49 00 66 00  |'.).)......_I.f.|
00002640  20 00 74 00 68 00 69 00  73 00 20 00 68 00 61 00  | .t.h.i.s. .h.a.|
00002650  64 00 20 00 62 00 65 00  65 00 6e 00 20 00 61 00  |d. .b.e.e.n. .a.|
00002660  20 00 72 00 65 00 61 00  6c 00 20 00 61 00 74 00  | .r.e.a.l. .a.t.|
00002670  74 00 61 00 63 00 6b 00  2c 00 20 00 79 00 6f 00  |t.a.c.k.,. .y.o.|
00002680  75 00 27 00 64 00 20 00  62 00 65 00 20 00 6f 00  |u.'.d. .b.e. .o.|
00002690  77 00 6e 00 65 00 64 00  21 00 01 15 47 00 72 00  |w.n.e.d.!...G.r.|

The target string is present in the executable, only Unicode-encoded. Therefore, we can write a new signature, which we'll call TestSig2, that continues to detect the target string in all the test files we've looked at so far and additionally detects the Unicode-encoded target string.

Here's TestSig2 and its decoding as provided by sigtool --decode-sig (sigtool ships with ClamAV):


TestSig2;Target:0;(0|1);6576616c{-200}756e657363617065{-200}282725363525373625363925366325323825323927;6500760061006c{-200}75006e006500730063006100700065{-200}2800270025003600350025003700360025003600390025003600630025003200380025003200390027
VIRUS NAME: TestSig2
TDB: Target:0
LOGICAL EXPRESSION: (0|1)
 * SUBSIG ID 0
 +-> OFFSET: ANY
 +-> DECODED SUBSIGNATURE:
eval{WILDCARD_ANY_STRING(LENGTH<=200)}unescape{WILDCARD_ANY_STRING(LENGTH<=200)}('%65%76%69%6c%28%29'
 * SUBSIG ID 1
 +-> OFFSET: ANY
 +-> DECODED SUBSIGNATURE:
eval{WILDCARD_ANY_STRING(LENGTH<=200)}unescape{WILDCARD_ANY_STRING(LENGTH<=200)}('%65%76%69%6c%28%29'

We see that the two subsignatures detect the same thing. The only difference is that one matches the ASCII-encoded target string whereas the other matches the Unicode-encoded (UTF-16LE) string.
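
If you're curious where those long hex bodies come from, here's a quick Python sketch (just an illustration, not a ClamAV tool) that rebuilds both subsignature bodies from the plaintext fragments shown in the sigtool output:

pieces = ["eval", "unescape", "('%65%76%69%6c%28%29'"]
wildcard = "{-200}"

# Subsignature 0: hex-encode the plain ASCII bytes of each fragment.
ascii_body = wildcard.join(p.encode("ascii").hex() for p in pieces)

# Subsignature 1: hex-encode the UTF-16LE bytes; the trailing 00 of each
# fragment is dropped so the fragments line up with the wildcards as in TestSig2.
utf16_body = wildcard.join(p.encode("utf-16-le").hex()[:-2] for p in pieces)

print(ascii_body)
print(utf16_body)

Running this prints exactly the two hex strings used in TestSig2 above.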

Let's see how ClamAV does with this new signature:


[email protected]:~/Downloads$ cat test.ldb
TestSig2;Target:0;(0|1);6576616c{-200}756e657363617065{-200}282725363525373625363925366325323825323927;6500760061006c{-200}75006e006500730063006100700065{-200}2800270025003600350025003700360025003600390025003600630025003200380025003200390027
[email protected]:~/Downloads$ clamscan -d test.ldb Test_File_20_Recently_Compiled_Executable.exe 
Test_File_20_Recently_Compiled_Executable.exe: TestSig2.UNOFFICIAL FOUND

----------- SCAN SUMMARY -----------
Known viruses: 3
Engine version: 0.97.4
Scanned directories: 0
Scanned files: 1
Infected files: 1
Data scanned: 0.02 MB
Data read: 0.02 MB (ratio 1.00:1)
Time: 0.040 sec (0 m 0 s)

[email protected]:~/Downloads$ clamscan -d test.ldb Test_File_20_Negative_Control.exe 
Test_File_20_Negative_Control.exe: OK

----------- SCAN SUMMARY -----------
Known viruses: 3
Engine version: 0.97.4
Scanned directories: 0
Scanned files: 1
Infected files: 0
Data scanned: 0.02 MB
Data read: 0.02 MB (ratio 1.00:1)
Time: 0.011 sec (0 m 0 s)


ClamAV detected the Unicode-encoded string while remaining able to detect the ASCII-encoded target string.


Finally, test file 21 contains the target string in a recently compiled executable file embedded in a polymorphic Zip file.


Just as with the earlier polymorphic archives, test file 21 contains several levels of embedded archives before we get to the executable:

Test_File_21_Recently_Compiled_Executable_Zipped.zip
|____Test_File_21_rYorsJjKqXSK.zip
|____Test_File_21_pTFjcGqFOlyq.zip
     |____Test_File_21_kLRkHHudAQfc.bin
     |____Test_File_21_PjCefACftYOr.bin
     |____Test_File_21_MXirTDkPArWW.bin <-- Executable containing target string

ClamAV is again able to scan the archive and all the files embedded in it, successfully detecting the presence of the Unicode-encoded target string inside the executable:


[email protected]:~/Downloads$ cat test.ldb
TestSig2;Target:0;(0|1);6576616c{-200}756e657363617065{-200}282725363525373625363925366325323825323927;6500760061006c{-200}75006e006500730063006100700065{-200}2800270025003600350025003700360025003600390025003600630025003200380025003200390027
[email protected]:~/Downloads$ clamscan -d test.ldb  Test_File_21_Recently_Compiled_Executable_Zipped.zip
Test_File_21_Recently_Compiled_Executable_Zipped.zip: TestSig2.UNOFFICIAL FOUND

----------- SCAN SUMMARY -----------
Known viruses: 3
Engine version: 0.97.4
Scanned directories: 0
Scanned files: 1
Infected files: 1
Data scanned: 0.02 MB
Data read: 0.01 MB (ratio 1.33:1)
Time: 0.031 sec (0 m 0 s)
[email protected]:~/Downloads$ clamscan -d test.ldb  Test_File_21_Negative_Control 
Test_File_21_Negative_Control: OK

----------- SCAN SUMMARY -----------
Known viruses: 3
Engine version: 0.97.4
Scanned directories: 0
Scanned files: 1
Infected files: 0
Data scanned: 0.04 MB
Data read: 0.01 MB (ratio 3.00:1)
Time: 0.013 sec (0 m 0 s)

In the next post I'll take a look at how ClamAV does against test files with VBA content.

Tuesday, April 17, 2012

Prototyping Mitigations with DBI Frameworks

A couple weeks ago I had the privilege of both attending my first Austin Hackers Association meeting and speaking at the first Infosec Southwest conference in Austin, Texas. I had been wanting to visit Austin for several years now and was excited to see the dynamics of the local hack scene since Austin is home to several world class vulnerability research teams. I was not disappointed and I had the chance to have several great conversations on low level research topics such as PCI Bootkits and the particulars of Java’s JVM translation for instructions used in JIT spray.

My talk was about prototyping mitigations with existing dynamic binary instrumentation frameworks. It was a random side project that I decided to check out, since I've spent a lot of time developing dynamic analysis tools recently and have always had an interest in mitigation design. I also read plenty of academic material that is full of great ideas but rarely provides code implementations, so I felt there was a need for a prototyping environment. I initially thought I might compare and contrast some of the features available in PIN, DynamoRIO, and Valgrind, but I felt that comparison would be of less interest to the security community, and there was plenty to cover with a discussion of return-oriented programming and just-in-time spray exploitation techniques, proposed mitigations for each, and example code implementing those mitigations.

The reason I felt this was an interesting topic is that nearly all mitigations currently released are developed by the vendors themselves. This may be the operating system vendor, the compiler vendor, or the application developer in the case of sandboxes or custom heaps. It would be nice to test out custom mitigations as point fixes in critical environments until the vendor is able to deploy resistance to known mitigation bypasses such as ROP shellcode techniques. In some instances the vendor will determine that the potential cost in performance or stability is not worth the benefit of developing mitigations at all. In that case, there is no option other than developing the mitigations ourselves.

For those unfamiliar with dynamic binary instrumentation frameworks such as DynamoRIO, PIN, and Valgrind: they can be thought of as in-process debuggers. They are able to perform program loading themselves, hook into various points of the program such as functions, basic blocks, and instructions, and typically provide an API for abstracting the underlying CPU architecture. Uses outside of computer security include optimization, binary translation, profiling, etc. In the case of computer security, they provide a method for efficient debugging by eliminating CPU context switches for breakpoints, as well as a nice API for injecting code into or inspecting a running application. Performance numbers show these frameworks offer the quickest form of instruction or block tracing, though this of course also depends on the individual hook functions themselves.

RETURN ORIENTED PROGRAMMING

Return-oriented programming (ROP) is the first exploitation technique that we will attempt to mitigate. ROP is the modern term for a technique first pioneered on UNIX platforms that supported non-executable pages of memory. ROP achieves controlled shellcode execution by building a fake callstack and then hijacking the stack pointer. Each frame of the attacker-controlled callstack performs a primitive programming operation, such as arithmetic or a memory store or load, followed by a RET instruction. The following fake callstack shows three gadgets chained together to perform a Write-4 operation.
rop += "\xD2\x9F\x10\x10“                   #0x10109FD2 :
                                            # POP EAX
                                            # RET
rop += "\xD0\x64\x03\x10“                   #0x100364D0 :
                                            # POP ECX
                                            # RET
rop += "\x33\x29\x0E\x10“                   #0x100E2933 :
                                            # MOV DWORD PTR DS:[ECX], EAX
                                            # RET

By chaining several of these together, a complete shellcode stub can be executed that changes the memory permissions of a page containing a larger payload. This is typically achieved by calling the VirtualProtect, VirtualAlloc, HeapCreate, or WriteProcessMemory functions. The example below shows how a complete call to VirtualProtect would be built by first preparing the arguments on the stack and then executing a call chain to find kernel32 and resolve a pointer to the VirtualProtect function.
########## VirtualProtect call placeholder ##########
rop += "\x41\x41\x41\x41"                   # &Kernel32.VirtualProtect() placeholder
rop += "WWWW"                               # Return address param placeholder
rop += "XXXX"                               # lpAddress param placeholder
rop += "YYYY"                               # Size param placeholder
rop += "ZZZZ"                               # flNewProtect param placeholder
rop += "\x60\xFC\x18\x10"                   # lpflOldProtect param placeholder
                                            #   0x1018FC60 {PAGE_WRITECOPY}
rop += rop_align * 2

########## Grab kernel32 pointer from the stack, place it in EAX ##########
rop += "\x5D\x1C\x12\x10" * 6               #0x10121C5D : 
                                            # SUB EAX, 30
                                            # RETN
rop += "\xF6\xBC\x11\x10"                   #0x1011BCF6 : 
                                            # MOV EAX, DWORD PTR DS:[EAX]
                                            # POP ESI
                                            # RETN
rop += rop_align

########## EAX = kernel32 pointer, now retrieve pointer to VirtualProtect() ##########
rop += ("\x76\xE5\x12\x10" + rop_align) * 4 #0x1012E576 : 
                                            # ADD EAX,100
                                            # POP EBP
                                            # RETN
rop += "\x40\xD6\x12\x10"                   #0x1012D640 : 
                                            # ADD EAX,20
                                            # RETN
rop += "\xB1\xB6\x11\x10"                   #0x1011B6B1 : 
                                            # ADD EAX,0C
                                            # RETN
rop += "\xD0\x64\x03\x10"                   #0x100364D0 : 
                                            # ADD EAX,8
                                            # RETN
rop += "\x33\x29\x0E\x10"                   #0x100E2933 : 
                                            # DEC EAX
                                            # RETN
rop += "\x01\x2B\x0D\x10"                   #0x100D2B01 : 
                                            # MOV ECX,EAX
                                            # RETN
rop += "\xC8\x1B\x12\x10"                   #0x10121BC8 : 
                                            # MOV EAX,EDI
                                            # POP ESI
                                            # RETN

One thing to notice about the design of ROP shellcodes is that they are composed of sub-blocks. Compilers generally exhibit two behaviors when creating control flow: a) nearly all RET instructions return to an address immediately following a CALL or JMP instruction, and b) all CALL and JMP instructions will next execute an instruction at the beginning of a basic block. ROP chains violate both behaviors, and in that discrepancy we have two mitigation designs.

The first is called a shadow stack and the basic principle is that at each CALL instruction, we will push the address of the next instruction on a private stack prior to entering into the called function. On the next RET, we should be returning to the address that we have stored on our private stack copy:
INSTRUMENT_PROGRAM
for each IMAGE
    for each INSTRUCTION in IMAGE
        if INSTRUCTION is CALL
            push RETURN_ADDRESS (the address of the instruction after the CALL) on SHADOW_STACK
        if INSTRUCTION is RET
            insert code to retrieve SAVED_EIP from stack
            insert CALL to ROP_VALIDATE(SAVED_EIP) before INSTRUCTION

ROP_VALIDATE
if SAVED_EIP not top of SHADOW_STACK
    exit with error
else pop top of SHADOW_STACK
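
To make this concrete, here is a rough framework-agnostic Python sketch of the same check; on_call and on_ret stand in for the instrumentation callbacks a DBI framework would invoke, and the names are invented purely for illustration:

# Framework-agnostic sketch of the shadow-stack check. A real DBI tool would
# call these from its CALL/RET instrumentation hooks.
shadow_stack = []

def on_call(return_address):
    # At each CALL, remember where the matching RET should come back to.
    shadow_stack.append(return_address)

def on_ret(saved_eip):
    # At each RET, the address taken from the real stack must match the
    # top of our private copy.
    if not shadow_stack or shadow_stack[-1] != saved_eip:
        raise RuntimeError("ROP detected: return to %#x" % saved_eip)
    shadow_stack.pop()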

The second method, branch monitoring, tracks whether the CALL or JMP is pointing to a block entry point and is just as simple and leaves less room for error:
INSTRUMENT_PROGRAM
for each IMAGE
    for each BLOCK in IMAGE
        insert BLOCK in BLOCKLIST
        for each INSTRUCTION in BLOCK
            if INSTRUCTION is RETURN or BRANCH
                insert code to retrieve SAVED_EIP from stack
                insert CALL to ROP_VALIDATE(SAVED_EIP) before INSTRUCTION

ROP_VALIDATE
if SAVED_EIP not in BLOCKLIST
    exit with error
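
The corresponding sketch for branch monitoring (again plain Python with invented callback names, not PIN code) is even simpler:

# Every RET or indirect branch target must be a known basic-block entry point.
block_entries = set()

def on_block_discovered(entry_address):
    block_entries.add(entry_address)

def on_ret_or_branch(target_address):
    if target_address not in block_entries:
        raise RuntimeError("ROP detected: transfer to %#x" % target_address)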

Check out the slides and source code to see how easy it is to implement these mitigations using PIN. Less than 200 lines of source will get you both mitigations. It is also worth noting that these mitigations protect against the new technique released by Dan Rosenberg which defeats the newly implemented ROP defense in Windows 8. The method implemented by Microsoft relies upon observing the value of the stack pointer rather than the integrity of the stack itself.

JUST-IN-TIME SHELLCODE

JIT shellcode is a mitigation bypass technique that uses built-in JIT engines to convert attacker-supplied non-executable data such as JavaScript or ActionScript into attacker-controlled executable shellcode. In the case of the ActionScript and JavaScript VMs, the code that results in the least amount of translation (and therefore the most attacker control) is a chain of arithmetic operators. In particular, it has been shown that the XOR operator will chain a mostly attacker-controlled sequence of assembly instructions together in an executable area of memory.
var y=(0x11223344^0x44332211^0x44332211…);

Compiles as:

0x909090: 35 44 33 22 11  XOR EAX, 11223344
0x909095: 35 44 33 22 11  XOR EAX, 11223344
0x90909A: 35 44 33 22 11  XOR EAX, 11223344

As we can see above, the immediate values we passed to a chain of XOR instructions stay intact. You may be asking how this can help us, but thanks to the ability of x86 processors to execute unaligned instructions, we can manipulate a vulnerability into executing at an offset within this now mostly attacker-controlled executable memory space. If we begin disassembling at a byte offset into the above memory, we get the following:
0x909091: 44                  INC ESP
0x909092: 33 22               XOR ESP, [EDX]
0x909094: 11 35 44 33 22 11   ADC [11223344], ESI
0x90909A: 35 44 33 22 11      XOR EAX, 11223344

Okay, so without going much further into how painful it really is to pull off successful shellcode using this method (hat tip to Dion and Alexey for the mind-crushing prior work), what behaviors are anomalous enough that we can write a mitigation against JIT shellcode? I consulted a brief paper written by Piotr Bania, which observed that the ActionScript and JavaScript JIT compilers modify pages from RWX to R-E once the code has been translated to native executable opcodes. We also know that the current technique relies upon a long chain of XOR operators as well as a series of immediate values. Thus we have the following heuristic:
INSTRUMENT_PROGRAM
Insert CALL to JIT_VALIDATE at prologue to VirtualProtect

JIT_VALIDATE
Disassemble BUFFER passed to VirtualProtect
for each INSTRUCTION
    if INSTRUCTION is MOV_REG_IMM32
        while NEXT_INSTRUCTION uses IMM32
            increase COUNT
            if COUNT > THRESHOLD
                exit with error
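
As a plain-Python sketch of that heuristic (operating on an already-decoded instruction list rather than a real disassembler, with a threshold value invented for the example):

# Flag buffers whose code contains a long run of instructions reusing the
# same 32-bit immediate, as a JIT-sprayed XOR chain does. "instructions" is
# assumed to be a list of (mnemonic, immediate-or-None) pairs produced by
# whatever disassembler the tool uses.
THRESHOLD = 8  # arbitrary example value

def looks_like_jit_spray(instructions):
    count, last_imm = 0, None
    for mnemonic, imm in instructions:
        if imm is not None and imm == last_imm:
            count += 1
            if count > THRESHOLD:
                return True
        else:
            count = 0
        last_imm = imm
    return False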

I invite you to check out the slides for further explanation, and the example code to see how easy it is to implement these ideas. The real-world performance hit may not be appropriate for all uses, but the development time is trivially small, and prototyping allows you to determine the soundness of a mitigation design before spending the effort to implement it at the kernel or compiler level.

Code and slides are available at: http://code.google.com/p/moflow-mitigations/


References:

Bruce Dang, Daniel Radu. Shellcode Analysis Using Dynamic Binary Instrumentation. 
http://public.avast.com/caro2011/Daniel%20Radu%20and%20Bruce%20Dang%20-%20Shellcode%20analysis%20using%20dynamic%20binary%20instrumentation.pdf

Dan Rosenberg. Defeating Windows 8 ROP Mitigation. 
http://vulnfactory.org/blog/2011/09/21/defeating-windows-8-rop-mitigation/

Piotr Bania. JIT spraying and mitigations. 
http://www.piotrbania.com/all/articles/pbania-jit-mitigations2010.pdf

Alexey Sintsov. Writing JIT Shellcode for fun and profit. 
http://dsecrg.com/files/pub/pdf/Writing%20JIT-Spray%20Shellcode%20for%20fun%20and%20profit.pdf

Snort Performance and IP-Only Rules

One of the most frequent topics that comes up when I'm out speaking to customers, or when anyone from the VRT is discussing Snort on a mailing list, IRC channel, etc., is performance. Everyone wants to know how to make their rules faster - and many people are willing to go to extreme lengths to get the speed they want, delving into arcane corners of PCRE, looking for obscure Snort rule options, the works.

This wouldn't be notable, except for the fact that the people I've seen work the hardest at extreme performance tweaks in their rules are missing one of the most simple, fundamental things you can do to keep your Snort/Sourcefire boxes humming along: steering clear of IP-only rules. You know, rules that look like this:

alert ip 1.2.3.4 any -> any any (msg:"Evil stuff from IP 1.2.3.4"; classtype:bad-unknown;)

The problem with rules like that is that, without a static content match to supply to the fast pattern matcher, they will be evaluated by the main Snort engine for every packet that crosses the IDS. While each individual entry into the engine will be very brief - there's almost nothing to do - the process of entering the engine itself entails a non-zero amount of work: setting up structures, initializing memory, etc. If you're running hundreds, or even thousands, of IP-only rules (as we've observed at more than one production site in the field), the work necessary to enter the engine across potentially hundreds of millions of packets per second stacks up to the point that you end up chewing up all available system resources.

I understand why people were doing this in the past. There are plenty of groups out there who get or compile lists of known malicious IP addresses, and want to capture all activity related to those IPs; since many of those same groups have an IDS but not a full packet capture tool, they fell back to using Snort as that tool. Other groups simply wanted to ensure that their systems never visited these known malicious IPs, regardless of logging. Both groups of users spent years clamoring for a way to blacklist known-bad IP addresses in Snort; as far back as 2009, Marty Roesch was writing experimental code that provided that functionality. For many years, production systems had no way of dealing with intelligence that only provided IP addresses.

That all changed with the release of Snort 2.9.1 in August of last year. That release included our IP reputation preprocessor, which checks IP addresses against a blacklist and/or whitelist at the time packet headers are decoded. Since this decoding is already done for every packet crossing the engine, and since checking the resulting IP address against a properly formatted list in memory is exceptionally fast, this preprocessor allows you to have a list of tens of thousands or more IP addresses that you can check against with minimal performance overhead. (Note for Sourcefire customers: this feature is currently being beta tested on Sourcefire hardware.)
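
Conceptually (this is just an illustration of the idea in Python, not Snort's implementation), the per-packet reputation check is nothing more than a set lookup on addresses the engine has already decoded:

import ipaddress

# Illustration only: with the blacklist pre-parsed into a set, checking a
# decoded packet header is a constant-time membership test.
blacklist = {int(ipaddress.ip_address(ip)) for ip in ("1.2.3.4", "5.6.7.8")}

def packet_is_blacklisted(src_ip, dst_ip):
    return (int(ipaddress.ip_address(src_ip)) in blacklist or
            int(ipaddress.ip_address(dst_ip)) in blacklist)

print(packet_is_blacklisted("1.2.3.4", "10.0.0.1"))  # True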

Many users seem to be unaware of this feature, despite it having been widely announced on the Snort Blog; others who are aware of it have been reticent to make use of it, for reasons I don't completely understand. In cases where we have seen the switch made, however, we've seen systems that were dropping up to 50% of their packets suddenly begin running smoothly again, processing everything that crossed them with ease.

So if you've got a sensor that's not up to speed, so to speak, do yourself a favor, and start with something easy - eliminating your IP-only rules in favor of the IP reputation preprocessor. You may be amazed at the difference it makes.

Thursday, April 12, 2012

Special Delivery -- Phoenix Exploit Kit

You would think that spam masquerading as a delivery company would be getting a little long in the tooth, but that isn't the case.  Last week the winner was "DHL Attention 846698", which looks something like this:


Good day!
 

Dear Consumer , Recipient's address is wrong

PLEASE FILL IN ATTACHED FILE WITH RIGHT ADDRESS AND RESEND TO YOUR PERSONAL MANAGER

With Best Wishes , DHL .com Customer Services


A nice present in the form of a zip file named "DHL-N-35385784.zip" came along with the email.  It contained an html file which, in my case, was named "DHL_Letter_N88324.htm".  This had four blocks of pretty standard obfuscated code that, when clicked, sent you off to a Phoenix exploit kit sitting on a static IP address (no DNS name) on port 8080.

The exploit kit had a multi-capability PDF document that would exploit PDF readers with different exploits depending on what they were vulnerable to.  In the case of the DVMTK (Damn Vulnerable Malware Testing Kit, or less glamorously, some Windows XP box with an old version of IE and a PDF reader), it also hit the Windows Help and Support Center vulnerability (CVE-2010-1885).

Snort pretty much let loose on this visit, with 8 total alerts targeting the specifics of the exploit kit's PDF delivery and the four different malware samples that were downloaded as a result of visiting this site.  The exploit kit and PDF rules are on by default.  However, the executable-download rules, SIDs 15306 and 11192, both need to be activated manually.  Regardless of which exploit kit we've seen, these are the SIDs that always fire.  So if you can turn them on in your environment, we strongly recommend you do.

So, we would hope that user education would take care of this, but it will probably be quite a while before that will be the case.  In the meantime, keep your patches, your AV and your IDSes updated, along with any other custom in-house solutions you have.

Wednesday, April 4, 2012

Adventures in Domain Takedowns

I gave a presentation entitled "Adventures in Domain Takedowns" recently at the APCERT 2012 conference in Bali, Indonesia. The conference itself was excellent - plenty of good technical material and lots of useful contacts - and the location, of course, couldn't have been better. The most interesting part in my mind, though, was the lessons I learned in the process of putting my talk together.

First and foremost among those is the increased respect I now have for people who work on active malware takedowns on a regular basis. Getting anything substantial done in that arena requires a massive amount of work; I spent more time preparing this presentation than I have on all the other talks I've done in the last two years combined.

Beyond that, I found that geography is a much less reliable indicator of whether you'll get a useful response than most people may think - I got help from China and Russia, but had plenty of places in the US and Europe ignore me. WHOIS is broken, at least in terms of getting reliable information about registrars; standard practices in security, meanwhile, are adhered to loosely at best in many places (I had 9 different registrars whose abuse@ email addresses bounced).

But most important of all was confirmation of the theory I had going in: it's all about who you know, not what you know. Having contacts in the right places is what gets business done on the Internet. The good news, for those who may not have those contacts, is that national CERT organizations do a good job of putting security researchers in touch with the right people; I would strongly urge anyone working on domain takedowns or other Internet cleanup projects to reach out to the relevant CERTs for assistance.