Thursday, September 12, 2013

Inquiring Minds: Exploratory road trips, malware, and cool tools and services

While browsing interesting sandbox reports, we here in the VRT uncovered a sample that dropped three files.  VirusTotal had no record of two of them, and the third was a DLL that was well covered.

The original sample from the report was MD5 7f431295bd212b9ec45241457bee58c9, which dropped these files:
  • MD5 216a5052900d40fee35763088bdfc88f    : C:\Windows\Tasks\wopqhhb.job
  • MD5 58a6e4f3375907295bb1070e9dac839c  : C:\Documents and Settings\All Users\Application Data\Mozilla\gtbcolk.exe
  • MD5 1f3a4ac5aafe5b4db0b72d1026615827  : C:\Documents and Settings\All Users\Application Data\Mozilla\vzsleki.dll

I had been wanting to test out some tools and ideas/approaches for investigating malware, so I packed up for a trip down exploratory lane.  The first two stops on the trip:

  1. Original sample from the report: 7f431295bd212b9ec45241457bee58c9
  2. The executable it dropped: 58a6e4f3375907295bb1070e9dac839c

I've got to PE

First, I compared the PE sections (using pedump) from both the original and the dropped PE executables and noticed they were the same:

  $ pedump --sections 7f431295bd212b9ec45241457bee58c9.exe

   === SECTIONS ===


    .text        1000     1110     1200      400     0        0     0        0  60300020  R-X CODE
    .data        3000    1e7f6    1ea00     1600     0        0     0        0  c0300040  RW- IDATA
    .bss        22000        c        0        0     0        0     0        0  c0300080  RW- UDATA
    .edata      23000       36      200    20000     0        0     0        0  40300040  R-- IDATA
    .idata      24000      48c      600    20200     0        0     0        0  c0300040  RW- IDATA

   $ pedump --sections 58a6e4f3375907295bb1070e9dac839c.exe

   === SECTIONS ===


    .text        1000     1110     1200      400     0        0     0        0  60300020  R-X CODE
    .data        3000    1e7f6    1ea00     1600     0        0     0        0  c0300040  RW- IDATA
    .bss        22000        c        0        0     0        0     0        0  c0300080  RW- UDATA
    .edata      23000       36      200    20000     0        0     0        0  40300040  R-- IDATA
    .idata      24000      48c      600    20200     0        0     0        0  c0300040  RW- IDATA

On a whim, I decided to compare the SHA1 sum of a full run of pedump on each sample.  This output includes:

  • MZ Header
  • PE Header

As you can see, the output from these two different samples were exactly the same:

$ pedump 7f431295bd212b9ec45241457bee58c9.exe | shasum
7d32da086a656917ab923604e7f4724fe131359d  -
$ pedump 58a6e4f3375907295bb1070e9dac839c.exe | shasum
7d32da086a656917ab923604e7f4724fe131359d  -

From there I decided to look at the exports:

  $ pedump -E 7f431295bd212b9ec45241457bee58c9.exe 58a6e4f3375907295bb1070e9dac839c.exe
  # -----------------------------------------------
  # 7f431295bd212b9ec45241457bee58c9.exe
  # -----------------------------------------------

  === EXPORTS ===

  # module "kgufcbuni.exe"
  # flags=0x0  ts="2013-05-19 21:19:52"  version=0.0  ord_base=1
  # nFuncs=0  nNames=0

  # -----------------------------------------------
  # 58a6e4f3375907295bb1070e9dac839c.exe
  # -----------------------------------------------

  === EXPORTS ===

  # module "kgufcbuni.exe"
  # flags=0x0  ts="2013-05-19 21:19:52"  version=0.0  ord_base=1
  # nFuncs=0  nNames=0

Now this got me wondering how similar the original sample and the dropped EXE were, so I fired up vbindiff to take a look:

It seemed the only difference was the dropped EXE had an extra 8 bytes at the end of the file.  So, to confirm this, I copied the first 235,376 (0x039770) bytes of the file and checked the MD5.

Sure enough, stripping off the last 8 bytes resulted in the MD5 of the dropped sample (58a6e4f3375907295bb1070e9dac839c) and the original sample (7f431295bd212b9ec45241457bee58c9) to match up:

$ dd if=58a6e4f3375907295bb1070e9dac839c.exe bs=1 count=0x039770 2> /dev/null | md5

We have a winner?  No, but I'm searching for dinner.

Now I'm interested, in a cat-with-a-laser-pointer kind of way, but I need more samples.  This was a perfect opportunity to try out some VirusTotal Intelligence searches to find some more samples to play with.

I decided to try two different approaches:
  1. PE section searching
  2. "Similar to" searching

PE section search

For the PE section searching, I got all of the PE section MD5s for the original sample and attempted a search for samples that had these same sections:

(sectionmd5:4f5944620a6fff416596982b2e6dec23 sectionmd5:d1c09bc7d834711d74aaf05770ba22cb sectionmd5:d41d8cd98f00b204e9800998ecf8427e sectionmd5:963a67c9ee55335c1fe3ee1322167430 sectionmd5:a1bbc87f2207a3aff4a78e65d64981b5) type:peexe 

As I held my breath, I tested the first sample returned, d5ccc595265f7706824b33c4729f3a9e.
$ pedump -E d5ccc595265f7706824b33c4729f3a9e 
=== EXPORTS === 
# module "kgufcbuni.exe"
# flags=0x0 ts="2013-05-19 21:19:52" version=0.0 ord_base=1
# nFuncs=0 nNames=0

IT'S A BINGO!  Sweet, we had an obvious match, so I downloaded the first 300 samples the PE section search gave me.

"Similar to" search

Because my two samples had no entries on VirusTotal, I used the first sample uncovered with the PE section search as the basis for my similar-to search:
I grabbed the second sample (8e9fe8499a2d7acaf1124ed20b101aee) because the first sample was the sample I used for my search.  So, I was confident that the similar-to search feature was good enough to search for itself, but I wasn't actually holding my breath on this one.

So, without expecting too much, I checked the exports of the selected sample:
$ pedump -E 8e9fe8499a2d7acaf1124ed20b101aee 
=== EXPORTS === 
# module "kgufcbuni.exe"
# flags=0x0 ts="2013-05-19 21:19:52" version=0.0 ord_base=1
# nFuncs=0 nNames=0

Two bingos in one day?  I'm putting on a wig and going to a blue hair convention bingo tournament.  After that, I downloaded the first 300 samples the search results gave me.

Merge all the things!

Each of the two VirusTotal intelligence searches left me with 300 samples each; that's 601 total samples.  I then merged all of those samples into a directory so any overlapping samples would be removed.

The merge then left me with 491 unique samples, all with different SHA256 checksums.  These are the poor samples that were the unwilling patients that I poked, prodded, and probed.

Probe all the things?!

First I was curious what the pedump output for all of the samples looked like.  Were they all the same?  Were they all different?  Was the price of tea in China impacted?  Inquiring minds want to know.
$ for sample in *; do pedump $sample | shasum; done | sort | uniq -c | sort -n
  12  8f153c3902471bfc4935d49064b371bfcbb6d79a -
479 7d32da086a656917ab923604e7f4724fe131359d -

It's a landslide victory!  7d32da086a656917ab923604e7f4724fe131359d has it by 467 votes!  I wondered about the exports, so that was next:
$ for sample in *; do pedump -E $sample | grep module; done | sort | uniq -c | sort -n
 491 # module "kgufcbuni.exe"

Let's go deep, ssdeep.

Even though we were working with 491 unique samples, there have been a lot of similarities (or exact matches).  This got me wondering about how useful ssdeep might be in this particular situation.

So, I used ssdeep on the original sample and the exe it dropped.  The ssdeep hashes seem to be identical for a single byte:

  $ ssdeep -b 7f431295bd212b9ec45241457bee58c9.exe | tee 7f431295bd212b9ec45241457bee58c9.ssd

  $ ssdeep -b 58a6e4f3375907295bb1070e9dac839c.exe | tee 58a6e4f3375907295bb1070e9dac839c.ssd

Then I ran the ssdeep hashes against all 491 samples.  This resulted in a 488/491 detection:

  $ ssdeep -m 7f431295bd212b9ec45241457bee58c9.ssd -br intelligencefiles/merged/ | wc -l

  $ ssdeep -m 58a6e4f3375907295bb1070e9dac839c.ssd -br intelligencefiles/merged/ | wc -l

If I added the -a option (shows all matches even if the score is 0), and all 491 were matched:

  $ ssdeep -a -m 7f431295bd212b9ec45241457bee58c9.ssd -br intelligencefiles/merged/ | wc -l
  $ ssdeep -a -m 58a6e4f3375907295bb1070e9dac839c.ssd -br intelligencefiles/merged/ | wc -l

A dream within a dream?  We need to go deeper Yara.

I wondered, what else I could do with these 491 samples?  I landed on extracting binary patterns from the samples and trying to do something cool with them using Yara.

Thankfully, all of the heavy lifting had already been done for me, so I went to work and copied 28 semi-randomly selected samples to a directory called yara-autosig-samples/.

Using the autorule Python script, I generated the Yara signature from the 28 samples:

  $ python autorule/ yara-autosig-samples/ > yara-autosig.yar
  CFileDiffer: Diffing a total of 30 file(s)
  CFileDiffer: Diffing file 1 out of 30
  CFileDiffer: Diffing file 2 out of 30
  CFileDiffer: Diffing file 29 out of 30
  CFileDiffer: Diffing file 30 out of 30

Lo and behold, Yara detected all 491 samples:

  $ yara -r yara-autosig.yar intelligencefiles/merged > yara-autosig.log

  $ wc -l yara-autosig.log
  491 yara-autosig.log

  $ head -1 yara-autosig.log
  test   intelligencefiles/merged/0034d872dca89c5b05b4eb4cca532a470473fbd793ab52fb745e4d69a5577516

  $ tail -1 yara-autosig.log
  test intelligencefiles/merged/fcedaf314f26112aaba5e255cfa8235209521a77cb570e02ed4b16c37c137775

Back to the future?

So, after all of that I decided to look at the exports again and see if the location of "kgufcbuni.exe" was static:

  $ grep -aob kgufcbuni.exe intelligencefiles/merged/* | cut -f2- -d: | uniq -c | sort -n
   491 131112:kgufcbuni.exe

These samples all seem to drop the same DLL (1f3a4ac5aafe5b4db0b72d1026615827), so I decided to look at the exports for the offset of "ckdhgsuwa.dll":

  $ grep -aob ckdhgsuwa.dll 1f3a4ac5aafe5b4db0b72d1026615827.dll

Now that I was armed with these offsets, I decided to test my theory with a simple ClamAV LDB signature:

  $ cat local.ldb

Which decoded looks like this:

  $ cat local.ldb | sigtool --decode
  VIRUS NAME: Win.Trojan.Kryptik
  TDB: Engine:51-255,Target:1
   * SUBSIG ID 0
   +-> OFFSET: 17458
   * SUBSIG ID 1
   +-> OFFSET: 131112

For those of you who haven't gouged your eyeballs out, or taken up underwater basket weaving, here are the results of our LDB signature scan:

  $ clamscan -rid local.ldb 7f431295bd212b9ec45241457bee58c9   58a6e4f3375907295bb1070e9dac839c 1f3a4ac5aafe5b4db0b72d1026615827 intelligencefiles/merged/ | tail -10

  ----------- SCAN SUMMARY -----------
  Known viruses: 1
  Engine version: 0.97.8
  Scanned directories: 1
  Scanned files: 494
  Infected files: 494
  Data scanned: 97.44 MB
  Data read: 97.43 MB (ratio 1.00:1)
  Time: 2.549 sec (0 m 2 s)

So, there you have it.  A little bit of exploration and tinkering and who knows what you might uncover?

No comments:

Post a Comment