This post will walk through our coverage for the Master Key and Extra Field vulnerabilities. Both vulnerabilities allow arbitrary files to be added to signed APKs without breaking the digital signature. ClamAV bytecode signatures allow for flexible coverage when a vulnerability or malware family is too complex to detect with any of the other signature formats. The bytecode signature language is a subset of C with an API for interfacing with ClamAV.
The vulnerabilities have been written about exhaustively elsewhere online. The most comprehensive write-ups are from @saurik: Master Key and Extra Field.
Zip File Format
Zip files contain a central directory pointing to all of the files stored in the archive. The central directory is located at the end of the file. Each file stored within the Zip file has a header immediately before its stored bytes (the local file header), and each file also has a more verbose header stored in the central directory. You can see the specifics on Wikipedia.
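To make the layout concrete, here is a minimal sketch in plain C (not ClamAV bytecode) that pulls the fields used later in this post out of one central directory header. The offsets and magic value come from the Zip format; the names read_le16, read_le32, cd_entry_info and parse_cd_header are hypothetical helpers made up for this illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Fixed header sizes from the Zip format. */
enum {
    LOCAL_HEADER_SIZE   = 30, /* starts with PK\x03\x04 */
    CENTRAL_HEADER_SIZE = 46  /* starts with PK\x01\x02 */
};

/* Read little-endian integers byte by byte, with no alignment assumptions. */
static uint16_t read_le16(const uint8_t *p) {
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t read_le32(const uint8_t *p) {
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Hypothetical summary of the central directory fields this post relies on. */
struct cd_entry_info {
    uint16_t name_len;         /* header offset 28 */
    uint16_t extra_len;        /* header offset 30 */
    uint16_t comment_len;      /* header offset 32 */
    uint32_t local_header_off; /* header offset 42 */
};

/* Parse one central directory header. Returns 0 on success, -1 on error. */
static int parse_cd_header(const uint8_t *h, size_t len,
                           struct cd_entry_info *out) {
    if (len < CENTRAL_HEADER_SIZE)
        return -1;
    if (read_le32(h) != 0x02014b50u) /* "PK\x01\x02" read as little-endian */
        return -1;
    out->name_len         = read_le16(h + 28);
    out->extra_len        = read_le16(h + 30);
    out->comment_len      = read_le16(h + 32);
    out->local_header_off = read_le32(h + 42);
    return 0;
}
```

The file name, extra field and comment follow the 46 fixed bytes in that order, which is why their lengths are needed to find the next entry.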
Master Key Vulnerability
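The Master Key vulnerability is exploited by having multiple files with the same name in an APK. Android's verifier and loader handle duplicate entries differently. The verifier checks only the last duplicate entry against the SHA1 digest stored in META-INF/MANIFEST.MF, while the loader loads the first entry. An attacker can therefore place a modified file ahead of the original, signed file with the same name: the original passes verification and the modified copy is what gets loaded.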
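The mismatch can be modelled in a few lines of plain C. This is only a sketch of the behaviour described above, not Android's actual code: find_first stands in for the loader, find_last for the verifier, and both names (along with lookups_diverge) are hypothetical.

```c
#include <string.h>

/* Index of the first entry called `name`, or -1 (models the loader). */
static int find_first(const char *const names[], int count, const char *name) {
    for (int i = 0; i < count; i++) {
        if (strcmp(names[i], name) == 0)
            return i;
    }
    return -1;
}

/* Index of the last entry called `name`, or -1 (models the verifier). */
static int find_last(const char *const names[], int count, const char *name) {
    for (int i = count - 1; i >= 0; i--) {
        if (strcmp(names[i], name) == 0)
            return i;
    }
    return -1;
}

/* Nonzero when the two lookups resolve the same name to different entries,
 * which is the condition the exploit depends on. */
static int lookups_diverge(const char *const names[], int count,
                           const char *name) {
    return find_first(names, count, name) != find_last(names, count, name);
}
```

For an entry list of AndroidManifest.xml, classes.dex, res/icon.png, classes.dex, the loader model picks index 1 while the verifier model picks index 3.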
The fact that any file could be replaced is what led to the decision to use a bytecode signature to cover this vulnerability. The vulnerability creates a complex situation where the APK / Zip needs to be parsed and each file name checked against the others.
The bytecode signature first finds the last end_of_central_directory file magic in the file, which is the equivalent of scanning backward to find the end_of_central_directory entry. That entry has information about the starting offset and size of the central_directory.
// find the last end_of_central_directory file magic
while(end_central_dir_off != -1) {
    // keep track of the previous one
    last_end_off = end_central_dir_off;
    // seek past it
    if(seek(last_end_off+1, SEEK_SET) < 0)
        return 0;
    // keep doing this until the magic PK\x05\x06 is not found
    end_central_dir_off = file_find("\x50\x4b\x05\x06", 4);
}
// set to the last one found
end_central_dir_off = last_end_off;
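For comparison, here is what the backward scan looks like when the whole file is already in memory. This is a sketch in plain C rather than bytecode, and find_last_eocd is a hypothetical name; it should land on the same offset as the forward loop above.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Offset of the last PK\x05\x06 magic in buf, or -1 if there is none. */
static long find_last_eocd(const uint8_t *buf, size_t len) {
    static const uint8_t magic[4] = {0x50, 0x4b, 0x05, 0x06};

    if (len < sizeof magic)
        return -1;

    /* Start at the last position where 4 bytes still fit and walk toward 0. */
    for (size_t off = len - sizeof magic + 1; off-- > 0; ) {
        if (memcmp(buf + off, magic, sizeof magic) == 0)
            return (long)off;
    }
    return -1;
}
```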
After some seeks and reads, the bytecode signature reaches the while loop for detecting duplicate file names. For security reasons there is no malloc in the ClamAV bytecode engine. Because of this, an O(n²) comparison was used, with two buffers, since each file name is tested against all those that follow it. Only when the file name lengths are equal are the names read into their respective buffers. The names are then compared to each other, looping backward in order to break as soon as possible on differences. This avoids iterating over the long shared prefixes of similar paths. The read and comparison can be seen below:
// if the lengths are the same, do the comparison
if(file_name_length == compare_name_length) {
    // seek to entry name
    if(seek(zip_entry_off + 46, SEEK_SET) < 0)
        break;
    // read name
    if(read(file_name_buffer, file_name_length) != file_name_length)
        break;
    // seek to the compare entry name
    if(seek(compare_entry_off + 46, SEEK_SET) < 0)
        break;
    // read name
    if(read(compare_buffer, file_name_length) != file_name_length)
        break;
    // compare names from end backward to avoid wasting time, ex:
    // /res/drawable-hdpi/btn_call_1.png
    // /res/drawable-hdpi/btn_call_2.png
    for(i=(file_name_length-1); i > -1; i--) {
        // if any character does not match, break
        if(file_name_buffer[i] != compare_buffer[i])
            break;
    }
    // if reached the end of the loop (didn't break on any comparison)
    if(i == -1) {
        foundVirus("Master_Key");
    }
}
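The same idea can be tried outside the bytecode engine. The sketch below is plain C working on an in-memory list of names instead of seeking through the file; names_equal_backward and has_duplicate_name are hypothetical names, and the length check plus backward loop mirror the snippet above.

```c
#include <stddef.h>
#include <string.h>

/* Compare two names, skipping the byte loop entirely when lengths differ
 * and walking from the last byte so similar paths diverge quickly. */
static int names_equal_backward(const char *a, size_t a_len,
                                const char *b, size_t b_len) {
    if (a_len != b_len)
        return 0;
    for (size_t i = a_len; i-- > 0; ) {
        if (a[i] != b[i])
            return 0;
    }
    return 1;
}

/* O(n^2) pairwise check: each name is tested against all that follow it.
 * Needs no allocation, only the two names being compared. */
static int has_duplicate_name(const char *const names[], size_t count) {
    for (size_t i = 0; i < count; i++) {
        size_t len_i = strlen(names[i]);
        for (size_t j = i + 1; j < count; j++) {
            if (names_equal_backward(names[i], len_i,
                                     names[j], strlen(names[j])))
                return 1;
        }
    }
    return 0;
}
```

With the two btn_call paths from the comment above, the backward loop stops after five comparisons (the ".png" suffix, then the differing digit) instead of walking the whole shared directory prefix.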
Extra Field Vulnerability
The Extra Field vulnerability is exploited through a signed / unsigned handling error in Android's verifier. A Zip file's central directory points to all of the files stored in the archive. Each file has a header which has extra space available, called the extra_field. The extra_field, when present, sits between a file's header (including the file name) and the stored file. When its size is interpreted as a negative value, the verifier will try to skip past it to the file bytes by jumping backward. If you store the original file at the location the verifier jumps backward to, it will be verified. Then you can place some arbitrary file into the original file's position, causing the new file to be loaded.
The most popular way to exploit this is to store the classes.dex file uncompressed. Then the extra_field_length is set to 0xFFFD (65533 unsigned, -3 signed). This causes the original file's magic (example: dex\x0A035\x00) to overlap with the file name classes.dex. When the verifier jumps over the extra_field, it jumps backward 3 bytes into the file name, landing on the trailing "dex". These bytes are shared with the start of the dex file. It will verify that the original dex file, which has been stored in the extra_field, is unchanged. When the loader goes to load the file, it will correctly treat the extra_field_length as an unsigned short and jump forward to the new dex file.
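The arithmetic is easier to follow when written out. The sketch below is plain C, and the function names are hypothetical; it computes where the file data appears to start under each interpretation of extra_field_length, using the 30-byte local file header size from the Zip format.

```c
#include <stdint.h>

enum { LOCAL_HEADER_SIZE = 30 };

/* Data offset when extra_len is treated as unsigned (the loader's view). */
static long data_off_unsigned(long header_off, uint16_t name_len,
                              uint16_t extra_len) {
    return header_off + LOCAL_HEADER_SIZE + name_len + (long)extra_len;
}

/* Data offset when extra_len is treated as a signed 16-bit value
 * (the vulnerable verifier's view). */
static long data_off_signed(long header_off, uint16_t name_len,
                            uint16_t extra_len) {
    /* Two's-complement reinterpretation done with explicit arithmetic. */
    long extra = extra_len > 0x7FFF ? (long)extra_len - 0x10000L
                                    : (long)extra_len;
    return header_off + LOCAL_HEADER_SIZE + name_len + extra;
}
```

For a header at offset 0 with the 11-byte name classes.dex and an extra_field_length of 0xFFFD, the signed view gives 30 + 11 - 3 = 38, which is the "d" of the trailing "dex" in the file name, while the unsigned view gives 30 + 11 + 65533 = 65574.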
I also realized that you could jump backward into another entry's
extra_field. It would constrain your file sizes even more, but it would still be possible. Instead of only covering 0xFFFD, the bytecode was initially looking for any value that could be interpreted as negative in the dex entry's
extra_field_length.
After reading @saurik's blog post on the Extra Field vulnerability, I realized that this coverage needed to be expanded. My logic was, initially, that the file classes.dex (the executable code) was the only serious threat when replaced. In hindsight, this was a strange decision, since I had thought to cover any file for the Master Key vulnerability but only one file for Extra Field. There are a lot of files that could be dangerous when replaced.
The really mind-blowing thing that @saurik demonstrated was an almost complete replacement of the central directory. The entries in the central directory also have an extra_field. When its size is large enough to be interpreted as negative, the verifier will instead interpret it as zero. The usage of this vulnerability in the central directory pivots off the first entry in order to split the paths of the verifier and the loader. Each is then just directed to every other file entry using valid, specially crafted extra_field and comment lengths. This allows a near total replacement of a signed application's contents. This paragraph by no means does this bug technical justice. If you are interested, I highly suggest you visit the post linked above.
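Taking the description above at face value, the split can be sketched as two ways of computing where the next central directory entry starts. This is a simplified model in plain C with hypothetical function names, not a reproduction of Android's code or of @saurik's full technique.

```c
#include <stdint.h>

enum { CENTRAL_HEADER_SIZE = 46 };

/* Next entry offset when extra_len is used as an unsigned value
 * (the loader's view in this model). */
static long next_cd_entry_loader(long off, uint16_t name_len,
                                 uint16_t extra_len, uint16_t comment_len) {
    return off + CENTRAL_HEADER_SIZE + name_len + (long)extra_len + comment_len;
}

/* Next entry offset when a "negative" extra_len ends up counted as zero
 * (the verifier's view in this model, as described in the text). */
static long next_cd_entry_verifier(long off, uint16_t name_len,
                                   uint16_t extra_len, uint16_t comment_len) {
    long extra = extra_len > 0x7FFF ? 0L : (long)extra_len;
    return off + CENTRAL_HEADER_SIZE + name_len + extra + comment_len;
}
```

With an 11-byte name, no comment and an extra_field length of 0x8000, the verifier model continues at offset + 57 while the loader model continues 32768 bytes further on, so each can be fed a different list of entries from that point.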
What does all this mean for coverage? It means we should look at every file entry in the zip file, as well as every entry in the central directory. The safest way to reach every entry in a Zip file is by reading the central directory and getting the offset from there. For this reason, coverage has been integrated into the loop checking for the
Master Key vulnerability.
// get the offset of the file header for this central dir entry
zip_entry_off = *(uint32_t *)&cd_header[42];
zip_entry_off = le32_to_host(zip_entry_off);
if(seek(zip_entry_off, SEEK_SET) < 0)
    return 0;
if(read(zip_header, 30) != 30)
    return 0;
// check the extra field size
extra_field_size = *(uint16_t *)&zip_header[28];
extra_field_size = le16_to_host(extra_field_size);
if(extra_field_size > 0x7FFF)
    foundVirus("Extra_Field");
// go back to where we were previously
if(seek(cd_entry_off + 46, SEEK_SET) < 0)
    return 0;
// check extra field size for the central directory entry
extra_field_size = *(uint16_t *)&cd_header[30];
extra_field_size = le16_to_host(extra_field_size);
if(extra_field_size > 0x7FFF)
    foundVirus("Extra_Field");
Once the code has read in the central directory header to the variable cd_header, it then retrieves the offset of that file entry in the Zip file. It seeks to that location and reads in the local file header to the variable zip_header. It decodes the extra_field_size safely using le16_to_host(). This function converts a 16-bit little-endian value to the equivalent in the host architecture's endianness. If the value is greater than 0x7FFF, that is, if it can be interpreted as a negative value, then we alert that the Extra Field vulnerability has been found. If not, we seek back to the central directory entry and do the same check for a negative extra_field value in the central directory.
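For readers who want to experiment outside ClamAV, the check can be approximated in plain C over a Zip image held in memory. This is a sketch, not the shipped signature: has_negative_extra_field, read_le16 and read_le32 are hypothetical names, and the caller is assumed to already know the central directory offset and entry count from the end_of_central_directory entry.

```c
#include <stddef.h>
#include <stdint.h>

static uint16_t read_le16(const uint8_t *p) {
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t read_le32(const uint8_t *p) {
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Walk `entries` central directory headers starting at cd_off.
 * Returns 1 if any local or central extra field length exceeds 0x7FFF,
 * 0 if none does, and -1 if the data is truncated or malformed. */
static int has_negative_extra_field(const uint8_t *zip, size_t len,
                                    size_t cd_off, unsigned entries) {
    size_t off = cd_off;

    for (unsigned i = 0; i < entries; i++) {
        if (off > len || len - off < 46)
            return -1;
        const uint8_t *cd = zip + off;
        if (read_le32(cd) != 0x02014b50u) /* PK\x01\x02 */
            return -1;

        uint16_t name_len    = read_le16(cd + 28);
        uint16_t cd_extra    = read_le16(cd + 30);
        uint16_t comment_len = read_le16(cd + 32);
        uint32_t local_off   = read_le32(cd + 42);

        /* Check the local file header this entry points at. */
        if (local_off > len || len - local_off < 30)
            return -1;
        if (read_le16(zip + local_off + 28) > 0x7FFF)
            return 1;

        /* Check the central directory entry itself. */
        if (cd_extra > 0x7FFF)
            return 1;

        /* Advance past the fixed header and its three variable fields. */
        off += 46u + name_len + cd_extra + comment_len;
    }
    return 0;
}
```

Reading the fields one byte at a time plays the role of le16_to_host() and le32_to_host() here, since it yields the same value regardless of host endianness.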
Examples
Following are some examples of the two vulnerabilities on VirusTotal.
MD5: 04EEF623255A7CEBD943435ACF237456 - The first central directory entry at offset 0x7A5F4 has a negative extra_field value (0x8000). Alternate central directory entries have been inserted into that space.
MD5: C9F4C62521C04B8ADD796A1D5CEE08B0 - This sample was the first usage of the Extra Field vulnerability spotted in the wild. It was detailed in our blog post here. It is interesting to see the variety in names used by different vendors.
MD5: D816596A70A7117346A2DFB6F8850E39 - This example of the Master Key vulnerability triggers because the file /res/drawable-xhdpi/icon.png has been inserted twice. While this is not a malicious exploitation of the Master Key vulnerability, it demonstrates how thorough coverage needs to be for this vulnerability.
MD5: DAA9C49A4645CE109B1E36DC6233DB07 - For this Master Key sample, it looks like someone took an already malicious APK and added an extra classes.dex file and a second AndroidManifest.xml file.