Vulnerabilities discovered by Aleksandar Nikolic. Blog post authored by Jaeson Schultz and Aleksandar Nikolic.

One of the most fundamental tasks performed by many software programs involves the reading, writing, and general processing of files. In today's highly networked environments, files and the programs that process them can be found just about everywhere: FTP transfers, HTTP form uploads, email attachments, et cetera.

Because computer users interact with files of so many different varieties on such a regular basis, Oracle Corporation has designed tools to assist programmers with writing software that will support these everyday tasks: Outside In Technology (OIT). From the OIT website: "Outside In Technology is a suite of software development kits (SDKs) that provides developers with a comprehensive solution to extract, normalize, scrub, convert and view the contents of 600 unstructured file formats."

In April, Talos blogged about one of the OIT-related arbitrary code execution bugs patched by Oracle. The impact of that vulnerability, plus these additional eighteen OIT bugs disclosed in this post, is severe because so many third-party products use Oracle's OIT to parse and transform files. A review of an OIT-related CERT advisory from January 2016 reveals a large list of third-party products, especially security and messaging-related products, that are affected. The list of products that, according to CERT, rely on Oracle's Outside In SDK includes:

Talos has not confirmed that each of the third-party products listed above is affected. We have, however, confirmed that some are running vulnerable OIT-related code. For example, if WebReady Document Viewing is enabled for Microsoft Exchange 2013 (and earlier), an attacker could exploit these vulnerabilities by sending a malicious email attachment to a victim who then opens the email using web preview.

Further, if Data Loss Prevention is enabled, the vulnerability can be triggered simply by sending an email with a malicious attachment outbound from the affected Exchange server. If Avira AntiVir for Exchange (v12.0.2775.0 & earlier) is in place, just sending or receiving a malicious email is sufficient, since this program will scan all inbound and outbound email. Additionally, multiple OIT vulnerabilities could conceivably be exploited in a chained fashion for a more effective approach. Talos therefore encourages users to follow up with these vendors directly for more information regarding the scope of the impact of these vulnerabilities.

Table of Contents

  1. PDF /Size Integer Overflow
  2. TIFF ExtraSamples Code Execution
  3. TIFF Photometric Interpretation Code Execution
  4. GIF ImageWidth Code Execution
  5. Gem_Text Code Execution
  6. PSI Image File Code Execution
  7. Word DggInfo Code Execution
  8. Mac Works Database VwStreamSection Code Execution
  9. Mac Word ContentAccess libvs_word+63AC Code Execution
  10. BMP Heap Buffer Overflow & Code Execution
  11. Mac Works VwStreamReadRecord Memory Corruption
  12. PDF /Kids Information Leakage
  13. PDF NULL Pointer Dereference Denial of Service
  14. PDF Recursion Stack Overflow Denial of Service
  15. PDF /FlateDecode /Colors Denial of Service
  16. PDF /Type /Xref Denial of Service
  17. PDF Xref Offset Denial of Service
  18. Mac Word ContentAccess libvs_word Denial of Service
  19. Conclusion

1. PDF /Size Integer Overflow
Talos-2016-0097 (CVE-2016-3575)
The trailer object gives the "location of the cross-reference table and of certain special objects within the body of the file". In it there are several fields like /ID, /Root, /Size and /Info. /Size holds the number of objects in the PDF.


Entries in a trailer dictionary (*denotes required entry)
A "large" /Size value will cause issues with the Oracle OIT PDF parser. Although Oracle's parser checks for integer overflow, it later shifts the result left by 4 bits (multiplying it by 16), negating any protection offered by the previous overflow checks.

.text:B74ECE59 mov     edi, eax    [1]
.text:B74ECE5B shl     edi, 4     [2]
.text:B74ECE5E mov     [esp+6BCh+s], edi 
.text:B74ECE61 call    _SYSNativeAlloc               [3]
.text:B74ECE66 mov     edx, [esp+6BCh+arg_10]   
.text:B74ECE6D mov     [edx+1D6Ch], eax   [4]
.text:B74ECE73 test    eax, eax
.text:B74ECE75 jz      loc_B7

At [1], the value in `eax` comes straight from the 32-bit rounded value from the /Size element. At [2], it is shifted left by four bits (multiplied by 16), which invalidates the integer overflow check that was done previously. A `malloc` wrapper is called at [3] and the returned pointer is saved at [4]. If a /Size value is chosen carefully, it can lead to an integer overflow at [2] in the first basic block such that a small value is passed to SYSNativeAlloc at [3]. The problem arises when, due to rounding, the heap allocator returns a pointer to a bigger heap chunk than requested.

For example, if the /Size value is specified to be 0x10000001, it will pass the check before allocation, but when shifted left by 4 bits, the 32-bit result becomes 0x10, making a small allocation. Depending on the underlying allocator, the actual size of the allocated chunk may be bigger. On Linux, the returned chunk will be 24 bytes long and the subsequent `memset` will only initialize the first 16 bytes. Because only the first 16 bytes of the buffer are initialized, the code will access memory that has not been initialized to zero. The leftover data present in this uninitialized memory can cause memory corruption, potentially leading to code execution.
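
The arithmetic above can be reproduced with a short sketch. The Python below models a 32-bit register with a mask. `alloc_size_unchecked` and `alloc_size_checked` are hypothetical names used for illustration only, not functions from the OIT SDK, and the 16-byte entry size is inferred from the `shl edi, 4` instruction.

```python
MASK32 = 0xFFFFFFFF  # model a 32-bit register


def alloc_size_unchecked(size_field: int) -> int:
    """Mimic `shl edi, 4`: multiply by 16 and keep only the low 32 bits."""
    return (size_field << 4) & MASK32


def alloc_size_checked(size_field: int) -> int:
    """Refuse any /Size whose byte count would not fit in 32 bits."""
    if size_field < 0 or size_field > (MASK32 >> 4):
        raise ValueError("/Size too large for a 16-byte-per-entry table")
    return size_field << 4


print(hex(alloc_size_unchecked(0x10000001)))  # 0x10
```

The checked variant validates the value against the largest count that survives the multiplication, so the check and the allocation agree on the same quantity.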

2. TIFF ExtraSamples Code Execution
Talos-2016-0103 (CVE-2016-3581)
TIFF files are also capable of triggering vulnerabilities that can lead to remote code execution. This vulnerability in the Oracle OIT SDK results from insufficient memory allocation on the heap when parsing TIFF files with the 'ExtraSamples' tag present in the Image File Directory (IFD). The ImageWidth, SamplesPerPixel, BitsPerSample, and ExtraSamples values are all standard for a TIFF file; however, the inclusion of ExtraSamples is key to triggering the vulnerability. Because the additional bits are not accounted for at allocation time, the ExtraSamples tag allows a potential heap-based overflow.
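
To make the size mismatch concrete, here is a minimal sketch of a row-size calculation that counts every sample, including extra ones. It is not the OIT code, whose exact computation is not described here. `row_bytes` is a hypothetical helper, and the sketch assumes chunky (interleaved) pixel data with one BitsPerSample entry per sample.

```python
def row_bytes(image_width: int, bits_per_sample: list) -> int:
    """Bytes needed for one row of chunky pixel data, rounded up to a whole byte."""
    if image_width <= 0 or not bits_per_sample:
        raise ValueError("invalid dimensions")
    bits_per_pixel = sum(bits_per_sample)  # includes any extra samples
    return (image_width * bits_per_pixel + 7) // 8


rgb_only = row_bytes(100, [8, 8, 8])       # colour samples only
with_extra = row_bytes(100, [8, 8, 8, 8])  # one extra sample included
print(with_extra - rgb_only)               # 100
```

A buffer sized from the colour samples alone would be 100 bytes too small for each row of this example image.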

3. TIFF Photometric Interpretation Code Execution
Talos-2016-0104 (CVE-2016-3582)
In 1992, the TIFF file format specification was updated, and extensions were added to accommodate new image types. Originally, TIFF files only supported four image types: Black & White, Grayscale, RGB, and Palette-Color. The updated TIFF specification included a new CMYK (color-separated) image type. To specify the TIFF image type, a field called "PhotometricInterpretation" is used. A TIFF file with "PhotometricInterpretation" set to 5 (CMYK/color-separated format) will cause the Oracle SDK to follow a different code path from the one taken for other settings. This alternative code path allows the ImageWidth value to be used in an unchecked allocation, which eventually creates a heap overflow.

4. GIF ImageWidth Code Execution
Talos-2016-0105 (CVE-2016-3583)
Besides PDF and TIFF, GIF files can also be a source of danger. The ImageWidth value should describe the absolute width of a given GIF, and should be smaller than the Logical Screen Width value present in the same file. This vulnerability in Oracle's Outside In SDK is triggered when parsing a GIF with an ImageWidth in an Image Descriptor block set to 0xFFFF. An ImageWidth set to 0xFFFF triggers an integer overflow, and leads to an unbounded memory write in two branches of the same function in libvs_gif.so.
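
One defensive check suggested by this description is to compare the Image Descriptor against the logical screen size before sizing any buffer. The following is a minimal sketch, not Oracle's fix. The field layout follows the GIF89a Image Descriptor, and `image_fits_screen` is a hypothetical helper name.

```python
import struct


def image_fits_screen(screen_w: int, screen_h: int, descriptor: bytes) -> bool:
    """Return True if a GIF Image Descriptor lies inside the logical screen.

    descriptor: the 10-byte block that starts with the 0x2C separator.
    """
    if len(descriptor) < 10 or descriptor[0] != 0x2C:
        return False
    # After the separator: left, top, width, height as little-endian 16-bit values.
    left, top, width, height = struct.unpack_from("<4H", descriptor, 1)
    # Python integers do not wrap, so these sums cannot overflow.
    return left + width <= screen_w and top + height <= screen_h


bad = bytes([0x2C]) + struct.pack("<4H", 0, 0, 0xFFFF, 1) + b"\x00"
print(image_fits_screen(640, 480, bad))  # False
```

In a language with fixed-width integers, the same comparison needs a wider type or an explicit overflow check so that the sum itself cannot wrap.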

5. Gem_Text Code Execution
Talos-2016-0162 (CVE-2016-3595)
GEM metafiles are files containing instructions for rendering pictures in the vector drawing program Gem Draw. An integer overflow vulnerability exists in the file parsing code of the Oracle Outside In Technology libim_gem2 library. While parsing Gem metafile data, an unchecked memory allocation is performed. As a result, a specially crafted Gem file can trigger an integer overflow, leading to multiple heap-based buffer overflows and, potentially, remote code execution.

6. PSI Image File Code Execution
Talos-2016-0161 (CVE-2016-3594)
A parsing vulnerability exists in Oracle's Outside In Technology libim_psi2 library. Specifically, there is an integer overflow which leads to an erroneous memory allocation, and subsequently a large-sized memory copy operation. While parsing a PSI image file, a 2 byte size field is read and sign extended. This value is then used in memory allocation and a subsequent `memmove` call. The read size value is increased by 8 before an area of memory is allocated, but the original size is used in the `memmove` call.
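
The mismatch between the two sizes can be illustrated with a small model. This is a sketch under stated assumptions rather than a reconstruction of the library code: it assumes a little-endian field and 32-bit size arithmetic, and `psi_sizes` is a hypothetical name.

```python
import struct

MASK32 = 0xFFFFFFFF  # model 32-bit size arithmetic


def psi_sizes(raw: bytes):
    """Return (allocation size, copy length) as 32-bit unsigned values."""
    (size,) = struct.unpack("<h", raw)  # signed 16-bit read models the sign extension
    alloc = (size + 8) & MASK32         # value handed to the allocator
    copy_len = size & MASK32            # value handed to memmove
    return alloc, copy_len


alloc, copy_len = psi_sizes(b"\xff\xff")
print(hex(alloc), hex(copy_len))  # 0x7 0xffffffff
```

With a field value of 0xFFFF, the sign-extended size is -1, so the allocation request wraps to 7 bytes while the copy length, viewed as unsigned, is close to 4 GB.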

7. Word DggInfo Code Execution
Talos-2016-0160 (CVE-2015-6014) *Fixed January 2016
While parsing a malformed OLE file with crafted DggInfo element contents, a vulnerability in the Escher drawing parsing library, libvs_eshr, can be triggered. When the ID of the first child of DggContainer is changed from 0xF006 (Dgg) to 0xF007 (BSE), this leads to parser confusion and, ultimately, a 4-byte value from the file is used as a pointer in a 'cmp' instruction. If the comparison fails, the same pointer is used in an indirect 'call' instruction, leading to arbitrary code execution.

8. Mac Works Database VwStreamSection Code Execution
Talos-2016-0159 (CVE-2016-3593)
When parsing a Mac Works Database document, memory is written in a loop using a counter whose upper value is read from a byte in the file. No size checks are performed after the arithmetic operations, resulting in an out-of-bounds memory write.

9. Mac Word ContentAccess libvs_word+63AC Code Execution
Talos-2016-0158 (CVE-2016-3592)
When parsing a Mac Word document, a single-byte value from the file is used as the starting value for a counter, which is then used in arithmetic operations for memory access. No size checks are performed after the arithmetic operations, resulting in an out-of-bounds 4-byte memory write.

10. BMP Heap Buffer Overflow & Code Execution
Talos-2016-0163 (CVE-2016-3596)
While parsing a specially crafted ICO file, an unchecked value specifying the bitmap width is used to calculate the size of a memory write operation. The compression method must be set to 0x01 (BI_RLE8). While reading the file, a piece of memory on the heap is effectively overwritten with zeros. The size of this overwrite is unchecked and comes straight from the bitmap width. This can overwrite heap data structures with NULL bytes. In the supplied test case, the out-of-bounds NULL byte write overwrites a function pointer, which leads to a crash. By carefully tweaking the size of the overwrite, a function pointer on the heap can be manipulated and arbitrary code execution achieved.

11. Mac Works VwStreamReadRecord Memory Corruption
Talos-2016-0157 (CVE-2016-3591)
When parsing a Mac Works Database document, memory is written in a loop that uses a counter in destination address calculations. No size checks are performed after the arithmetic operations, resulting in a partially controlled 2-byte overwrite.

The vulnerability is present in the `VwStreamReadRecord` function in the libvs_mwkd.so library (with image base at 0xB7F89000), starting in the following basic block:

.text:B7F8ACF6                 movzx   eax, [esp+3Ch+var_12]  
.text:B7F8ACFB                 mov     edx, [edi+31Ch]
.text:B7F8AD01                 mov     ecx, ebp
.text:B7F8AD03                 mov     [edx+eax], cl
.text:B7F8AD06                 movzx   eax, word ptr [esp+3Ch+var_10] [1]
.text:B7F8AD0B                 movzx   esi, [esp+3Ch+var_12]   [2]
.text:B7F8AD10                 mov     [edi+eax*2+298h], si   [3]
.text:B7F8AD18                 add     word ptr [esp+3Ch+var_10], 1
.text:B7F8AD1E                 add     esi, 1
.text:B7F8AD21                 mov     [esp+3Ch+var_12], si
.text:B7F8AD26                 cmp     bp, 0F9h
.text:B7F8AD2B                 ja      loc_B7F8AE1A
.text:B7F8AD31                 test    bp, bp
.text:B7F8AD34                 jz      loc_B7F8ADEB
.text:B7F8AD3A                 mov     [esp+3Ch+var_1A], 0
.text:B7F8AD41                 jmp     short loc_B7F8AD71

At [1] and [2], pre-calculated values of `eax` and `esi` are read from the stack and zero-extended. At [3], `eax` is used in the destination address calculation and the value of `si` is written there. The initial values of `eax` and `esi` are related, with `eax` serving as a counter. No bounds checking is in place, resulting in a possible 2-byte out-of-bounds overwrite.

A specially crafted file could be used to shift the to-be-freed pointer to an attacker controlled area which can then be used to subvert the `free()` and achieve code execution.

12. PDF /Kids Information Leakage
Talos-2016-0096 (CVE-2016-3574)
The pages of a PDF document are accessed through the page tree, which defines all the pages in a document. Each node in a page tree typically has entries for /Type, /Parent, /Kids, and /Count. The /Kids reference is intended to specify all the child elements directly accessible from the current node.

However, there is a vulnerability in the way the Oracle OIT PDF parser handles the /Kids reference. While parsing a PDF file with an object that contains a malformed /Kids reference, the value right after the /Kids element is interpreted as a string where an array of references should be located. This leads to the parser expecting a pointer where the string copied from the file is located, resulting in an arbitrary read access violation. In a properly formatted PDF file, an array of at least one reference must follow the /Kids element. The bug appears in libvs_pdf.so (with base address 0xB74BF000):

.text:B74E71DB mov     eax, [eax]    [1]
.text:B74E71DD mov     edi, [esp+5Ch+var_24]
.text:B74E71E1 mov     eax, [eax+edi*4]    [2]
.text:B74E71E4 mov     [esp+5Ch+var_4C], eax
.text:B74E71E8 mov     ecx, [esp+5Ch+var_34]
.text:B74E71EC mov     edx, [esp+5Ch+var_48]


At [1], `eax` points to the string copied from the file into the heap. The first four bytes of the string are used in the memory access calculation at [2] causing an arbitrary read access violation. If the value calculated at [2] ends up pointing to valid memory, the read will succeed at the controlled address. However, if the value after the /Kids element is a pure integer, a different code path is reached and the integer value is interpreted as a pointer resulting in a fully controlled arbitrary read at:

.text:B74E718A mov     eax, [esp+5Ch+var_18]
.text:B74E718E mov     eax, [eax]
.text:B74E7190 xor     edx, edx
.text:B74E7192 mov     edi, [eax+4]     [1]
.text:B74E7195 test    edi, edi
.text:B74E7197 jz      loc_B74E72A2
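
Both code paths come down to treating a parsed value as one type when it is another. A type check before use would reject these files. The following is a minimal sketch only: the parsed-value model (Python lists for PDF arrays, a `Ref` tuple for indirect references) and the `validate_kids` helper are assumptions made for illustration, not part of the OIT SDK.

```python
from typing import NamedTuple


class Ref(NamedTuple):
    """Indirect reference such as `3 0 R` (hypothetical model)."""
    obj_num: int
    gen_num: int


def validate_kids(value) -> list:
    """Accept /Kids only if it is a non-empty array of indirect references."""
    if not isinstance(value, list) or not value:
        raise ValueError("/Kids must be a non-empty array")
    for item in value:
        if not isinstance(item, Ref):
            raise ValueError("/Kids entries must be indirect references")
    return value


print(validate_kids([Ref(3, 0), Ref(7, 0)]))
```

A string or a bare integer after /Kids fails the first check, so neither value would ever be used as a pointer.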


13. PDF NULL Pointer Dereference Denial of Service
Talos-2016-0098 (CVE-2016-3576)
When parsing a specially crafted PDF document, a NULL pointer dereference occurs, leading to process termination. After the parser successfully decodes the /FlateDecode encoded stream data, it proceeds to execute the operators contained within. While executing a `Tj` operator on a piece of text contained in a stream, a memory structure, probably containing charset mappings, is referenced. No NULL pointer check is made, and since the structure is zero-initialized, this can result in a crash.

14. PDF Recursion Stack Overflow Denial of Service
Talos-2016-0099 (CVE-2016-3577)
The root of a PDF document's hierarchy is the catalog dictionary, located by means of the /Root entry in the Trailer object of the PDF file. The catalog dictionary must have the /Catalog type. While parsing a malformed PDF file that contains a reference to the /Root element and has a malformed or missing xref table, a recursive call to a function is made each time with the same parameters. This eventually leads to a crash due to process stack exhaustion.
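
A common guard against this class of bug is to remember which objects have already been visited and to cap the chain length. The sketch below shows that general technique. It is not the OIT implementation: `resolve`, the `("ref", n)` tuple encoding, and the depth limit of 64 are all assumptions chosen for illustration.

```python
def resolve(objects: dict, obj_num: int, max_depth: int = 64):
    """Follow a chain of references, refusing cycles and runaway depth.

    objects maps an object number to either a final value or ("ref", n).
    """
    seen = set()
    current = obj_num
    while True:
        if current in seen:
            raise ValueError(f"reference cycle at object {current}")
        if len(seen) >= max_depth:
            raise ValueError("reference chain too deep")
        seen.add(current)
        value = objects.get(current)
        if isinstance(value, tuple) and len(value) == 2 and value[0] == "ref":
            current = value[1]  # keep following the chain iteratively
            continue
        return value


print(resolve({1: ("ref", 2), 2: {"Type": "/Catalog"}}, 1))
```

Using a loop instead of recursion also means that a long but finite chain consumes heap memory for the `seen` set rather than stack frames.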

15. PDF /FlateDecode /Colors Denial of Service
Talos-2016-0100 (CVE-2016-3578)
While parsing a PDF file that contains a /FlateDecode encoded stream with /Predictor set to a value other than 1, a malformed value for /Colors causes a NULL pointer dereference in the libsc_ut.so library while de-initializing the decoder.

16. PDF /Type /Xref Denial of Service
Talos-2016-0101 (CVE-2016-3579)
When parsing a PDF file with an object containing a stream, a missing object type specification can lead to arbitrary pointer access. An ASCII integer value appearing after the /Type element is converted into a 32-bit integer and subsequently used as a pointer in a comparison operation. When the pointer is invalid, the process crashes.

17. PDF Xref Offset Denial of Service
Talos-2016-0102 (CVE-2016-3580)
A vulnerability in PDF parser of the OIT SDK exists that results in out of bounds heap memory access following an unchecked memory allocation operation under specific conditions.

In a PDF file, an xref table contains multiple rows, each containing three values (except for the first row, which specifies the first object being referenced and the number of objects). The first value represents the 10-digit offset into the file where the object is to be found. In a specially crafted PDF file, the OIT PDF parser uses the specified value as a parameter in a call to `realloc()`, which can fail. The return value is checked for errors, but the result of that check is subsequently ignored. The original numerical value is then used as an upper bound in a loop, where an out-of-bounds read occurs during process cleanup.

18. Mac Word ContentAccess libvs_word Denial of Service
Talos-2016-0156 (CVE-2016-3590)
When parsing a Mac Word document, a single-byte value from the file is used as the maximum value for a counter, which is then used in arithmetic operations for memory access. No size checks are performed after the arithmetic operations, resulting in an out-of-bounds memory access. The calculated memory address is used as the destination operand in an `or byte` instruction.

Conclusion
Over and over again, we see problems that arise when software uses untrusted data as input without properly validating it. Because not all software developers are experts in the multitude of file formats in existence, they are forced to rely on SDKs such as Oracle's OIT. The unfortunate reality, however, is that vulnerabilities found in an SDK used by third parties take additional time to patch: first the organization that maintains the SDK issues a fix, and some time later the third parties that use the SDK provide their customers with an update that includes the fix. This leaves a rather large window of time in which miscreants can exploit vulnerabilities in third-party products.