In this write-up, I’ll be presenting part of the Optional Header of the Microsoft Windows Portable Executable (PE) format. Why? Because knowledge of the PE format is extremely important for a malware analyst and reverse engineer, plus it’s just plain fun to learn about. The PE headers are often corrupted or otherwise tampered with by packers, compressors, and other “file protection” software in an attempt to thwart reverse engineering and analysis. To recognize and defeat such anti-reversing techniques on a Windows system, we must understand the PE file format in detail. Note that PEs are even used for .NET applications, because the PE is how the OS initially loads ALL executables, before the _CorExeMain call starts the .NET CIL code. If you see _CorExeMain among a PE’s imports, you are most likely looking at a .NET assembly.

When an executable is “double-clicked,” tapped, or otherwise activated on a Microsoft Windows system, many things happen in short order. One of the operating system’s chief responsibilities is memory management, so when a new program is launched, the OS uses a module called the Loader to read data from the executable file. It is at this time that the operating system assigns the memory the program needs to run. A program’s memory space is divided into regions such as the code or text section, where the programmer’s instructions are kept; the stack, which is used for local variables and function calls; the heap, which is used for dynamic memory allocation and larger chunks of data; and others. How does the Loader (which is part of the OS) know how much memory to assign to each region initially? The answer lies in the PE header. Part of the PE header is a data structure called the Optional Header:

typedef struct _IMAGE_OPTIONAL_HEADER {
  WORD                 Magic;
  BYTE                 MajorLinkerVersion;
  BYTE                 MinorLinkerVersion;
  DWORD                SizeOfCode;
  DWORD                SizeOfInitializedData;    /* <-- covered below */
  DWORD                SizeOfUninitializedData;  /* <-- covered below */
  DWORD                AddressOfEntryPoint;
  DWORD                BaseOfCode;
  DWORD                BaseOfData;
  DWORD                ImageBase;
  DWORD                SectionAlignment;         /* <-- covered below */
  DWORD                FileAlignment;            /* <-- covered below */
  WORD                 MajorOperatingSystemVersion;
  WORD                 MinorOperatingSystemVersion;
  WORD                 MajorImageVersion;
  WORD                 MinorImageVersion;
  WORD                 MajorSubsystemVersion;
  WORD                 MinorSubsystemVersion;
  DWORD                Win32VersionValue;
  DWORD                SizeOfImage;
  DWORD                SizeOfHeaders;
  DWORD                CheckSum;
  WORD                 Subsystem;
  WORD                 DllCharacteristics;
  DWORD                SizeOfStackReserve;       /* <-- covered below */
  DWORD                SizeOfStackCommit;        /* <-- covered below */
  DWORD                SizeOfHeapReserve;        /* <-- covered below */
  DWORD                SizeOfHeapCommit;         /* <-- covered below */
  DWORD                LoaderFlags;
  DWORD                NumberOfRvaAndSizes;
  IMAGE_DATA_DIRECTORY DataDirectory[IMAGE_NUMBEROF_DIRECTORY_ENTRIES];
} IMAGE_OPTIONAL_HEADER, *PIMAGE_OPTIONAL_HEADER;
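
Because every field in this 32-bit layout has a fixed size, each one sits at a fixed byte offset from the start of the Optional Header. The following is a minimal, portable sketch (no windows.h needed) that pulls the fields discussed in this post out of a raw byte buffer. The names opt_header_summary, summarize_optional_header32 and the OFF_* constants are my own illustrative choices, not part of any Windows API, and the sketch assumes a PE32 (Magic 0x10B) header rather than the 64-bit PE32+ variant.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Byte offsets inside a 32-bit (PE32) Optional Header, worked out from the
   field order and sizes in the structure above (BYTE = 1, WORD = 2, DWORD = 4). */
enum {
    OFF_MAGIC                      = 0,
    OFF_SIZE_OF_INITIALIZED_DATA   = 8,
    OFF_SIZE_OF_UNINITIALIZED_DATA = 12,
    OFF_SECTION_ALIGNMENT          = 32,
    OFF_FILE_ALIGNMENT             = 36,
    OFF_SIZE_OF_STACK_RESERVE      = 72,
    OFF_SIZE_OF_STACK_COMMIT       = 76,
    OFF_SIZE_OF_HEAP_RESERVE       = 80,
    OFF_SIZE_OF_HEAP_COMMIT        = 84,
    PE32_FIXED_SIZE                = 96  /* everything before DataDirectory */
};

#define PE32_MAGIC 0x010B  /* Magic value of a 32-bit Optional Header */

/* Hypothetical summary type holding only the fields this post discusses. */
typedef struct {
    uint32_t size_of_initialized_data;
    uint32_t size_of_uninitialized_data;
    uint32_t section_alignment;
    uint32_t file_alignment;
    uint32_t size_of_stack_reserve;
    uint32_t size_of_stack_commit;
    uint32_t size_of_heap_reserve;
    uint32_t size_of_heap_commit;
} opt_header_summary;

/* PE fields are little-endian; assembling them byte by byte keeps the code
   independent of host endianness and alignment. */
static uint16_t read_u16le(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t read_u32le(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Returns 0 on success, -1 if the buffer is too short or is not PE32. */
int summarize_optional_header32(const uint8_t *opt, size_t len,
                                opt_header_summary *out)
{
    assert(opt != NULL && out != NULL);
    if (len < PE32_FIXED_SIZE || read_u16le(opt + OFF_MAGIC) != PE32_MAGIC)
        return -1;

    out->size_of_initialized_data   = read_u32le(opt + OFF_SIZE_OF_INITIALIZED_DATA);
    out->size_of_uninitialized_data = read_u32le(opt + OFF_SIZE_OF_UNINITIALIZED_DATA);
    out->section_alignment          = read_u32le(opt + OFF_SECTION_ALIGNMENT);
    out->file_alignment             = read_u32le(opt + OFF_FILE_ALIGNMENT);
    out->size_of_stack_reserve      = read_u32le(opt + OFF_SIZE_OF_STACK_RESERVE);
    out->size_of_stack_commit       = read_u32le(opt + OFF_SIZE_OF_STACK_COMMIT);
    out->size_of_heap_reserve       = read_u32le(opt + OFF_SIZE_OF_HEAP_RESERVE);
    out->size_of_heap_commit        = read_u32le(opt + OFF_SIZE_OF_HEAP_COMMIT);
    return 0;
}
```

Reading at explicit offsets rather than casting the buffer to a struct is a deliberate choice here: a tampered or truncated header then fails the length check instead of causing an out-of-bounds read.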

Interestingly, the Optional Header is not actually optional for executable images; it is optional only for object files. I’ve highlighted several elements of the structure that I will go into detail about below:

SizeOfInitializedData

Used by the loader to set aside the memory pages for data that has been initialized by the programmer and/or compiler ahead of time. All initialized data is accounted for by this number, regardless of which section it is in. The loader reads each section’s IMAGE_SECTION_HEADER, whose DWORD Characteristics member carries flags that mark the section as containing initialized or uninitialized data. For example, 0x00000080 (IMAGE_SCN_CNT_UNINITIALIZED_DATA) marks uninitialized data, while 0x00000040 (IMAGE_SCN_CNT_INITIALIZED_DATA) marks initialized data.[1] The .rdata section, for instance, is included in this total because it holds read-only initialized data.
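
To make the flag test concrete, here is a small sketch that tallies section sizes by Characteristics flag. The section_info type, its size field, and total_size_with_flag are hypothetical names for illustration; a real IMAGE_SECTION_HEADER carries both VirtualSize and SizeOfRawData, and linkers do not all compute the header totals the same way, so treat the sum as an approximation to compare against rather than a guaranteed match.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Section content flags from the Characteristics member. */
#define SCN_CNT_CODE               0x00000020u
#define SCN_CNT_INITIALIZED_DATA   0x00000040u
#define SCN_CNT_UNINITIALIZED_DATA 0x00000080u

/* Hypothetical, simplified view of one section header. */
typedef struct {
    const char *name;
    uint32_t    size;             /* assumed: the size to tally for this section */
    uint32_t    characteristics;  /* flags copied from IMAGE_SECTION_HEADER */
} section_info;

/* Adds up the sizes of all sections that have the given flag set. */
uint32_t total_size_with_flag(const section_info *sections, size_t count,
                              uint32_t flag)
{
    uint32_t total = 0;
    assert(sections != NULL || count == 0);
    for (size_t i = 0; i < count; i++) {
        if (sections[i].characteristics & flag)
            total += sections[i].size;
    }
    return total;
}
```

Calling it once with SCN_CNT_INITIALIZED_DATA and once with SCN_CNT_UNINITIALIZED_DATA gives two numbers to hold up against SizeOfInitializedData and SizeOfUninitializedData; a large mismatch is worth a closer look.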

SizeOfUninitializedData

As you would have guessed, this tells the loader how much space to reserve for uninitialized data, which typically lives in the .bss section. From Wikipedia:

the BSS segment typically includes all uninitialized objects (both variables and constants) declared at file scope (i.e., outside any function) as well as uninitialized static local variables (local variables declared with the static keyword); static local constants must be initialized at declaration, however, as they do not have a separate declaration, and thus are typically not in the BSS section, though they may be implicitly or explicitly initialized to zero. An implementation may also assign statically-allocated variables and constants initialized with a value consisting solely of zero-valued bits to the BSS section.

SectionAlignment and FileAlignment

These two members are also interesting to us because they specify the granularity at which each section is laid out, or “aligned,” in memory and on disk. Put another way, they specify the increments of addresses at which each section starts in memory (SectionAlignment) or on disk (FileAlignment). For optimum efficiency, data needs to be stored at offsets that are multiples of the size of the storage unit used to index it. What does this mean? Well, on Windows, memory is arranged into “pages.” Just as we might arrange 150 chairs in a medium-sized lecture hall into rows of 20 chairs, RAM bytes (and virtual memory) are arranged into units at different levels. One such level is the page, which can vary in size but defaults to 4096 (0x1000) bytes on a Win NT system[2]. SectionAlignment defaults to this page size.

Similarly, when the sections’ bytes are stored on disk inside the PE file itself, they are stored in an organized fashion, split into segments of X number of bytes, where X is the FileAlignment. It makes sense that the beginning of a file section on disk would optimally be stored at the beginning of a FileAlignment multiple, which is the case. This defaults to 0x200 bytes in size, so it is quite a bit smaller than the default SectionAlignment size.

The above information is very useful when we need to search for a section either on disk or in memory, and also illustrates a way for a section to be corrupted or tampered with.
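
The arithmetic behind both alignments is the same round-up-to-a-multiple operation. Here is a minimal sketch, assuming the alignment is a nonzero power of two (true of the 0x1000 and 0x200 defaults mentioned above); align_up and is_aligned are illustrative helper names, not Windows functions.

```c
#include <assert.h>
#include <stdint.h>

/* Rounds value up to the next multiple of alignment.
   Assumes alignment is a nonzero power of two and that the result
   fits in 32 bits. */
uint32_t align_up(uint32_t value, uint32_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (value + alignment - 1) & ~(alignment - 1);
}

/* Returns nonzero if value already sits on an alignment boundary. */
int is_aligned(uint32_t value, uint32_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (value & (alignment - 1)) == 0;
}
```

For example, a section holding 0x1234 bytes of data occupies 0x1400 bytes on disk with a FileAlignment of 0x200, and 0x2000 bytes in memory with a SectionAlignment of 0x1000. A section whose file offset or virtual address fails is_aligned for the values declared in the header is a hint that the file has been hand-edited or processed by a packer.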

SizeOfStackReserve and SizeOfHeapReserve

These are the numbers of bytes to be set aside (reserved) by the OS for the stack and the heap, respectively, in virtual memory when the program loads. It’s important to note that physical RAM is not yet touched by these operations; the OS only marks off a range of virtual address space for memory management purposes.

SizeOfStackCommit and SizeOfHeapCommit

The number of bytes that are “committed” to virtual memory for the stack and the heap, respectively.

The obvious question here is “What’s the difference between a reserve and a commit in this context?” According to MSDN, neither operation touches physical RAM yet, and neither zeroes out any memory. However, a commit “guarantees that when [the memory is accessed] the contents will be zero.”[3] The important word here is accessed. So, a reserve just sections off a block of virtual memory for memory management purposes within the OS and nothing more. Even a commit doesn’t actually zero out any memory; it just tells the OS to be prepared to do so when the memory is finally accessed.

NOTE: The stack is both reserved and committed for the initial thread, whereas the heap is reserved and committed for the process as its default heap when the program is loaded.

So there you have it, this is how the stack and heap are allotted memory by the OS when a program first runs. More ammo for the good ole’ “What’s the difference between the stack and the heap?” question.

Let’s take a quick look at this structure in OllyDbg for a 32-bit binary by first opening a 32-bit executable and pressing ALT+M on the keyboard:

[Figure: Memory Map in OllyDbg showing PE Header]

Now we double-click the PE header shown here and scroll down past the DOS Header to find the actual PE Header, and our OptionalHeader:

[Figure: OptionalHeader in memory]

Characteristics is the final WORD of the FileHeader (not to be confused with the DWORD Characteristics member of IMAGE_SECTION_HEADER mentioned earlier). The FileHeader is in turn the member data structure that directly precedes our OptionalHeader inside of the PE Header itself, which is called IMAGE_NT_HEADERS. Phew. Lots and lots of headers.
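
The same walk from the DOS Header to the OptionalHeader can be done in code. This sketch relies on three layout facts: the DOS Header’s e_lfanew field at offset 0x3C holds the file offset of IMAGE_NT_HEADERS, IMAGE_NT_HEADERS begins with the 4-byte signature "PE\0\0", and the FileHeader that follows is 20 bytes long. The function name find_optional_header_offset is my own, and the bounds checks are deliberately conservative because packers love to put odd values in e_lfanew.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DOS_HEADER_SIZE   0x40  /* size of IMAGE_DOS_HEADER */
#define E_LFANEW_OFFSET   0x3C  /* where the offset to the PE Header is stored */
#define PE_SIGNATURE_SIZE 4     /* "PE\0\0" */
#define FILE_HEADER_SIZE  20    /* size of IMAGE_FILE_HEADER */

/* Returns the file offset of the Optional Header, or -1 if the buffer
   does not look like a PE file. */
int64_t find_optional_header_offset(const uint8_t *file, size_t len)
{
    uint32_t e_lfanew;

    assert(file != NULL);
    if (len < DOS_HEADER_SIZE || file[0] != 'M' || file[1] != 'Z')
        return -1;

    /* e_lfanew is a little-endian DWORD. */
    e_lfanew = (uint32_t)file[E_LFANEW_OFFSET] |
               ((uint32_t)file[E_LFANEW_OFFSET + 1] << 8) |
               ((uint32_t)file[E_LFANEW_OFFSET + 2] << 16) |
               ((uint32_t)file[E_LFANEW_OFFSET + 3] << 24);

    /* The signature and FileHeader must fit inside the buffer. */
    if (e_lfanew > len || len - e_lfanew < PE_SIGNATURE_SIZE + FILE_HEADER_SIZE)
        return -1;
    if (memcmp(file + e_lfanew, "PE\0\0", PE_SIGNATURE_SIZE) != 0)
        return -1;

    return (int64_t)e_lfanew + PE_SIGNATURE_SIZE + FILE_HEADER_SIZE;
}
```

The returned offset is where the Magic field of the Optional Header lives, so it is the natural starting point for reading any of the members covered in this post.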

I have been partaking in extensive study and experimentation with the Portable Executable format for the reasons mentioned at the top of this post and will continue to write articles explaining other sections of this format as well as ways in which malware authors corrupt it. Stay tuned and thanks for reading.

Bibliography

[1]
M. Pietrek, “Peering Inside the PE: A Tour of the Win32 Portable Executable File Format,” Microsoft Developer Network, Mar-1994. [Online]. Available: https://msdn.microsoft.com/en-us/library/ms809762.aspx. [Accessed: 10-Mar-2017]
[2]
“RAM, virtual memory, pagefile, and memory management in Windows,” Microsoft Support, 21-Dec-2010. [Online]. Available: https://support.microsoft.com/en-us/help/2160852/ram,-virtual-memory,-pagefile,-and-memory-management-in-windows. [Accessed: 11-Mar-2017]
[3]
“VirtualAlloc function (Windows),” MSDN. [Online]. Available: https://msdn.microsoft.com/en-us/library/aa366887(v=vs.85).aspx. [Accessed: 11-Mar-2017]