PE File Structure

The PE (Portable Executable) format is a data structure that tells the Windows OS loader what it needs to know in order to manage the wrapped executable code.

This includes dynamic library references for linking, API export and import tables, resource management data, and thread-local storage (TLS) data. The data structures on disk are the same ones used in memory, so knowing how to find something in a PE file helps when analyzing Windows malware samples.

DOS Header.

The DOS header occupies the first 64 bytes of the file. It is there so that DOS can recognize the file as a valid executable and run the DOS stub.

struct _IMAGE_DOS_HEADER{
    0x00 WORD e_magic;    //Magic DOS signature
    0x02 WORD e_cblp;     //Bytes on last page of file
    0x04 WORD e_cp;       //Pages in file
    0x06 WORD e_crlc;     //Relocations
    0x08 WORD e_cparhdr;  //Size of header in paragraphs
    0x0A WORD e_minalloc; //Minimum extra paragraphs needed
    0x0C WORD e_maxalloc; //Maximum extra paragraphs needed
    0x0E WORD e_ss;       //Initial (relative) SS value
    0x10 WORD e_sp;       //Initial SP value
    0x12 WORD e_csum;     //Checksum
    0x14 WORD e_ip;       //Initial IP value
    0x16 WORD e_cs;       //Initial (relative) CS value
    0x18 WORD e_lfarlc;   //File address of relocation table
    0x1A WORD e_ovno;     //Overlay number
    0x1C WORD e_res[4];   //Reserved words
    0x24 WORD e_oemid;    //OEM identifier(for e_oeminfo)
    0x26 WORD e_oeminfo;  //OEM information;e_oemid specific
    0x28 WORD e_res2[10]; //Reserved words
    0x3C DWORD e_lfanew;  //Offset to start of PE header
};

As we can see, the DOS header is made up of a list of fields. We will not discuss all of them, as that is beyond our scope; we will cover only the two important ones, e_magic and e_lfanew.

e_magic: The DOS signature, used as the first check of whether a file is a PE file. A valid file starts with the bytes 4D 5A ("MZ").

e_lfanew: Offset relative to the beginning of the file, used to find the PE header.

As shown in the above figure, the e_magic value is 4D 5A (MZ) and e_lfanew is 0x00000108 (the file offset of the PE header).
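The two fields can be read with a few lines of Python. This is a minimal sketch using only the standard `struct` module; the helper name `read_dos_header` is hypothetical, and error handling is kept deliberately small.

```python
import struct

DOS_HEADER_SIZE = 64   # the DOS header occupies the first 64 bytes
MZ_MAGIC = 0x5A4D      # bytes 4D 5A ("MZ") read as a little-endian WORD


def read_dos_header(data: bytes) -> int:
    """Return e_lfanew, the file offset of the PE header.

    Hypothetical helper for illustration; raises ValueError on bad input.
    """
    if len(data) < DOS_HEADER_SIZE:
        raise ValueError("file is smaller than a DOS header")
    # e_magic is the WORD at offset 0x00.
    (e_magic,) = struct.unpack_from("<H", data, 0x00)
    if e_magic != MZ_MAGIC:
        raise ValueError("missing MZ signature")
    # e_lfanew is the DWORD at offset 0x3C.
    (e_lfanew,) = struct.unpack_from("<I", data, 0x3C)
    return e_lfanew
```

Called on the raw bytes of the file from the figure, this should return 0x108.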

DOS Stub.

The stub is a tiny program that runs when the file is executed under MS-DOS. It prints the message “This program cannot be run in DOS mode” and exits, because DOS cannot load the PE image that follows. Windows itself skips the stub and goes straight to the PE header.

PE File Header.

The PE header is located through the e_lfanew field of the MS-DOS header, which gives the file offset of the PE header.

The main PE header is a structure of type IMAGE_NT_HEADERS. It contains the PE signature, IMAGE_FILE_HEADER, and IMAGE_OPTIONAL_HEADER.

IMAGE_NT_HEADERS 

struct _IMAGE_NT_HEADERS{
    0x00 DWORD Signature;                       //PE file signature 50 45 00 00 ("PE" followed by two zero bytes)
    0x04 _IMAGE_FILE_HEADER FileHeader;         //standard PE header
    0x18 _IMAGE_OPTIONAL_HEADER OptionalHeader; //Optional Header
};

Signature is 50 45 00 00 (PE)
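Checking the signature is a matter of following e_lfanew and comparing four bytes. A minimal sketch, assuming the whole file has been read into memory; `has_pe_signature` is a hypothetical name:

```python
import struct

PE_SIGNATURE = b"PE\x00\x00"   # bytes 50 45 00 00


def has_pe_signature(data: bytes) -> bool:
    """Return True if data starts with MZ and e_lfanew points at "PE\\0\\0"."""
    if len(data) < 0x40 or data[:2] != b"MZ":
        return False
    # e_lfanew (DWORD at 0x3C) is the file offset of IMAGE_NT_HEADERS.
    (e_lfanew,) = struct.unpack_from("<I", data, 0x3C)
    # An out-of-range offset yields a short slice, which compares unequal.
    return data[e_lfanew:e_lfanew + 4] == PE_SIGNATURE
```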

Standard PE header (_IMAGE_FILE_HEADER)

struct _IMAGE_FILE_HEADER{
    0x00 WORD Machine;               //Target CPU: 0x0: any platform, 0x14C: Intel i386 and later, 0x8664: x64
    0x02 WORD NumberOfSections;      //The number of sections in the PE file
    0x04 DWORD TimeDateStamp;        //Timestamp: seconds elapsed between 1970-01-01 00:00:00 UTC and the time the linker generated this file
    0x08 DWORD PointerToSymbolTable; //File offset of the COFF symbol table. Only useful for COFF debugging information
    0x0c DWORD NumberOfSymbols;      //The number of symbols in the COFF symbol table. This value and the previous one are normally 0 in release builds
    0x10 WORD SizeOfOptionalHeader;  //Size of the IMAGE_OPTIONAL_HEADER structure in bytes: E0H by default for 32-bit, F0H for 64-bit (can be modified)
    0x12 WORD Characteristics;       //File attribute flags, e.g.:
                                     //Single attribute (only one bit set): #define IMAGE_FILE_DLL 0x2000 //File is a DLL.
                                     //Combined attributes (bitwise OR of single attributes): 0x010F, an executable file
};

The Standard PE header is the 20 bytes that follow the PE signature and contains only the most basic information about the layout of the file.
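Because the header is a fixed 20 bytes, it can be unpacked in a single call. The sketch below assumes `pe_offset` is the e_lfanew value; `FileHeader`, `parse_file_header`, `link_time` and `is_dll` are hypothetical names chosen for illustration.

```python
import struct
from datetime import datetime, timezone
from typing import NamedTuple

IMAGE_FILE_DLL = 0x2000   # Characteristics bit: file is a DLL


class FileHeader(NamedTuple):
    machine: int
    number_of_sections: int
    time_date_stamp: int
    pointer_to_symbol_table: int
    number_of_symbols: int
    size_of_optional_header: int
    characteristics: int


def parse_file_header(data: bytes, pe_offset: int) -> FileHeader:
    """Unpack IMAGE_FILE_HEADER, which starts right after the 4-byte signature."""
    # WORD, WORD, DWORD, DWORD, DWORD, WORD, WORD = 20 bytes, little-endian.
    return FileHeader(*struct.unpack_from("<HHIIIHH", data, pe_offset + 4))


def link_time(header: FileHeader) -> datetime:
    """Interpret TimeDateStamp as seconds since 1970-01-01 00:00:00 UTC."""
    return datetime.fromtimestamp(header.time_date_stamp, tz=timezone.utc)


def is_dll(header: FileHeader) -> bool:
    """Test a single attribute bit in the combined Characteristics value."""
    return bool(header.characteristics & IMAGE_FILE_DLL)
```

Keep in mind that the timestamp is written by the linker and can be set to any value, so it is a hint rather than proof of when a sample was built.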

Optional PE header (_IMAGE_OPTIONAL_HEADER)

//Layout shown is PE32 (32-bit). In PE32+ (64-bit) BaseOfData is absent, ImageBase is 8 bytes at 0x18,
//and the four stack/heap sizes are 8 bytes each, so the offsets after SizeOfStackReserve shift.
struct _IMAGE_OPTIONAL_HEADER{
    0x00 WORD Magic;                       //Magic number: 0x0107: ROM image, 0x010B: 32-bit PE (PE32), 0x020B: 64-bit PE (PE32+)
    0x02 BYTE MajorLinkerVersion;          //Linker major version number
    0x03 BYTE MinorLinkerVersion;          //Linker minor version number
    0x04 DWORD SizeOfCode;                 //The total size of all code sections, normally a multiple of FileAlignment; informational only
    0x08 DWORD SizeOfInitializedData;      //The size of the initialized data, normally a multiple of FileAlignment; informational only
    0x0c DWORD SizeOfUninitializedData;    //The size of the uninitialized data, normally a multiple of FileAlignment; informational only
    0x10 DWORD AddressOfEntryPoint;        //The program entry point, given as an RVA (Relative Virtual Address); usually falls in the .text section. Applies to both DLLs and EXEs
    0x14 DWORD BaseOfCode;                 //RVA of the start of the code section (not necessarily the entry point)
    0x18 DWORD BaseOfData;                 //RVA of the start of the data section (PE32 only)
    0x1c DWORD ImageBase;                  //Preferred load address of the image in memory; the linker default is 0x00400000 for a 32-bit EXE
    0x20 DWORD SectionAlignment;           //Memory alignment: once mapped to memory, each section starts at a virtual address that is a multiple of this value
    0x24 DWORD FileAlignment;              //File alignment: raw section data starts at a file offset that is a multiple of this value; commonly 200H, sometimes 1000H
    0x28 WORD MajorOperatingSystemVersion; //The required operating system major version number
    0x2a WORD MinorOperatingSystemVersion; //The required operating system minor version number
    0x2c WORD MajorImageVersion;           //Custom major version number, set through a linker option, e.g.: LINK /VERSION:2.0 myobj.obj
    0x2e WORD MinorImageVersion;           //Custom minor version number, set through a linker option
    0x30 WORD MajorSubsystemVersion;       //The required subsystem major version number, typical value 4.0 (Windows 4.0, that is, Windows 95)
    0x32 WORD MinorSubsystemVersion;       //The required subsystem minor version number
    0x34 DWORD Win32VersionValue;          //Reserved, must be 0
    0x38 DWORD SizeOfImage;                //The total size of the PE image in memory, a multiple of SectionAlignment
    0x3c DWORD SizeOfHeaders;              //DOS header (64B) + DOS stub + PE signature (4B) + standard PE header (20B) + optional PE header + section table, rounded up to a multiple of FileAlignment
    0x40 DWORD CheckSum;                   //Image checksum (not a CRC); validated at load time for drivers and some system DLLs
    0x44 WORD Subsystem;                   //Subsystem required to run the image, e.g. 2: Windows GUI, 3: Windows console
    0x46 WORD DllCharacteristics;          //Flags such as 0x0040 (DYNAMIC_BASE, ASLR) and 0x0100 (NX_COMPAT, DEP); used by EXEs as well as DLLs
    0x48 DWORD SizeOfStackReserve;         //The size of stack to reserve for the initial thread
    0x4c DWORD SizeOfStackCommit;          //The size of stack actually committed at initialization
    0x50 DWORD SizeOfHeapReserve;          //The virtual memory size reserved for the default process heap
    0x54 DWORD SizeOfHeapCommit;           //The size of process heap actually committed at initialization
    0x58 DWORD LoaderFlags;                //Reserved, must be 0
    0x5c DWORD NumberOfRvaAndSizes;        //Number of data directory entries that follow: normally 0x10 (16)
    0x60 _IMAGE_DATA_DIRECTORY DataDirectory[IMAGE_NUMBEROF_DIRECTORY_ENTRIES]; //define IMAGE_NUMBEROF_DIRECTORY_ENTRIES 16
};

In the example, the first member (Magic, 2 bytes) holds the magic number 020B, which means that the file is a 64-bit PE.

The optional PE header follows the standard PE header. Its size is given by SizeOfOptionalHeader and defaults to E0H bytes for 32-bit files and F0H bytes for 64-bit files. The optional header contains most of the meaningful information about the executable image, such as the initial stack size, program entry point location, preferred base address, operating system version, and section alignment information.
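The Magic value decides how the rest of the optional header is laid out, so a parser has to branch on it before reading ImageBase. The following is a rough sketch rather than a full parser; `read_optional_header_basics` is a hypothetical helper, and `pe_offset` is assumed to be the e_lfanew value.

```python
import struct

PE32_MAGIC = 0x010B        # 32-bit PE
PE32_PLUS_MAGIC = 0x020B   # 64-bit PE


def read_optional_header_basics(data: bytes, pe_offset: int) -> dict:
    """Read Magic, AddressOfEntryPoint and ImageBase from the optional header."""
    # The optional header follows the 4-byte signature and 20-byte file header.
    opt = pe_offset + 4 + 20
    (magic,) = struct.unpack_from("<H", data, opt)
    # AddressOfEntryPoint sits at 0x10 in both layouts.
    (entry_rva,) = struct.unpack_from("<I", data, opt + 0x10)
    if magic == PE32_MAGIC:
        # PE32: BaseOfData at 0x18, then a 4-byte ImageBase at 0x1C.
        (image_base,) = struct.unpack_from("<I", data, opt + 0x1C)
    elif magic == PE32_PLUS_MAGIC:
        # PE32+: no BaseOfData; ImageBase is an 8-byte value at 0x18.
        (image_base,) = struct.unpack_from("<Q", data, opt + 0x18)
    else:
        raise ValueError(f"unsupported optional header magic: {magic:#06x}")
    return {
        "is_64bit": magic == PE32_PLUS_MAGIC,
        "entry_point_rva": entry_rva,
        "image_base": image_base,
    }
```

Adding `entry_point_rva` to `image_base` gives the virtual address of the entry point when the image is loaded at its preferred base.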

Data Directories (_IMAGE_DATA_DIRECTORY)

The data directory is the last member of the optional header. It indicates where to find other important components of the executable in the file. It is nothing more than an array of IMAGE_DATA_DIRECTORY structures located at the end of the optional header structure. The PE file format defines 16 data directory slots; 15 are assigned and the last one is reserved. The first 11 are:

// Directory Entries

#define IMAGE_DIRECTORY_ENTRY_EXPORT         0  // Export Directory
#define IMAGE_DIRECTORY_ENTRY_IMPORT         1  // Import Directory
#define IMAGE_DIRECTORY_ENTRY_RESOURCE       2  // Resource Directory
#define IMAGE_DIRECTORY_ENTRY_EXCEPTION      3  // Exception Directory
#define IMAGE_DIRECTORY_ENTRY_SECURITY       4  // Security Directory
#define IMAGE_DIRECTORY_ENTRY_BASERELOC      5  // Base Relocation Table
#define IMAGE_DIRECTORY_ENTRY_DEBUG          6  // Debug Directory
#define IMAGE_DIRECTORY_ENTRY_COPYRIGHT      7  // Description String
#define IMAGE_DIRECTORY_ENTRY_GLOBALPTR      8  // Machine Value (MIPS GP)
#define IMAGE_DIRECTORY_ENTRY_TLS            9  // TLS Directory
#define IMAGE_DIRECTORY_ENTRY_LOAD_CONFIG    10 // Load Configuration Directory

Each data directory entry specifies the relative virtual address (RVA) and size of the directory. To locate a particular directory, read its RVA from the data directory array in the optional header.

struct _IMAGE_DATA_DIRECTORY{
    DWORD   VirtualAddress;
    DWORD   Size;
};
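Reading one entry from the array could look like the sketch below. It assumes `pe_offset` is the e_lfanew value; `read_data_directory` is a hypothetical helper. The array starts at 0x60 in a PE32 optional header and at 0x70 in a PE32+ one.

```python
import struct

IMAGE_DIRECTORY_ENTRY_IMPORT = 1


def read_data_directory(data: bytes, pe_offset: int, index: int) -> tuple:
    """Return (VirtualAddress, Size) for data directory entry `index`."""
    opt = pe_offset + 4 + 20   # skip signature and IMAGE_FILE_HEADER
    (magic,) = struct.unpack_from("<H", data, opt)
    if magic == 0x010B:        # PE32
        count_off, dir_off = 0x5C, 0x60
    elif magic == 0x020B:      # PE32+
        count_off, dir_off = 0x6C, 0x70
    else:
        raise ValueError("unsupported optional header magic")
    # NumberOfRvaAndSizes says how many entries are actually present.
    (count,) = struct.unpack_from("<I", data, opt + count_off)
    if not 0 <= index < count:
        raise IndexError("data directory index out of range")
    # Each IMAGE_DATA_DIRECTORY is two DWORDs: VirtualAddress, Size.
    rva, size = struct.unpack_from("<II", data, opt + dir_off + 8 * index)
    return rva, size
```

An entry whose VirtualAddress and Size are both 0 means the file does not have that directory.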

Then use the virtual address to determine which section the directory is in. Once you determine which section contains the directory, the section header for that section is then used to find the exact file offset location of the data directory.
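The RVA-to-file-offset step described above can be sketched as follows. The `Section` tuple and `rva_to_file_offset` are hypothetical names, and the range check (the larger of the virtual and raw sizes) is a simplifying assumption; real loaders also apply alignment rules that this sketch ignores.

```python
from typing import NamedTuple, Optional, Sequence


class Section(NamedTuple):
    name: str
    virtual_address: int   # RVA of the section once mapped in memory
    virtual_size: int
    raw_address: int       # file offset of the section's data on disk
    raw_size: int


def rva_to_file_offset(rva: int, sections: Sequence[Section]) -> Optional[int]:
    """Map an RVA to a file offset, or None if it has no bytes on disk."""
    for sec in sections:
        span = max(sec.virtual_size, sec.raw_size)
        if sec.virtual_address <= rva < sec.virtual_address + span:
            delta = rva - sec.virtual_address
            if delta >= sec.raw_size:
                # Inside the section in memory, but past the data stored on
                # disk (for example uninitialized data).
                return None
            return sec.raw_address + delta
    return None
```

For example, with a section at RVA 0x2000 whose raw data starts at file offset 0xA00, a directory at RVA 0x2010 is found at file offset 0xA10.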

Section Header Table

The Section Header Table is an array of IMAGE_SECTION_HEADER structures, one per section, that describes the sections in the image of an executable file. The sections in the image are sorted by their RVAs rather than alphabetically.

The Section Header Table contains the following important fields:

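  • Name (Name)

  • Virtual Size (VirtualSize)

  • Virtual Address (VirtualAddress)

  • Raw Size (SizeOfRawData)

  • Raw Address (PointerToRawData)

  • Reloc Address (PointerToRelocations)

  • Linenumbers Address (PointerToLinenumbers)

  • Relocations Number (NumberOfRelocations)

  • Linenumbers Number (NumberOfLinenumbers)

  • Characteristics (Characteristics)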
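Each IMAGE_SECTION_HEADER is 40 bytes and the table starts right after the optional header, so walking it is a short loop. A minimal sketch, assuming `pe_offset` is the e_lfanew value; `parse_section_headers` and the dictionary keys are hypothetical names:

```python
import struct

# Name[8], VirtualSize, VirtualAddress, SizeOfRawData, PointerToRawData,
# PointerToRelocations, PointerToLinenumbers, NumberOfRelocations,
# NumberOfLinenumbers, Characteristics = 40 bytes, little-endian.
SECTION_HEADER = struct.Struct("<8sIIIIIIHHI")


def parse_section_headers(data: bytes, pe_offset: int) -> list:
    """Return one dict per entry in the section header table."""
    # NumberOfSections and SizeOfOptionalHeader live in IMAGE_FILE_HEADER.
    (count,) = struct.unpack_from("<H", data, pe_offset + 4 + 0x02)
    (opt_size,) = struct.unpack_from("<H", data, pe_offset + 4 + 0x10)
    table = pe_offset + 4 + 20 + opt_size
    sections = []
    for i in range(count):
        f = SECTION_HEADER.unpack_from(data, table + i * SECTION_HEADER.size)
        sections.append({
            # The name is padded with zero bytes up to 8 characters.
            "name": f[0].rstrip(b"\x00").decode("ascii", errors="replace"),
            "virtual_size": f[1],
            "virtual_address": f[2],
            "raw_size": f[3],
            "raw_address": f[4],
            "characteristics": f[9],
        })
    return sections
```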

Sections

PE section headers also specify the section name in a simple character array field called Name. Below are the common section names found in an executable file:

  • .text: This is normally the first section and contains the executable code of the application. This section also usually holds the entry point of the application: the address of the first instruction that will be executed. An application can have more than one section with executable code.

  • .data: This section contains the initialized data of an application, such as strings.

  • .rdata or .idata: These section names are usually used for the sections where the import table is located. This is the table that lists the Windows API functions used by the application (along with the names of their associated DLLs). Using it, the Windows loader knows which API to look up in which system DLL in order to retrieve its address.

  • .reloc: contains relocation information.

  • .rsrc: This is the common name for the resource-container section, which contains things like images used for the application’s UI.

  • .debug: contains debug information.
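Section names are only a convention, so the Characteristics field of the section header is a more reliable way to tell what a section is for. The sketch below decodes the three memory-permission bits; the flag values are the IMAGE_SCN_* constants from the PE format, while `section_permissions` and `executable_sections` are hypothetical helper names.

```python
IMAGE_SCN_MEM_EXECUTE = 0x20000000
IMAGE_SCN_MEM_READ = 0x40000000
IMAGE_SCN_MEM_WRITE = 0x80000000


def section_permissions(characteristics: int) -> str:
    """Render the memory-permission bits as an "rwx"-style string."""
    r = "r" if characteristics & IMAGE_SCN_MEM_READ else "-"
    w = "w" if characteristics & IMAGE_SCN_MEM_WRITE else "-"
    x = "x" if characteristics & IMAGE_SCN_MEM_EXECUTE else "-"
    return r + w + x


def executable_sections(sections) -> list:
    """Return the names of all sections marked executable.

    `sections` is an iterable of (name, characteristics) pairs.
    """
    return [name for name, ch in sections if ch & IMAGE_SCN_MEM_EXECUTE]
```

A typical .text section has Characteristics 0x60000020, which this renders as "r-x".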

Overview, Important things to keep in mind.
