Program Files, Linking, and Loading

Programming starts with the creation of source code files. Most programming also involves two other important types of files: object files and executable files.

The formats for these file types are defined by an operating system. An operating system also provides support software, called loaders and linkers, for handling these file types. In modern operating systems, part of this work is done dynamically; that is, while the program is executing.

There are some executable files, not considered here, whose formats are not defined by the operating system. These files are handled by interpreters for languages such as Java, Perl, and Ruby.

File Types

The format of object and executable files depends on the operating system. Compilers and assemblers have to be adapted for different operating systems in order to generate output that conforms to the appropriate format.

On Microsoft Windows® platforms, object file names have a .obj suffix, and compiled and assembled executable file names have a .exe suffix.

On Unix-based platforms, object file names have a .o suffix. There is no conventional file name suffix for compiled and assembled executable files. A Unix operating system can, however, recognize these file types by special 4-byte codes, often called magic numbers, at the start of the file.
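As one concrete case of such a 4-byte code, files in the ELF format used by Linux and many other Unix systems begin with the byte 0x7F followed by the letters E, L, F. The following is a minimal sketch of how a program might test for that code; the helper names is_elf and file_is_elf are made up for illustration, and other Unix executable formats use different codes.

```python
ELF_MAGIC = b"\x7fELF"  # the 4-byte code at the start of every ELF file


def is_elf(header: bytes) -> bool:
    """Return True if the given leading bytes start with the ELF code."""
    return header[:4] == ELF_MAGIC


def file_is_elf(path: str) -> bool:
    """Read only the first 4 bytes of a file and test them."""
    with open(path, "rb") as f:
        return is_elf(f.read(4))
```

On a Linux system, calling file_is_elf on a compiled program such as /bin/ls would be expected to return True, while calling it on a text file would return False.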

A loader takes an executable file and copies its sections into memory. Then it creates a process control block to control program execution. Finally, it starts executing the code, usually by jumping to the program's entry point (its starting address).
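The three loader steps can be sketched with a toy model in which memory is a Python list and an executable is a set of sections plus an entry address. This is only an illustration under those assumptions: the class names Section, Executable, and ProcessControlBlock and the function load are hypothetical, and a real loader works with the operating system's actual file format and memory management.

```python
from dataclasses import dataclass


@dataclass
class Section:
    address: int  # where the section is to be placed in memory
    data: list    # the section's contents, one item per memory cell


@dataclass
class Executable:
    sections: list  # the Section objects that make up the program
    entry: int      # address of the first instruction to execute


@dataclass
class ProcessControlBlock:
    pid: int
    program_counter: int
    state: str = "ready"


def load(exe, memory, pid):
    """Copy each section into memory and return a PCB set to the entry address."""
    for section in exe.sections:
        end = section.address + len(section.data)
        if end > len(memory):
            raise MemoryError("section does not fit in memory")
        memory[section.address:end] = section.data
    # Starting execution is modeled by setting the program counter
    # to the entry address recorded in the executable.
    return ProcessControlBlock(pid=pid, program_counter=exe.entry)
```

Note that the loader never inspects the code it copies. Everything it needs (section addresses, sizes, and the entry address) has to be recorded in the executable itself.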

A loader must be able to

Relevant information must be included in the executable file format.

Some other needs have to be considered in the design of executable file formats:

The ability to break up the source code for a program into smaller code units, known as separate compilation, has two important advantages:

But separate compilation introduces two problems:

Supporting separate compilation requires operating system software to combine the code from multiple compilation steps. This software is called a link editor or, more simply, a linker.
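The core job of a linker can be sketched as two passes over a list of toy object files: the first pass assigns each defined symbol an address in the combined program, and the second replaces each symbolic reference with that address. The object file layout used here (a dict with "code" and "defines" keys) and the function name link are assumptions made for illustration, not a real object file format.

```python
def link(objects):
    """Combine toy object files into one code list with references resolved.

    Each object is a dict with:
      "code":    a list of instructions; a reference to another symbol
                 is written as a tuple such as ("call", "square")
      "defines": a dict mapping each symbol name defined in the object
                 to its offset within "code"
    """
    # Pass 1: place the objects one after another and record the
    # absolute address of every symbol they define.
    symbols = {}
    base = 0
    for obj in objects:
        for name, offset in obj["defines"].items():
            if name in symbols:
                raise ValueError(f"duplicate symbol: {name}")
            symbols[name] = base + offset
        base += len(obj["code"])

    # Pass 2: copy the code, replacing each symbolic reference
    # with the address found in pass 1.
    image = []
    for obj in objects:
        for instr in obj["code"]:
            if isinstance(instr, tuple):
                op, name = instr
                if name not in symbols:
                    raise ValueError(f"undefined symbol: {name}")
                image.append((op, symbols[name]))
            else:
                image.append(instr)
    return image, symbols
```

In this model, linking an object that calls square with one that defines square turns the reference ("call", "square") into ("call", address). Linking the first object alone fails with an undefined symbol error, which is the kind of error a linker reports when a needed object file is left out.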

The object files are the result of compiling single source code files.

As is often the case in computer science, the static/dynamic distinction separates what is done before program execution (static) from what is done during program execution (dynamic).

In keeping with this common terminology, the linking and loading described earlier is called static linking and loading. Dynamic linking and loading refers to linking and loading done during program execution. Modern operating systems typically use dynamic linking and loading for programming language library functions.

Dynamic linking and loading has three important benefits:

A jump table implementation of dynamic linking and loading is lazy: it defers the loading and linking of each subprogram until the subprogram is first needed. However, the loading and linking is done only once per subprogram. After it is loaded and linked, a subprogram can be called again as many times as needed with negligible overhead.
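The lazy behavior can be sketched with a toy jump table whose entries start out as stubs. The first call through an entry loads the subprogram and overwrites the entry, so later calls reach the subprogram directly. The class name JumpTable, its loads counter, and the use of zero-argument functions to stand in for loading from a library file are all assumptions made for illustration.

```python
class JumpTable:
    """Toy jump table: each entry starts as a stub that loads on first call."""

    def __init__(self, library):
        # library maps a subprogram name to a zero-argument function that
        # returns the real subprogram; it stands in for loading from disk.
        self._library = library
        self._entries = {}
        self.loads = 0  # how many load-and-link steps have been performed
        for name in library:
            self._entries[name] = self._make_stub(name)

    def _make_stub(self, name):
        def stub(*args):
            real = self._library[name]()  # load and link the subprogram
            self.loads += 1
            self._entries[name] = real    # overwrite the table entry
            return real(*args)
        return stub

    def call(self, name, *args):
        # Every call goes through the table, so once an entry has been
        # overwritten the stub is never run again for that subprogram.
        return self._entries[name](*args)
```

A subprogram that is never called is never loaded, and one that is called many times is loaded only once; the loads counter makes both properties easy to observe.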

Multiple dynamically linked subprograms are typically gathered into library files, called dynamic link libraries on Windows and shared objects or shared libraries on Unix-based systems. These files typically use a .dll suffix in the Windows operating system and a .so suffix on Unix-based operating systems.