Implementing Objects

The implementation of objects must adhere to the principles of object-oriented methodology.

The most important aspect of implementing objects is providing a mechanism for dynamically selecting the method to implement a message response. This mechanism is called a dispatch mechanism. This mechanism depends on the type of the language. It has a significant impact on inheritance and determines how the language can be used.

Object-oriented languages (OOLs) fall into two broad categories:

Class-based OOLs such as Smalltalk, C++, Java, Ruby, Scala, and Eiffel.
In a class-based OOL objects are instances of and created by classes.
Classless OOLs such as Self and Haskel.
In a classless OOL objects serve as prototypes for other objects; objects are cloned from their prototypes.

Like many distinctions there is no true dichotomy here - the two categories are extremes of a spectrum. The JavaScript programming language, for example, does not fit cleanly into either category. It lies somewhere in between.

For class-based languages there is another choice. The variables in the language may be typed or they may be untyped.

Typing

Some class-based OOLs (C++, Java, Scala, and Eiffel) require that all variables be typed. Others (Smalltalk and Ruby) do not. Typing variables does not make sense in classless OOLs.

Typing is a compile-time concept. It allows the compiler to make some checks for correctness that would otherwise not be detected until a program is executed. It also allows the compiler to perform member location lookup at compile time rather than at run time.

Untyped OOLs are often used for prototyping an application. Typed OOLs are usually preferred for production applications because the compile-time error checks precisely locate errors. If these errors are not caught until run time, their cause can be very difficult to determine.

There are two important aspects of encapsulation:

Access restriction - preventing one object from accessing another's internal state, for example.
Namespaces/scopes - allowing the same name to have different meanings in different contexts.

Encapsulation mechanisms are essential for reducing couplings between software components. Many encapsulation mechanisms originated with non-object-oriented languages. Object-oriented languages add additional encapsulation mechanisms.

Object-Oriented Encapsulation

Objects are a fundamental encapsulation mechanism of object-oriented languages (OOLs).

objects - encapsulate data together with operations, called methods that process or act upon the data.

OOLs may provide additional encapsulation mechanisms.

classes - create objects that all have the same members and the same implementation of methods.
packages - group together classes that work closely with each other.
nested objects and classes - allow objects or classes to access private members of the objects or classes in which they are nested.

Inheritance

There are two types of inheritance in OOLs

interface inheritance and
implementation inheritance.

Interface inheritance is only necessary in typed OOLs. It provides the compiler with information that it need to check that objects have the methods required for messages that they receive.

Implementation inheritance mechanisms depend on the type of OOL.

For class-based OOLs, classes inherit from classes.
For classless OOLs, objects inherit from objects.

Polymorphism

Polymorphism refers to the ability of different objects to respond to the same message in different ways. Polymorphism is essential for modeling our world including our social environment. We frequently use the same phrases and sentences to mean different things in different contexts. Often there is an abstract sameness, but concrete differences.

For example, say the following sentence to two different architects and you will likely get two houses: "Build me a house". Abstractly, both architects will do the same thing but many of the details will differ. Polymorphism is the primary objective of a dispatch mechanism. However the dispatch mechanism also needs to be designed to fit the inheritance of the OOL.

Polymorphism and Alternation

Alternation is one of the early principles of programming: sequence, alternation, and repetition. A program consists of instructions that are executed sequentially (sequence) except for language constructions that allow different sequences of instructions based on conditions (alternation) and constructions that repeat sequences of instruction (repetition). Most non-object-oriented programming language provide if, if..else and case or switch statement forms as alternation constructions.

In object-oriented languages there is another alternation construction: send a message to a variable. You get different code to execute by assigning a new object to the variable. The different objects that can be assigned to a variable can respond to the message in different ways. This idea is best illustrated by design pattern for dynamic behavior.

Dispatch

Dispatch refers to the mechanism for selecting the code to be executed to respond to a message.

Parametric Polymorphism and Multiple Dispatch

The easiest form of polymorphism to implement makes method dispatch dependent only on the receiver of a message. There are some OOLs that support parametric polymorphism: dispatch also depends on parameters of the message. The dispatch mechanism is then called multiple dispatch. For languages that do not directly support multiple dispatch the techniques of the Visitor design pattern can be used to achieve the same effect.

Dispatch refers to the mechanism for selecting the code to be executed to respond to a message. This mechanism is designed to make method invocation polymorphic. It also affects how inheritance works.

When methods are compiled or interpreted, they are treated as subprograms whose first parameter is the receiving object. When a message is compiled or interpreted the receiver is supplied as the first argument. For example, for the message

      anObject.aMethod();

a compiler first generates code that locates the entry address of the machine language subprogram for the method named "aMethod". The compiler then generates code that invokes that subprogram with a reference to anObject passed as the first argument. An interpreter handles the message similarly except that it executes code rather than generating code to be executed later.

Compilers also must be able to locate instance variables. For example, consider the expression

      anObject.aVariable;

The most important issues are

What kind of mechanisms are used to locate method entry addresses and variables?
Where are method entry addresses and variables?

The answers to these questions depends on the language type.

Member Location Mechanisms

There are three important issues in dispatch: location, location, location. anonymous

There are two kinds of mechanisms used for locating object members.

positional
The member is located by specifying its position, either by an index into an array or by a displacement from the beginning of a struct. This is a fast location mechanism, but the compiler must know how the members are ordered in the array or struct.
table search by name
The members are stored in a table that is keyed by member names. The code generated by a compiler searches the table.

Classless Dispatch and Inheritance

Dispatch for classless OOLs is usually as suggested by the diagram below.

Classless OOLs are inherently untyped. When a compiler encounters a message it knows the variable that it is sent to but it knows nothing about the object referenced by that variable. The compiler must generate code for the message that works for any object that has an appropriately named method. Consequently classless OOLs must implement dispatch with lookup tables.

The implementation of dispatch behaves as if all members of anObject are copied from itsPrototype when anObject is created. However, the actual implementation may be different due to space-time tradeoffs. One possibility is that there are no copies made when anObject is created - member fetches are forwarded to itsPrototype, as in the Chain of Responsibility design pattern. Another possibility is that a member is copied into anObject's member table after the first forwarded fetch from itsPrototype.

In any implementation, when a new value is assigned to a member of anObject the new value is written to anObject's member table. This is necessary to ensure that the members of anObject are distinct from those of itsPrototype.

Class-Based Implementation

A simple implementation of class-based object structures just creates a structure for each object and puts instance variables and method entry addresses into the struct as in the following diagram.

A Space-Efficient Implementation

Since all instances of a class have the same methods, the method entry addresses can be put into class structures rather than object structures. Each object must have its own instance variables, so they must be stored in object structures. Thus dispatch for compiled class-based object-oriented languages is usually as suggested by the diagram below.

Due to differences in the amount of information available to a compiler, typed and untyped languages cannot use the same runtime structures and lookup mechanisms for the space-efficient implementation.

Untyped Class-Based OOLs

Class-based OOLs with untyped variables are like classless OOLs in that when a compiler encounters a message it knows the variable that it is sent to but it knows nothing about the object referenced by that variable. The compiler must generate code for the message that works for any object that has an appropriately named method. Consequently untyped class-based OOLs must also implement dispatch with lookup tables. Code generated by the compiler usually just uses the member name as a lookup key to find a variable or the entry address of method.

Unlike classless languages, there is an important difference between instance variable lookup and instance method lookup.

The object structure is used for instance variable lookup.
The class structure is used for instance method lookup.

There is one common case where a compiler has additional information that allows the use of positional lookup mechanisms: code in an instance method for an object accesses the object's own instance variables or methods. For example, the Smalltalk language made all instance variables private and all instance methods public. With private instance variables, the only access to an object's instance variables is by that object's methods. Thus Smalltalk compilers often implemented positional location for variables.

Typed Class-Based OOLs

For typed class-based OOLs some of the dispatch can be static. Suppose that the variable that references anObject is declared with type ParentClass, where ParentClass is the superclass of ItsClass. One simple implementation for accessing methods is to set up an array in the class structure that contains the entry addresses of their machine language subprograms. The initial part of the array for ItsClass has the same addresses as the array for ParentClass except when a method has been overridden. The rest of the array for ItsClass contains new methods defined in the code for ItsClass.

With the type information the compiler can look up the index of a method in ParentClass's method table. For the message

      anObject.method();

the compiler can generate code that fetches the entry address of the machine language subprogram at the same index in ItsClass, using anObject's reference to its class to access ItsClass. Then the compiler generates code that invokes the subprogram with a reference to anObject passed as the first argument.

Notes

This mechanism only works for single inheritance. Multiple inheritance requires a more complex lookup mechanism.
This mechanism also can be used if method entry addresses are stored in the object structures rather than class structures.
A table lookup is always required to deal with member names. In a typed class-based OOL, the compiler has enough information that it can do a lookup at compile time to get positional information that is used at run time.

Summary

In classless OOLs, objects inherit from objects. Both variables and methods are found in object structures. A table-based lookup mechanism is used at run time for both instance variables and instance methods.
In class-based OOLs, classes inherit from classes. Methods are usually located in class structures while variables are located in object structures. Simple implementations can put both variables and method entry addresses in object structures.
Untyped class-based OOLs usually use a table-based lookup mechanism at run time for both instance variable and instance methods, but may implement a positional lookup mechanism for private access to variables.
Typed class-based OOLs can use a positional lookup mechanism at run time for both instance variables and instance methods.

Comparison of Lookup Mechanisms

The table-based dispatch mechanism is somewhat slower than the positional mechanism. However, it offers the possibility of adding or replacing members of an object dynamically. This can trivialize achieving the intent of some design patterns such as the State design pattern.

The capability of dynamically modifying behavior also makes possible some design techniques that are difficult in typed class-based OOLs. The JavaScript used for handling menus in this web presentation is an example. The JavaScript interpreter in the browser provides implementations of objects representing HTML structural elements such as <ul> and <li> tags. Guided by class attributes of the tags, JavaScript code dynamically adds instance variables and methods to the menus (<ul> tags) and menu items (<li> tags). The added variables and methods support the communication between menus and menu items that is needed to implement the behavior that you see.