Data structures in MAL are similar to data structures in other assembly languages. This web presentation first looks at the aspects of data structures that are common to most assembly languages, then concludes with several examples for MAL for the Mars simulator.

Data structures in assembly language, as in high-level languages, are nested structures composed of references, structs, and arrays. These structures are often dynamically allocated, which lets a program adjust its memory usage to its immediate needs.

Accessing data in structs and arrays always involves loads and stores using base-displacement addressing. The load and store instructions are the same for either static or dynamic allocation. The only difference is how the base address is placed into a register prior to the loads and stores.

In the case of arrays, random access to an entry requires an address computation prior to a load or store. However, array entries are often accessed sequentially in a loop, in which case the address register is simply incremented by the entry size after each iteration of the loop.

In addition to these examples, the searchtest.s example program shows a complete program with a static array declaration and a search function that uses random access into the array. The static array declaration is only used for convenience. The search function works for either statically or dynamically created arrays.

Struct organization (offsets in bytes):

    offset 0:   double x    (8 bytes)
    offset 8:   float y     (4 bytes)
    offset 12:  int i       (4 bytes)

The following directives declare a struct named "myStruct" with 3 members: a double, a float, and an integer. All of the entries are initialized to 0. The directives should be placed in a .data section.

        .data

myStruct:
    .double 0.0     # x, offset 0
    .float  0.0     # y, offset 8
    .word   0       # i, offset 12

This struct is called a static struct because it is allocated at assembly time. It occupies 16 bytes, with x at offset 0, y at offset 8, and i at offset 12.

Most assemblers provide no support for naming struct members. An assembly language programmer should comment the members with names and offsets.

Putting the Address of a Struct into a Register

In order to access entries of a struct you need to put its address into a register. The following MAL code puts the address of the struct declared above into $t0. This code should be placed in a .text section.

        .text

    la      $t0, myStruct
      

Allocating a struct dynamically (at run time) requires an operating system call. The system call code depends on the operating system.

The Mars simulator provides a dynamic memory allocation system call whose name is "sbrk". Its system call code is 9. The code below dynamically allocates a struct with the same organization as in the static example.

        .text

    li      $v0, 9      # sbrk code = 9
    li      $a0, 16     # double + float + int = 16 bytes
    syscall

Putting the Address of a Struct into a Register

After this code executes, the address of the newly allocated struct will be in $v0. From there, it can be moved to another register. It is not a good idea to keep the address in $v0 since that register is conventionally used for subprogram return values. Once you have the address of the struct in a register, its members can be accessed just like they are accessed with static structs.

Depending on the operating system, the contents of dynamically allocated memory may or may not be initialized. The "sbrk" system call in the Mars simulator appears to initialize the allocated memory to all 0 bytes. However, there is nothing in the documentation indicating that this is done. If you need to initialize a struct, you can always do so with a sequence of store instructions.
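
As a minimal sketch of such an initialization, the code below copies the address returned by sbrk into $t0 (an arbitrary choice of register) and then stores 0 into every word of the 16-byte struct. It relies on the fact that an all-zero bit pattern represents 0.0 for both floats and doubles.

        .text

    li      $v0, 9          # sbrk code = 9
    li      $a0, 16         # double + float + int = 16 bytes
    syscall
    move    $t0, $v0        # keep the struct address in $t0
    sw      $zero, 0($t0)   # x, first word
    sw      $zero, 4($t0)   # x, second word
    sw      $zero, 8($t0)   # y
    sw      $zero, 12($t0)  # i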

Struct members can be accessed with load and store instructions using base-displacement addressing. The base register needs to be loaded as described in previous sections. The displacement is the offset of the desired member.

The code below accesses members of a struct with the same organization used in the static and dynamic examples.

        .text

    l.d     $f0, 0($t0)
    l.s     $f2, 8($t0)
    lw      $t1, 12($t0)
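
Stores use the same displacements. The sketch below assumes that the struct address is already in $t0 and that the new values are in $f4, $f6, and $t2; these registers are chosen here only for illustration.

        .text

    s.d     $f4, 0($t0)     # x = double in $f4/$f5
    s.s     $f6, 8($t0)     # y = float in $f6
    sw      $t2, 12($t0)    # i = integer in $t2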

The following directives declare three arrays: an array of 100 integers named "myInts", an array of 50 floats named "myFloats", and an array of 200 doubles named "myDoubles". All of the entries are initialized to 0. The directives should be placed in a .data section.

        .data

myInts:     .word   0:100
myFloats:   .float  0.0:50
myDoubles:  .double 0.0:200

If you want a 2-dimensional array, use the same kind of declaration, with the total number of entries (number of rows times number of columns) as the repetition count that appears after the colon.

These arrays are called static arrays because they are allocated at assembly time.
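
For the 2-dimensional case mentioned above, a hypothetical 3-by-4 array of integers named "myMatrix" could be declared as sketched below. The name and dimensions are made up for illustration. The address computation in the .text section assumes row-major order, which is a convention chosen here, not something imposed by the assembler.

        .data

myMatrix:   .word   0:12        # 3 rows * 4 columns = 12 entries

        .text

    # Context:
    #   row is $t1
    #   col is $t2
    # entry address = myMatrix + 4*(4*row + col)
    la      $t0, myMatrix
    mul     $t3, $t1, 4         # 4 columns per row
    add     $t3, $t3, $t2       # entry index = 4*row + col
    mul     $t3, $t3, 4         # byte offset = 4 * entry index
    add     $t3, $t3, $t0       # entry address
    lw      $t4, ($t3)          # load myMatrix[row][col]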

Putting the Address of a Static Array into a Register

In order to access entries of an array you need to put its address into a register. The following MAL code puts the addresses of the arrays declared above into $t0, $t1, and $t2. This code should be placed in a .text section.

        .text

    la      $t0, myInts
    la      $t1, myFloats
    la      $t2, myDoubles

Accessing an Array Entry

Array entries are accessed using loads and stores of the appropriate type. You must first have the address of the entry in a register.
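
For an entry at a fixed index, the offset (index times entry size) can be used directly as the displacement. The sketch below assumes the array addresses are in $t0, $t1, and $t2 as loaded above; the destination registers are arbitrary.

        .text

    lw      $t3, 0($t0)     # myInts[0]
    lw      $t4, 12($t0)    # myInts[3], offset = 3 * 4
    l.s     $f0, 4($t1)     # myFloats[1], offset = 1 * 4
    l.d     $f2, 16($t2)    # myDoubles[2], offset = 2 * 8
    sw      $t3, 12($t0)    # myInts[3] = myInts[0]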

Allocating an array dynamically (at run time) requires an operating system call. The system call code depends on the operating system.

The Mars simulator provides a dynamic memory allocation system call whose name is "sbrk". Its system call code is 9. The following code dynamically allocates an array of 100 integers.

        .text

    li      $v0, 9      # sbrk code
    li      $a0, 400    # 100 ints = 400 bytes
    syscall

After this code executes, the address of the newly allocated array will be in $v0. From there, it can be moved to another register. It is not a good idea to keep the address in $v0 since that register is conventionally used for subprogram return values. Once you have the address of the array in a register, its entries can be accessed just like they are accessed with static arrays.

Depending on the operating system, the contents of dynamically allocated memory may or may not be initialized. The "sbrk" system call in the Mars simulator appears to initialize the allocated memory to all 0 bytes. However, there is nothing in the documentation indicating that this is done. If you need to initialize an array, you can always do so with a simple sequential loop.

The following MAL code for the Mars simulator dynamically allocates an array of 10 ints and initializes the entries with the values 0 through 9. The comments indicate the equivalent C++ code.

        .text

    # Context:
    #   theArray address is $t0
    #   i is $t1

    # int* theArray = new int[10];
    li      $v0, 9              # 9 = sbrk
    li      $a0, 40             # 10 ints = 40 bytes
    syscall
    move    $t0, $v0

    # for (int i = 0; i < 10; i++) {
    #     *theArray = i;
    #     theArray++;
    # }
    li      $t1, 0              # i = 0
    b       cond_test
loop_top:
    sw      $t1, ($t0)          # *theArray = i
    add     $t1, $t1, 1         # i++
    add     $t0, $t0, 4         # theArray++
cond_test:
    blt     $t1, 10, loop_top   # continue if i < 10

A real operating system will probably use a different syscall code for dynamic memory allocation, and may handle the syscall parameters in different ways.
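
Reading an array sequentially follows the same pattern as the initialization loop. As a sketch, the code below sums 10 ints. It assumes that $t0 again holds the address of the first entry, for example from a copy saved before the initialization loop advanced it. The labels and the choice of $t2 and $t3 are illustrative.

        .text

    # Context:
    #   address of the first entry is $t0
    #   i is $t1
    #   sum is $t2

    li      $t2, 0              # sum = 0
    li      $t1, 0              # i = 0
    b       sum_test
sum_top:
    lw      $t3, ($t0)          # sum += *theArray
    add     $t2, $t2, $t3
    add     $t1, $t1, 1         # i++
    add     $t0, $t0, 4         # theArray++
sum_test:
    blt     $t1, 10, sum_top    # continue if i < 10

With the values 0 through 9 stored by the initialization loop, the sum left in $t2 is 45.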

The following code declares a static array "testArray" and uses it for testing array random access code. The last 4 lines of code in the .text section could be modified to make a subprogram that returned an array entry (entryValue) given the base address of the array (arrayBase) and the desired index.

        .data

testArray:
    .word   2       # testArray[0]
    .word   3       # testArray[1]
    .word   7       # testArray[2]
    .word   11      # testArray[3]
    .word   13      # testArray[4]
    .word   17      # testArray[5]
    .word   19      # testArray[6]
    .word   23      # testArray[7]

        .text

    la      $a0, testArray
    li      $a1, 3
    # context
    #   arrayBase is in $a0
    #   index is in $a1
    # use the following variable assignments
    #   entryAddress is in $t0
    #   entryValue is in $v0
    #
    move    $t0, $a1        # entryAddress = 4*index + arrayBase
    mul     $t0, $t0, 4
    add     $t0, $t0, $a0
    lw      $v0, ($t0)      # entryValue = *entryAddress
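
One way the subprogram described above might look is sketched below. The label "getEntry" is a hypothetical name. The sketch assumes the usual MIPS conventions: arguments in $a0 and $a1, the return value in $v0, and the return address in $ra. It also uses the Mars exit system call (code 10) so that execution does not fall through into the subprogram.

        .text

    la      $a0, testArray  # arrayBase
    li      $a1, 3          # index
    jal     getEntry
    move    $s0, $v0        # save the result, testArray[3] = 11
    li      $v0, 10         # 10 = exit
    syscall

getEntry:
    mul     $t0, $a1, 4     # entryAddress = 4*index + arrayBase
    add     $t0, $t0, $a0
    lw      $v0, ($t0)      # entryValue = *entryAddress
    jr      $ra             # return entryValue in $v0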