MIPS Multicycle Implementation

The basic idea of the multicycle implementation is to divide the one long cycle of the single-cycle implementation into 3 to 5 shorter cycles. The number of cycles depends on the instruction. The cycles are as follows.

Instruction fetch, program counter increment
Partial instruction decode and branch and jump target computation
Source operand fetch, ALU operation, and program counter update for branches and jumps
Memory access, if needed
Register write, if needed

Part of the advantage of the multicycle implementation is better performance: instructions that do not need a memory access or a register write finish in fewer cycles, which reduces their overall instruction time.
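To make the performance argument concrete, the following minimal sketch computes an average cycles-per-instruction figure. The per-class cycle counts follow the common textbook multicycle MIPS design and are an assumption here, as is the instruction mix, which is invented purely for illustration.

```python
# Cycle counts per instruction class. These follow the common textbook
# multicycle MIPS design and are an assumption, not taken from this text.
CYCLES = {"load": 5, "store": 4, "rtype": 4, "branch": 3, "jump": 3}


def average_cpi(mix):
    """Return the weighted average number of cycles per instruction.

    mix maps an instruction class to its fraction of executed
    instructions; the fractions are expected to sum to 1.
    """
    return sum(CYCLES[kind] * fraction for kind, fraction in mix.items())


# Hypothetical instruction mix, chosen only for illustration.
example_mix = {"load": 0.25, "store": 0.10, "rtype": 0.50,
               "branch": 0.10, "jump": 0.05}
cpi = average_cpi(example_mix)
print(f"average CPI: {cpi:.2f}")
```

With this mix the average comes out a little above 4 cycles, compared with 5 cycles if every instruction had to pass through all five steps.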

Multicycle processor implementations use Moore or Mealy finite state machines to generate control signals.

The multicycle implementation differs from the single-cycle implementation in the following ways.

Added internal registers
ALU usage
PC update
Memory organization
Control signal generation

Registers are added to hold data that is generated in an early cycle but used in a later cycle. The following registers are added.

Instruction register
Memory data register
ALU input registers A and B
ALUOut register

The instruction register requires a new control signal IRWrite to determine when it is updated. The other added registers are updated automatically at each clock transition.

The ALU can be used for different purposes in different cycles. In addition to its single-cycle use, the ALU can do program counter increments and branch target computations.

In order to deal with all possible source operands for these operations, a new control signal ALUSrcA is added and the single-cycle control signal ALUSrc is renamed as ALUSrcB.

The program counter is updated twice. The first update is just a simple increment. The second update is a branch or jump update.

The single-cycle Branch and Jump control signals are replaced by new PCWriteCond and PCWrite control signals. The values for the two PC updates come from different sources so a new control signal PCSource is added to select between them.

The control signals are grouped according to the following instruction execution activities.

Instruction fetch
PC update
Instruction decode
Source operand fetch
ALU operation
Memory access
Register write

A read control signal is sent to memory. The contents of the program counter (PC) are used as an address. Instruction fetch is the same for all instructions.

Multicycle Changes

A memory address can come from either the PC or from the ALU. Also, the instruction needs to be held in IR for the rest of the activities.

Control Signals

MemRead
Asserted to read the instruction from memory
IorD
0 to use the program counter as the memory address
IRWrite
Asserted to capture the instruction in the instruction register

The PC gets a new value selected from the following.

PC + 4 (most instructions)
Branch target address (branch instructions)
Jump target address from the instruction (j and jal instructions)
Jump target address from a register (jr and jalr instructions)
Interrupt address (syscall, external interrupts, and exceptions)

Multicycle Changes

The PC update is done in two steps. The first step is a simple increment (PC <- PC + 4) done automatically while the instruction is fetched. Later modifications are made for branches, jumps, and interrupts.

Control Signals

PCWrite
Asserted to capture an updated program counter value. Used for program counter increment and branch and jump completion.
PCWriteCond
Asserted for branch instructions during the completion step, where it is ANDed with the ALU Zero output to decide whether the program counter is written.
PCSource
Selects the source for a program counter write from the ALU result (program counter increment), the contents of the ALUOut register (branch address), or the jump address.
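The interaction of these three signals can be sketched in a few lines. The function names below are hypothetical, and the PCSource encoding 10 for the jump address is an assumption; the text only fixes which sources exist, and later gives 00 and 01 for the ALU result and ALUOut.

```python
def pc_write_enable(pc_write, pc_write_cond, zero):
    """The PC is written when PCWrite is asserted, or when PCWriteCond
    is asserted and the ALU Zero output is 1."""
    return bool(pc_write or (pc_write_cond and zero))


def select_pc_source(pc_source, alu_result, alu_out, jump_address):
    """Model the PCSource multiplexer.

    0b00 -> ALU result (PC increment)
    0b01 -> ALUOut register (branch target)
    0b10 -> jump address (this encoding is an assumption)
    """
    sources = {0b00: alu_result, 0b01: alu_out, 0b10: jump_address}
    return sources[pc_source]


def next_pc(pc, pc_write, pc_write_cond, zero, pc_source,
            alu_result, alu_out, jump_address):
    """Return the PC value after the clock edge."""
    if pc_write_enable(pc_write, pc_write_cond, zero):
        return select_pc_source(pc_source, alu_result, alu_out, jump_address)
    return pc  # no write enable: the PC keeps its old value
```

For example, `next_pc(0x104, 0, 1, 0, 0b01, 0, 0x200, 0)` models a branch that is not taken: PCWriteCond is asserted but Zero is 0, so the PC stays at `0x104`.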

Instruction decoding produces control signals for the datapath and memory. The inputs to the control circuitry are the opcode and function fields of the instruction. It generates the following kinds of control signals.

read and write control signals for memory
write control signals for registers
multiplexer controls for routing data through the datapath
control signals to select an appropriate ALU operation

Instruction decode is the same for all instructions.

Multicycle Changes

There are some changes in the control signals. The most important difference is that they are generated by a finite state machine (Moore or Mealy) so that they can have different values in different states.

Control Signals

Instruction decode is automatic, requiring no control signals.

The ALU is designed to combine two source operands to produce a result. The source operand fetch activity fetches the two source operands. One source operand is selected from the following.

The register specified by the rs instruction field (R-type and I-type source operand)
The program counter (PC increment and branch target base)

The other source operand is selected from the following.

The register specified by the rt instruction field (R-type source operand)
The sign extended immediate instruction field (I-type source operand)
The sign extended immediate instruction field, shifted left 2 places (branch target offset)
The constant 4 (PC increment)

Multicycle Changes

For the multicycle implementation, the ALU is also incrementing the program counter and computing branch target addresses. This requires a multiplexer for each of the source operands.

Control Signals

ALUSrcA
Selects the first ALU input from either rs or the program counter (program counter increment or branch address computation).
ALUSrcB
Selects the second ALU input from either rt, the immediate field, the constant 4 (program counter increment), or the shifted immediate field (branch address computation).
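The two operand multiplexers can be modeled as follows. This is a sketch only: the function names are hypothetical, and the ALUSrcB encoding 10 for the unshifted immediate is an assumption (the other encodings match the values used in the state descriptions of this document).

```python
def sign_extend16(value):
    """Interpret the low 16 bits of value as a signed two's complement number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def alu_inputs(alu_src_a, alu_src_b, pc, reg_a, reg_b, immediate):
    """Return the pair of ALU operands selected by ALUSrcA and ALUSrcB.

    reg_a and reg_b hold the register values read for rs and rt.
    """
    first = pc if alu_src_a == 0 else reg_a
    imm = sign_extend16(immediate)
    second_sources = {
        0b00: reg_b,      # rt (R-type source operand)
        0b01: 4,          # constant 4 (PC increment)
        0b10: imm,        # sign extended immediate (encoding assumed)
        0b11: imm << 2,   # shifted immediate (branch target offset)
    }
    return first, second_sources[alu_src_b]
```

Calling `alu_inputs(0, 0b01, 0x400, 7, 9, 0)` selects the PC and the constant 4, which is the combination used for the PC increment.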

For most instructions the ALU performs the operation suggested by the instruction mnemonic, which is coded into either the opcode or the function instruction field. For loads and stores the ALU computes the address, adding the sign extended immediate field of the instruction to the contents of the register specified by the rs field of the instruction. For branches the ALU can do a subtraction in order to compare two source operands, using the result to determine whether or not to do a further update of the PC.

Multicycle Changes

The ALU performs different operations in different states, but no new control signals are required to do this.

Control Signals

ALUOp
Determines the operation performed by the ALU: add, subtract, or decoded from function field.
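A small sketch of how ALUOp might be turned into an ALU operation is shown below. The encodings 00 (add) and 01 (subtract) are the ones used elsewhere in this document; treating 10 as "decode from the function field" is an assumption, and only a handful of standard MIPS function codes are included.

```python
# Standard MIPS function-field codes for a few R-type instructions.
FUNCT_TO_OPERATION = {
    0x20: "add",
    0x22: "sub",
    0x24: "and",
    0x25: "or",
    0x2A: "slt",
}


def alu_operation(alu_op, funct=None):
    """Return the name of the operation the ALU performs.

    alu_op 0b00 -> add (address computation, PC increment, branch target)
    alu_op 0b01 -> subtract (branch comparison)
    alu_op 0b10 -> decoded from the function field (encoding assumed)
    """
    if alu_op == 0b00:
        return "add"
    if alu_op == 0b01:
        return "sub"
    if alu_op == 0b10:
        return FUNCT_TO_OPERATION[funct]
    raise ValueError(f"unsupported ALUOp value: {alu_op:#04b}")
```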

A read or write control signal is sent to memory. The result from the ALU is used as an address.

Multicycle Changes

A memory address can come from either the PC or from the ALU. Also, for a read the data needs to be held in the memory data register for a later register write.

Control Signals

MemRead
Asserted for load instructions, tells memory to do a read.
MemWrite
Asserted for store instructions, tells memory to do a write.
IorD
1 to use the contents of the ALUOut register (load or store address) as the memory address.
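The address selection and the memory control signals can be sketched as below. Memory is modeled as a plain dictionary from address to word, which is a simplification chosen for illustration; the function name and its parameters are hypothetical.

```python
def memory_access(memory, iord, pc, alu_out,
                  mem_read=0, mem_write=0, write_data=None):
    """Model one memory access in the multicycle datapath.

    IorD selects the address: 0 uses the PC (instruction fetch),
    1 uses the ALUOut register (load or store address).
    Returns the word read (to be captured in the memory data register
    or the instruction register), or None if no read was requested.
    """
    address = pc if iord == 0 else alu_out
    if mem_write:
        memory[address] = write_data
    if mem_read:
        # Unwritten locations read as 0 in this simplified model.
        return memory.get(address, 0)
    return None
```

A load would call this with `iord=1` and `mem_read=1`; the returned word is what the memory data register holds for the later register write.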

Some instructions, such as branches, jumps, and stores, do not write to a register. For the instructions that do write to a register, the destination register can be one of the following.

The register specified by the rd field (R-type instructions)
The register specified by the rt field (I-type instructions)
$ra (jal instruction)

The value to be written to the register can come from the following places.

The ALU (most instructions)
Memory (load instructions)
The incremented PC (jal and jalr instructions)

Multicycle Changes

None.

Control Signals

RegWrite
Asserted if a result is written to a register.
RegDst
Selects the destination register as either rd for R-type instructions or rt for I-type instructions.
MemtoReg
Selects the source for register write as either ALUOut or memory.
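The register write step can be sketched as follows. The encodings are assumptions, since the text does not give them: RegDst 1 selects rd and 0 selects rt, and MemtoReg 1 selects the memory data register and 0 selects ALUOut. The jal and jalr cases are not modeled.

```python
def register_write(registers, reg_write, reg_dst, mem_to_reg,
                   rt, rd, alu_out, mdr):
    """Write a result into the register file if RegWrite is asserted.

    registers is a list of 32 integers; rt and rd are register numbers
    taken from the instruction fields.
    """
    if not reg_write:
        return
    destination = rd if reg_dst == 1 else rt    # encoding assumed
    value = mdr if mem_to_reg == 1 else alu_out  # encoding assumed
    if destination != 0:  # register $zero is hardwired to 0 in MIPS
        registers[destination] = value
```

An R-type instruction would use `reg_dst=1, mem_to_reg=0`, while a load would use `reg_dst=0, mem_to_reg=1`.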

The instruction fetch state performs the instruction fetch activity and part of the PC update activity (PC increment).

Control Signals

For instruction fetch:

MemRead
Asserted to read the instruction from memory.
IorD
0 to use the program counter as the memory address.
IRWrite
Asserted to capture the instruction in the instruction register.

For PC increment:

ALUSrcA
0 to select the program counter as the first ALU input.
ALUSrcB
01 to select the constant 4 as the second ALU input.
ALUOp
00 to do an add operation with the ALU.
PCWrite
Asserted to capture an updated program counter value.
PCSource
00 to select the source for a program counter write from the ALU result.
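The effect of these settings can be summarized in a short sketch. The dictionary simply restates the signal values listed above; the function and the dictionary-based memory are hypothetical simplifications.

```python
# Control signal values for the instruction fetch state, as listed above.
FETCH_SIGNALS = {
    "MemRead": 1, "IorD": 0, "IRWrite": 1,
    "ALUSrcA": 0, "ALUSrcB": 0b01, "ALUOp": 0b00,
    "PCWrite": 1, "PCSource": 0b00,
}


def fetch_state(pc, memory):
    """Return (IR, new PC) after the instruction fetch state.

    memory is modeled as a dict from address to 32-bit instruction word.
    """
    # MemRead with IorD = 0 reads at the PC; IRWrite captures the word.
    ir = memory[pc]
    # ALUSrcA = 0 and ALUSrcB = 01 feed PC and 4 to the ALU; ALUOp = 00 adds.
    alu_result = (pc + 4) & 0xFFFFFFFF
    # PCWrite with PCSource = 00 writes the ALU result back to the PC.
    return ir, alu_result
```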

The instruction decode/register fetch state performs the following activities.

instruction decode
register fetch (but not source operand selection)
branch target computation for PC update

The first two items are automatic, requiring no control signals. All of the control signals set in this state are used for the branch target computation.

Control Signals

For branch target computation:

ALUSrcA
0 to select the program counter as the first ALU input.
ALUSrcB
11 to select the sign extended left shifted immediate field as the second ALU input.
ALUOp
00 to do an add operation with the ALU.

At the end of the cycle, the branch target address will be automatically moved into the ALUOut register.
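As a rough sketch of this state, the function below reads the two source registers and computes the branch target from the already incremented PC. The field positions are the standard MIPS I-type layout; the function name and the list-based register file are hypothetical.

```python
def decode_state(ir, pc, registers):
    """Return (A, B, ALUOut) after the decode/register fetch state.

    ir is the 32-bit instruction word, pc is the PC value after the
    increment done in the fetch state, registers is a list of 32 values.
    """
    rs = (ir >> 21) & 0x1F
    rt = (ir >> 16) & 0x1F
    immediate = ir & 0xFFFF
    if immediate & 0x8000:          # sign extend the 16-bit field
        immediate -= 0x10000
    a = registers[rs]               # register fetch is automatic
    b = registers[rt]
    # ALUSrcA = 0 (PC), ALUSrcB = 11 (shifted immediate), ALUOp = 00 (add)
    alu_out = (pc + (immediate << 2)) & 0xFFFFFFFF
    return a, b, alu_out
```

The target is computed for every instruction, whether or not it turns out to be a branch; it is simply ignored when it is not needed.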

The branch completion state performs the ALU operation activity for branch instructions: a subtraction in order to compare the two source operands. Depending on the result of the comparison, the PC may be updated to the branch target address which was computed in the instruction decode/register fetch state.

Control Signals

For comparing the two source operands:

ALUSrcA
1 to select the register indicated by rs as the first ALU input.
ALUSrcB
00 to select the register indicated by rt as the second ALU input.
ALUOp
01 to do a subtract operation with the ALU.

For taking the branch conditionally:

PCWriteCond
1 to write the branch target address to the PC if the ALU zero output is 1.
PCSource
01 to select the contents of ALUOut as the source value for the write.
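The branch completion state can be sketched for a branch-on-equal instruction as follows. The function name is hypothetical; A and B are the register values read in the previous state, and ALUOut holds the branch target computed there.

```python
def branch_completion(pc, a, b, alu_out):
    """Return the PC value after the branch completion state.

    The ALU subtracts B from A (ALUSrcA = 1, ALUSrcB = 00, ALUOp = 01).
    PCWriteCond is asserted, so the PC is written from ALUOut
    (PCSource = 01) only when the ALU Zero output is 1.
    """
    difference = (a - b) & 0xFFFFFFFF
    zero = difference == 0
    pc_write_cond = True
    if pc_write_cond and zero:
        return alu_out   # branch taken
    return pc            # branch not taken: keep the incremented PC
```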

The following diagram shows how multicycle control circuitry is implemented as a finite state machine. The "next state logic" and "control logic" blocks are combinational logic.

Generally, finite state control can be implemented as either a Moore machine or a Mealy machine.

In a Moore machine, the control signals depend only on the control state. A Moore machine does not use the dashed connection in the figure.

In a Mealy machine, the control signals depend on both the control state and the opcode. A Mealy machine does use the dashed connection in the figure.

The state diagrams for the MIPS multicycle implementation do not include any direct dependence of control signals on the opcode. Thus they are intended for use with a Moore machine. With a Mealy machine, it is possible to bring up some control signals one cycle earlier. This fact could be used to improve the performance of the processor.
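A Moore-style controller for the three states described in this document can be sketched as below. The outputs are looked up from the state alone, while the opcode is used only to choose the next state. The state names are hypothetical, and only the branch-on-equal path (standard MIPS opcode 0x04) is modeled; a full controller would add states for the other instruction classes.

```python
# Control outputs per state (Moore machine: outputs depend only on state).
CONTROL_OUTPUTS = {
    "fetch": {"MemRead": 1, "IorD": 0, "IRWrite": 1, "ALUSrcA": 0,
              "ALUSrcB": 0b01, "ALUOp": 0b00, "PCWrite": 1, "PCSource": 0b00},
    "decode": {"ALUSrcA": 0, "ALUSrcB": 0b11, "ALUOp": 0b00},
    "branch_completion": {"ALUSrcA": 1, "ALUSrcB": 0b00, "ALUOp": 0b01,
                          "PCWriteCond": 1, "PCSource": 0b01},
}

OPCODE_BEQ = 0x04  # standard MIPS opcode for beq


def control_signals(state):
    """Control logic: a pure function of the current state."""
    return CONTROL_OUTPUTS[state]


def next_state(state, opcode):
    """Next state logic: depends on the current state and the opcode."""
    if state == "fetch":
        return "decode"
    if state == "decode":
        if opcode == OPCODE_BEQ:
            return "branch_completion"
        raise NotImplementedError("only beq is modeled in this sketch")
    return "fetch"  # every completion state returns to fetch


# Walk a beq instruction through the machine.
state = "fetch"
trace = []
for _ in range(3):
    trace.append((state, control_signals(state)))
    state = next_state(state, OPCODE_BEQ)
```

A Mealy version would pass the opcode to `control_signals` as well, which is what the dashed connection in the figure represents.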