<?xml-stylesheet type="text/xsl" href="http://www.w3.org/Math/XSL/mathml.xsl"?>
<html xmlns="http://www.w3.org/1999/xhtml">

<head>
<title> The Execution Time Equation </title>
<!--#include virtual="../common/head.html" -->
</head>

<body>

<h1> The Execution Time Equation </h1>

<hr />

<h3> The Equation </h3>

<p>
<table>
<tr> <td>
<math xmlns="http://www.w3.org/1998/Math/MathML" display="block" color="blue">
<mrow>
    <mfrac>
	<mi>seconds</mi>
	<mi>program</mi>
    </mfrac>
    <mo> = </mo>
    <mfrac>
	<mi>instructions</mi>
	<mi>program</mi>
    </mfrac>
    <mo> * </mo>
    <mfrac>
	<mi>clocks</mi>
	<mi>instruction</mi>
    </mfrac>
    <mo> * </mo>
    <mfrac>
	<mi>seconds</mi>
	<mi>clock</mi>
    </mfrac>
</mrow>
</math>
</td> </tr>
</table>
</p>

<p>
This equation remains valid in other time units, provided the same unit is
used on both sides (for example, nanoseconds per program on the left and
nanoseconds per clock on the right).
The left-hand side and the terms on the right-hand side are discussed in
the following sections.
</p>
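
<p>
As a minimal sketch of how the three factors combine, the Python function
below simply multiplies them.
The function name and the sample numbers (one billion instructions, 1.5 clocks
per instruction, a 2 GHz clock) are illustrative assumptions, not measurements
from any real machine.
</p>
<pre>
```python
def execution_time(instructions, clocks_per_instruction, seconds_per_clock):
    """Seconds per program = instructions * CPI * seconds per clock."""
    return instructions * clocks_per_instruction * seconds_per_clock


# Hypothetical workload: 1e9 instructions, CPI of 1.5, 2 GHz clock (0.5 ns).
seconds = execution_time(1_000_000_000, 1.5, 0.5e-9)
print(f"{seconds:.3f} seconds")
```
</pre>
<p>
Passing the clock period in nanoseconds instead of seconds would give the run
time in nanoseconds, which is the unit change described above.
</p>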

<h3> Seconds to Run the Program </h3>

<p>
Run time is the primary concern of computer architects today.
The current design philosophy holds that run time should be reduced, even if
doing so makes it more difficult to write programs in machine or assembly
language.
Compilers bear the burden of making programming easier, and they have carried
that burden well.
</p>

<h3> Instructions in the Program </h3>

<p>
This factor is affected by the amount and complexity of the work the program
does and by the instruction set.
</p>

<p>
Different instruction sets may do different amounts of work in a single
instruction.
CISC processor instructions can often accomplish as much as two or three
RISC processor instructions.
Some CISC processor instructions have built-in looping so that they can
accomplish as much as several hundred RISC instruction executions.
</p>

<h3> Seconds Per Clock </h3>

<p>
This factor is affected by technology and the complexity of work done in a
single clock.
</p>

<p>
For the past 35 years, integrated circuit technology has been greatly
affected by scaling equations that tell how individual transistor
dimensions should be altered as the overall dimensions are decreased.
These equations predict an increase in speed and a decrease in power
consumption with decreasing size.
Technology has improved so that about every 3 years, linear dimensions have
decreased by a factor of 2.
Transistor power consumption has decreased by a similar factor and speed
has increased by a similar factor.
</p>
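
<p>
The rule of thumb above (a factor of 2 about every 3 years) can be turned into
a small calculation.
The sketch below is only an illustration: the function name, the 3-year
default, and the starting feature size of 1.0 (in arbitrary units) are
assumptions made for the example.
</p>
<pre>
```python
def scale_factor(years, years_per_halving=3.0):
    """Factor by which linear dimensions shrink after the given number of
    years, assuming one halving every years_per_halving years."""
    return 2.0 ** (years / years_per_halving)


# Hypothetical starting feature size of 1.0 (arbitrary units), 9 years later.
factor = scale_factor(9)
feature_size = 1.0 / factor
print(f"shrink factor {factor:.1f}, feature size {feature_size:.3f}")
```
</pre>
<p>
Under the same rule of thumb, per-transistor speed would rise and power would
fall by a similar factor over that period.
</p>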

<p>
Logic gates do not operate instantly.
A gate has a propagation delay that depends on the number of inputs to the
gate (fan in) and the number of other gate inputs driven by the gate's
output (fan out).
Increasing either the fan in or the fan out increases the propagation
delay.
Cycle time is set to the worst-case total propagation delay through the
chain of gates that produces a signal required in the next cycle.
</p>
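
<p>
The worst-case rule can be sketched as taking the largest total delay over all
gate paths.
The linear delay model and its coefficients below are hypothetical, chosen only
so that delay grows with fan in and fan out; they do not describe any real
process technology.
</p>
<pre>
```python
def gate_delay(fan_in, fan_out, base=0.10, per_input=0.05, per_load=0.03):
    """Hypothetical linear delay model in nanoseconds (not a real process)."""
    return base + per_input * fan_in + per_load * fan_out


def cycle_time(paths):
    """Worst-case total delay over all paths; each path is a list of
    (fan_in, fan_out) pairs, one per gate on the path."""
    return max(sum(gate_delay(fi, fo) for fi, fo in path) for path in paths)


# Two hypothetical paths producing signals needed in the next cycle.
paths = [
    [(2, 1), (2, 2), (2, 1)],  # three small, lightly loaded gates
    [(4, 3), (3, 4)],          # two larger, heavily loaded gates
]
print(f"cycle time: {cycle_time(paths):.2f} ns")
```
</pre>
<p>
In this example the shorter path sets the cycle time, because its two gates
have larger fan in and fan out.
</p>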

<h3> Clocks Per Instruction </h3>

<p>
Clocks per instruction (CPI) is an effective average.
The notion of an effective value is used in almost all scientific disciplines.
It is a weighted average in which the value for each case is weighted by how
often that case occurs.
That is, you multiply the value for each case by the frequency (percentage/100,
so the frequencies sum to 1) for that case, and add up all such products.
Expressed as a formula, this is
<p>
<table>
<tr> <td>
<math xmlns="http://www.w3.org/1998/Math/MathML" display="block" color="blue">
<mrow>
    <mi>effective value</mi>
    <mo> = </mo>
    <msub>
	<mi>value</mi>
	<mn>1</mn>
    </msub>
    <mo> * </mo>
    <msub>
	<mi>frequency</mi>
	<mn>1</mn>
    </msub>
    <mo> + </mo>
    <msub>
	<mi>value</mi>
	<mn>2</mn>
    </msub>
    <mo> * </mo>
    <msub>
	<mi>frequency</mi>
	<mn>2</mn>
    </msub>
    <mo> + </mo>
    <mo> ... </mo>
    <mo> + </mo>
    <msub>
	<mi>value</mi>
	<mi>n</mi>
    </msub>
    <mo> * </mo>
    <msub>
	<mi>frequency</mi>
	<mi>n</mi>
    </msub>
</mrow>
</math>
</td> </tr>
</table>
</p>
In most applications of this formula to CPU performance, the cases are
categories of instructions.
</p>
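
<p>
The effective-value formula can be sketched in a few lines of Python.
The instruction categories, their clock counts, and their frequencies below
are made-up numbers for illustration, not data for any particular processor.
</p>
<pre>
```python
def effective_value(cases):
    """Weighted average: sum of value * frequency over (value, frequency) pairs."""
    return sum(value * frequency for value, frequency in cases)


# Hypothetical instruction mix: (clocks for the category, fraction of instructions).
mix = [
    (1, 0.50),  # arithmetic and logic
    (2, 0.30),  # loads and stores
    (3, 0.20),  # branches
]
cpi = effective_value(mix)
print(f"effective CPI = {cpi:.2f}")
```
</pre>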

<p>
CPI is affected by instruction-level parallelism and by instruction
complexity.
Without instruction-level parallelism, simple instructions usually take 4 or
more cycles to execute.
Pipelining (overlapping execution of instructions) can bring the average
down to near 1 clock per instruction.
Superscalar pipelining (issuing multiple instructions per cycle) can bring
the average down to a fraction of a clock per instruction.
</p>
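
<p>
A rough way to see these averages is to count clocks for an idealized pipeline.
The sketch below assumes a 4-stage pipeline with no stalls or hazards, which
real processors do not achieve; the function names and the instruction count
are likewise assumptions for the example.
</p>
<pre>
```python
import math


def unpipelined_cycles(n, stages):
    """Each instruction finishes all stages before the next one starts."""
    return n * stages


def pipelined_cycles(n, stages, issue_width=1):
    """Idealized pipeline with no stalls: fill the pipe once, then
    complete issue_width instructions every clock."""
    return stages + math.ceil(n / issue_width) - 1


n, stages = 1000, 4
for label, cycles in [
    ("no overlap", unpipelined_cycles(n, stages)),
    ("pipelined", pipelined_cycles(n, stages)),
    ("2-wide superscalar", pipelined_cycles(n, stages, issue_width=2)),
]:
    print(f"{label}: {cycles / n:.3f} clocks per instruction")
```
</pre>
<p>
With these assumptions the averages come out at 4, just over 1, and just over
one half clock per instruction; stalls in a real pipeline would raise the last
two figures.
</p>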

<p>
Instructions with built-in looping take at least one clock per loop iteration.
</p>

<!--#include virtual="../common/endBody.html" -->

</body>

</html>

