0% found this document useful (0 votes)
100 views4 pages

The Performance Equation

The document describes the performance equation that analyzes execution time as a product of three factors: instruction count, clocks per instruction, and clock time. It provides examples of how changing the instruction set, exploiting more instruction-level parallelism, or reducing transistor size can impact those factors. Two example problems are then given where the execution time of a program is calculated before and after a change is made to the processor.

Uploaded by

vamshi krishna
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
100 views4 pages

The Performance Equation

The document describes the performance equation that analyzes execution time as a product of three factors: instruction count, clocks per instruction, and clock time. It provides examples of how changing the instruction set, exploiting more instruction-level parallelism, or reducing transistor size can impact those factors. Two example problems are then given where the execution time of a program is calculated before and after a change is made to the processor.

Uploaded by

vamshi krishna
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 4

seconds instructions clocks seconds

= × ×
program program instruction clock

The Performance Equation

The performance equation analyzes execution time as a product of three


factors that are relatively independent
of each other.

This equation remains valid if the time units are changed on both sides
of the equation.
The left-hand side and
the factors on the right-hand side are discussed
in the following sections.

The three factors are, in order, known as the


instruction count (IC),
clocks per instruction (CPI), and
clock time
(CT).
CPI is computed as an effective value.

Performance Equation

Instruction Count
Effective Values
Clocks Per Instruction
Clock Time
Example
Improvements

The Performance Equation

The performance equation analyzes execution time as a product of three


factors that are relatively independent
of each other.

Instruction Count

Computer architects can reduce the instruction count by adding more


powerful instructions to the instruction set.
However, this can increase either CPI or clock time, or both.

Clocks Per Instruction

Computer architects can reduce CPI by exploiting more instruction-level


parallelism.
If they add more complex
instructions it often increases CPI.

Clock Time

Clock time depends on transistor speed and the complexity of the work
done in a single clock.
Clock time can be
reduced when transistor sizes decrease.
However, power consumption increases when clock time is reduced.
This increase the amount of heat generated.

Instruction Count

Instruction (IC) count is a dynamic measure: the total number of


instruction executions involved in a program.
It
is dominated by repetitive operations such as loops and recursions.

Instruction count is affected by the power of the instruction set.


Different instruction sets may do different
amounts of work in a single
instruction.
CISC processor instructions can often accomplish as much as two or
three
RISC processor instructions.
Some CISC processor instructions have built-in looping so that they can
accomplish as much as several hundred RISC instruction executions.
For predicting the effects of incremental changes, architects use
execution traces of benchmark programs to get
instruction counts.
If the incremental change does not change the instruction set then the
instruction count
normally does not change.
If there are small changes in the instruction set then trace information
can be used to
estimate the change in the instruction count.

For comparison purposes, two machines with different instruction sets


can be compared based on compilations
of the same high-level language
code on the two machines.

Clocks Per Instruction

Clocks per instruction (CPI) is an effective average.


It is averaged over all of the instruction executions in a
program.

CPI is affected by instruction-level parallelism and by instruction


complexity.
Without instruction-level
parallelism, simple instructions usually take
4 or more cycles to execute.
Instructions that execute loops take at
least one clock per loop
iteration.
Pipelining (overlapping execution of instructions) can bring the average
for
simple instructions down to near 1 clock per instruction.
Superscalar pipelining (issuing multiple instructions per
cycle) can
bring the average down to a fraction of a clock per instruction.

For computing clocks per instruction as an effective average, the cases


are categories of instructions, such as
branches, loads, and stores.
Frequencies for the categories can be extracted from execution traces.
Knowledge of
how the architecture handles each category yields the
clocks per instruction for that category.

Clock Time

Clock time (CT) is the period of the clock that synchronizes the
circuits in a processor.
It is the reciprocal of the
clock frequency.

For example,
a 1 GHz processor has a cycle time of 1.0 ns
and a 4 GHz processor has a cycle time of 0.25 ns.

Clock time is affected by circuit technology and the complexity of the


work done in a single clock.
Logic gates
do not operate instantly.
A gate has a propagation delay that depends on the number of inputs to
the gate (fan in)
and the number of other inputs connected to
the gate's output (fan out).
Increasing either the fan in or the fan out
slows down the propagation
time.
Cycle time is set to be the worst-case total propagation time through
gates that
produce a signal required in the next cycle.
The worst-case total propagation time occurs along one or more
signal
paths through the circuitry.
These paths are called critical paths.

For the past 35 years, integrated circuit technology has been greatly
affected by a scaling equation that tells how
individual transistor
dimensions should be altered as the overall dimensions are decreased.
The scaling equations
predict an increase in speed and a decrease in
power consumption per transistor with decreasing size.
Technology has improved so that about every 3 years, linear dimensions
have decreased by a factor of 2.
Transistor power consumption has decreased by a similar factor.
Speed increased by a similar factor until about
2005.
At that time, power consumption reached the point where air cooling was
not sufficient to keep processors
cool if the ran at the highest
possible clock speed.

Problem Statement

Suppose a program (or a program task) takes 1 billion instructions to


execute on a processor running at 2 GHz.
Suppose also that 50% of the instructions execute in 3 clock cycles,
30% execute in 4 clock cycles, and 20%
execute in 5 clock cycles.
What is the execution time for the program or task?

Problem Statement
Solution
Solution

Suppose a program (or a program task) takes 1 billion instructions to


execute on a processor running at 2 GHz.
Suppose also that 50% of the instructions execute in 3 clock cycles,
30% execute in 4 clock cycles, and 20%
execute in 5 clock cycles.
What is the execution time for the program or task?

We have the instruction count: 109 instructions.


The clock time can be computed quickly from the clock rate to
be
0.5×10-9 seconds.
So we only need to to compute clocks per instruction as an effective
value:

Value Frequency Product


3 0.5 1.5
4 0.3 1.2
5 0.2 1.0
CPI = 3.7

Then we have

Execution time = 1.0×109 × 3.7 ×


0.5×10-9 sec = 1.85 sec.

Problem Statement

Suppose the processor in the previous example is redesigned so that


all instructions that initially executed in 5
cycles now execute in 4
cycles.
Due to changes in the circuitry, the clock rate has to be decreased from
2.0 GHz
to 1.9 GHz.
What is the overall percentage improvement?

Problem Statement
Solution Form
Solution

Solution Form

Suppose the processor in the previous example is redesigned so that


all instructions that initially executed in 5
cycles now execute in 4
cycles.
Due to changes in the circuitry, the clock rate has to be decreased from
2.0 GHz
to 1.9 GHz.
No changes are made to the instruction set.
What is the overall percentage improvement?

We can determine the percentage improvement quickly by first finding the


ratio between before and after
performance.
The performance equation implies that this ratio will be a product of
three factors: a performance
ratio for instruction count, a performance
ratio for CPI or its reciprocal, instruction throughput, and a
performance ratio for clock time or its reciprocal, clock frequency.
We can ignore the first factor in this problem
since it is 1.0: the
instruction count has not changed.
We are left with determining the performance ratio for CPI
and, since we
are given clock frequencies, the performance ratio for clock
frequencies.

Solution

Suppose the processor in the previous example is redesigned so that


all instructions that initially executed in 5
cycles now execute in 4
cycles.
Due to changes in the circuitry, the clock rate has to be decreased from
2.0 GHz
to 1.9 GHz.
No changes are made to the instruction set.
What is the overall percentage improvement?

The performance ratio for frequencies must be less than 1.0: if other
factors are the same then a slower clock rate
implies worse performance.
So this factor of the improvement ratio must be 1.9/2.0.

For the clocks per instruction, we had a value of 3.7 before the change.
We compute clocks per instruction after
the change as an effective
value:
Value Frequency Product
3 0.5 1.5
4 0.3 1.2
4 0.2 0.8
CPI = 3.5

Now, lower clocks per instruction means higher instruction throughput


and thus better performance, so we
expect this part of the performance
ratio to be greater than 1.0; that is, 3.7/3.5.

Then we have

3.7 1.9 7.03


performance ratio = × = = 1.0043
3.5 2.0 7.0

This is a 0.43% improvement, which is probably not worth the effort.

You might also like