+
William Stallings
Computer Organization and Architecture
9th Edition
+
Chapter 2
Computer Evolution and Performance
+
History of Computers
First Generation: Vacuum Tubes
ENIAC
Electronic Numerical Integrator And Computer
Designed and constructed at the University of Pennsylvania
Started in 1943 – completed in 1946
By John Mauchly and John Eckert
World’s first general purpose electronic digital computer
Army’s Ballistics Research Laboratory (BRL) needed a way to supply trajectory tables for
new weapons accurately and within a reasonable time frame
Was not finished in time to be used in the war effort
Its first task was to perform a series of calculations that were used to help determine the
feasibility of the hydrogen bomb
Continued to operate under BRL management until 1955 when it was disassembled
ENIAC
Weighed 30 tons
Occupied 1500 square feet of floor space
Contained more than 18,000 vacuum tubes
140 kW power consumption
Capable of 5000 additions per second
Decimal rather than binary machine
Memory consisted of 20 accumulators, each capable of holding a 10 digit number
Major drawback was the need for manual programming by setting switches and plugging/unplugging cables
+
John von Neumann
EDVAC (Electronic Discrete Variable Computer)
First publication of the idea was in 1945
Stored program concept
Attributed to ENIAC designers, most notably the mathematician
John von Neumann
Program represented in a form suitable for storing in memory
alongside the data
IAS computer
Princeton Institute for Advanced Studies
Prototype of all subsequent general-purpose computers
Completed in 1952
Structure of von Neumann Machine
+
IAS Memory Formats
The memory of the IAS consists of 1000 storage locations (called words) of 40 bits each
Both data and instructions are stored there
Numbers are represented in binary form and each instruction is a binary code
+
Structure of IAS Computer
+
Registers
Memory buffer register (MBR)
• Contains a word to be stored in memory or sent to the I/O unit
• Or is used to receive a word from memory or from the I/O unit
Memory address register (MAR)
• Specifies the address in memory of the word to be written from or read into the MBR
Instruction register (IR)
• Contains the 8-bit opcode of the instruction being executed
Instruction buffer register (IBR)
• Employed to temporarily hold the right-hand instruction from a word in memory
Program counter (PC)
• Contains the address of the next instruction pair to be fetched from memory
Accumulator (AC) and multiplier quotient (MQ)
• Employed to temporarily hold operands and results of ALU operations
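As a rough illustration of how one memory word feeds these registers, the sketch below splits a 40-bit word into a left and a right instruction. The 40-bit word and the 8-bit opcode come from the slides; the 20-bit instruction size, the 12-bit address field, and all function names are assumptions made for this example.

```python
WORD_BITS = 40      # from the slides: each word is 40 bits
INSTR_BITS = 20     # assumption: two instructions per word
OPCODE_BITS = 8     # from the slides: the IR holds an 8-bit opcode
ADDR_BITS = 12      # assumption: the rest of each 20-bit instruction


def split_instruction(instr):
    """Return (opcode, address) for one 20-bit instruction."""
    opcode = instr >> ADDR_BITS
    address = instr & ((1 << ADDR_BITS) - 1)
    return opcode, address


def fetch_pair(word):
    """Split a 40-bit word into its left and right instructions.

    In IAS terms the left opcode would go to the IR, the left address
    to the MAR, and the right instruction would wait in the IBR.
    """
    if not 0 <= word < (1 << WORD_BITS):
        raise ValueError("word must fit in 40 bits")
    left = word >> INSTR_BITS
    right = word & ((1 << INSTR_BITS) - 1)
    return split_instruction(left), split_instruction(right)


# Hypothetical word: left = opcode 0x01, address 0x0FA;
# right = opcode 0x05, address 0x0FB.
word = (0x01 << 32) | (0x0FA << 20) | (0x05 << 12) | 0x0FB
(left_op, left_addr), (right_op, right_addr) = fetch_pair(word)
```

The opcode values used here are placeholders, not entries from the IAS instruction set.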
+
Commercial Computers
UNIVAC
1947 – Eckert and Mauchly formed the Eckert-Mauchly
Computer Corporation to manufacture computers commercially
UNIVAC I (Universal Automatic Computer)
First successful commercial computer
Was intended for both scientific and commercial applications
Commissioned by the US Bureau of Census for 1950 calculations
The Eckert-Mauchly Computer Corporation became part of the
UNIVAC division of the Sperry-Rand Corporation
UNIVAC II – delivered in the late 1950’s
Had greater memory capacity and higher performance
Backward compatible
+
IBM
Was the major manufacturer of punched-card processing equipment
Delivered its first electronic stored-program computer (701) in 1953
Intended primarily for scientific applications
Introduced 702 product in 1955
Hardware features made it suitable to business applications
Series of 700/7000 computers established IBM as the overwhelmingly dominant computer manufacturer
+
History of Computers
Second Generation: Transistors
Smaller
Cheaper
Dissipates less heat than a vacuum tube
Is a solid state device made from silicon
Was invented at Bell Labs in 1947
It was not until the late 1950’s that fully transistorized
computers were commercially available
Table 2.2
Computer Generations
+
Second Generation Computers
Introduced:
More complex arithmetic and logic units and control units
The use of high-level programming languages
Provision of system software which provided the ability to:
load programs
move data to peripherals and libraries
perform common computations
Appearance of the Digital Equipment Corporation (DEC) in 1957
PDP-1 was DEC’s first computer
This began the mini-computer phenomenon that would become so prominent in the third generation
History of Computers
Third Generation: Integrated Circuits
1958 – the invention of the integrated circuit
Discrete component
Single, self-contained transistor
Manufactured separately, packaged in their own containers, and
soldered or wired together onto masonite-like circuit boards
Manufacturing process was expensive and cumbersome
The two most important members of the third generation
were the IBM System/360 and the DEC PDP-8
+
Microelectronics
+
Integrated Circuits
A computer consists of gates, memory cells, and interconnections among these elements
The gates and memory cells are constructed of simple digital electronic components
Data storage – provided by memory cells
Data processing – provided by gates
Data movement – the paths among components are used to move data from memory to memory and from memory through gates to memory
Control – the paths among components can carry control signals
Exploits the fact that such components as transistors, resistors, and conductors can be fabricated from a semiconductor such as silicon
Many transistors can be produced at the same time on a single wafer of silicon
Transistors can be connected with a process of metallization to form circuits
+
Wafer, Chip, and Gate Relationship
+
Chip Growth
Moore’s Law
1965; Gordon Moore – co-founder of Intel
Observed number of transistors that could be put on a single chip was doubling every year
The pace slowed to a doubling every 18 months in the 1970’s but has sustained that rate ever since
Consequences of Moore’s law:
The cost of computer logic and memory circuitry has fallen at a dramatic rate
The electrical path length is shortened, increasing operating speed
Computer becomes smaller and is more convenient to use in a variety of environments
Reduction in power and cooling requirements
Fewer interchip connections
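To make the two doubling rates concrete, here is a minimal sketch of the growth arithmetic. The function name and the starting count of 1,000 transistors are hypothetical; only the doubling periods (one year, then 18 months) come from the slide.

```python
def projected_transistors(initial_count, years, doubling_period_years):
    """Project a transistor count assuming steady exponential doubling."""
    return initial_count * 2 ** (years / doubling_period_years)


# Hypothetical chip with 1,000 transistors, projected six years ahead.
yearly = projected_transistors(1_000, 6, 1.0)   # doubling every year
slower = projected_transistors(1_000, 6, 1.5)   # doubling every 18 months
```

Over the same six years the yearly rate gives six doublings and the 18-month rate gives four, so the projected counts differ by a factor of four.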
+
Table 2.4 Characteristics of the System/360 Family
Table 2.5 Evolution of the PDP-8
+
DEC - PDP-8 Bus Structure
+
Later Generations
LSI: Large Scale Integration
VLSI: Very Large Scale Integration
ULSI: Ultra Large Scale Integration
Semiconductor Memory
Microprocessors
+
Semiconductor Memory
In 1970 Fairchild produced the first relatively capacious semiconductor memory
Chip was about the size of a single core
Could hold 256 bits of memory
Non-destructive
Much faster than core
In 1974 the price per bit of semiconductor memory dropped below the price per bit of core memory
There has been a continuing and rapid decline in memory cost accompanied by a corresponding increase in physical memory density
Developments in memory and processor technologies changed the nature of computers in less than a decade
Since 1970 semiconductor memory has been through 13 generations
Each generation has provided four times the storage density of the previous generation, accompanied by declining cost per bit and declining access time
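The fourfold-per-generation claim compounds quickly, as the short sketch below shows. Treating the 256-bit chip as generation zero is an assumption made for illustration, and the function name is hypothetical.

```python
def density_after(generations, base_bits):
    """Bits per chip after a number of generations, each 4x denser."""
    return base_bits * 4 ** generations


BASE_BITS = 256  # the 1970 chip capacity; used as generation zero (assumption)

# Capacity at each step from generation 0 through generation 13.
capacities = [density_after(g, BASE_BITS) for g in range(14)]
```

Under that assumption, 13 quadruplings multiply the capacity by 4^13 = 2^26, taking 2^8 bits to 2^34 bits per chip.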
+
Microprocessors
The density of elements on processor chips continued to rise
More and more elements were placed on each chip so that fewer
and fewer chips were needed to construct a single computer
processor
1971 Intel developed 4004
First chip to contain all of the components of a CPU on a single
chip
Birth of microprocessor
1972 Intel developed 8008
First 8-bit microprocessor
1974 Intel developed 8080
First general purpose microprocessor
Faster, has a richer instruction set, has a large addressing
capability
Evolution of Intel Microprocessors
a. 1970s Processors
b. 1980s Processors
c. 1990s Processors
d. Recent Processors
+
Microprocessor Speed
Techniques built into contemporary processors include:
Pipelining
• Processor moves data or instructions into a conceptual pipe with all stages of the pipe processing simultaneously
Branch prediction
• Processor looks ahead in the instruction code fetched from memory and predicts which branches, or groups of instructions, are likely to be processed next
Data flow analysis
• Processor analyzes which instructions are dependent on each other’s results, or data, to create an optimized schedule of instructions
Speculative execution
• Using branch prediction and data flow analysis, some processors speculatively execute instructions ahead of their actual appearance in the program execution, holding the results in temporary locations, keeping execution engines as busy as possible
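The benefit of pipelining can be estimated with a simple cycle count. This is an idealized sketch: it assumes every stage takes exactly one clock cycle and that no stalls or mispredicted branches occur, and the function names are made up for the example.

```python
def cycles_unpipelined(n_instructions, n_stages):
    """Each instruction passes through every stage before the next starts."""
    return n_instructions * n_stages


def cycles_pipelined(n_instructions, n_stages):
    """The first instruction fills the pipe; each later one adds one cycle."""
    if n_instructions == 0:
        return 0
    return n_stages + (n_instructions - 1)


def ideal_speedup(n_instructions, n_stages):
    """Ratio of unpipelined to pipelined cycle counts."""
    return (cycles_unpipelined(n_instructions, n_stages)
            / cycles_pipelined(n_instructions, n_stages))


# Hypothetical workload: 100 instructions on a 5-stage pipeline.
before = cycles_unpipelined(100, 5)
after = cycles_pipelined(100, 5)
gain = ideal_speedup(100, 5)
```

As the instruction count grows, the ideal speedup approaches the number of stages. Real processors fall short of this, which is one reason branch prediction and speculative execution matter.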
+
Performance Balance
Adjust the organization and architecture to compensate for the mismatch among the capabilities of the various components
Architectural examples include:
Increase the number of bits that are retrieved at one time by making DRAMs “wider” rather than “deeper” and by using wide bus data paths
Reduce the frequency of memory access by incorporating increasingly complex and efficient cache structures between the processor and main memory
Change the DRAM interface to make it more efficient by including a cache or other buffering scheme on the DRAM chip
Increase the interconnect bandwidth between processors and memory by using higher speed buses and a hierarchy of buses to buffer and structure data flow
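One common way to quantify how a cache reduces the cost of memory access is the average access time: hit time plus miss rate times miss penalty. The slides do not give this formula, so treat it as a standard textbook model brought in for illustration. All timing values below are hypothetical.

```python
def average_access_time(hit_time_ns, miss_rate, miss_penalty_ns):
    """Average time per memory access for a single cache level."""
    if not 0.0 <= miss_rate <= 1.0:
        raise ValueError("miss_rate must be between 0 and 1")
    return hit_time_ns + miss_rate * miss_penalty_ns


# Hypothetical numbers: 1 ns cache hit, 100 ns penalty to reach main memory.
small_cache = average_access_time(1.0, 0.10, 100.0)  # 10% of accesses miss
better_cache = average_access_time(1.0, 0.02, 100.0) # 2% of accesses miss
```

With these numbers, lowering the miss rate cuts the average access time sharply even though main memory itself is no faster.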
Typical I/O Device Data Rates
+
Improvements in Chip
Organization and Architecture
Increase hardware speed of processor
Fundamentally due to shrinking logic gate size
More gates, packed more tightly, increasing clock rate
Propagation time for signals reduced
Increase size and speed of caches
Dedicating part of processor chip
Cache access times drop significantly
Change processor organization and architecture
Increase effective speed of instruction execution
Parallelism
+
Problems with Clock Speed and Logic Density
Power
Power density increases with density of logic and clock speed
Dissipating heat
RC delay
Speed at which electrons flow limited by resistance and
capacitance of metal wires connecting them
Delay increases as RC product increases
Wire interconnects thinner, increasing resistance
Wires closer together, increasing capacitance
Memory latency
Memory speeds lag processor speeds
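The RC delay points above can be sketched numerically. The model below treats a wire as a uniform conductor with resistance R = ρL/(w·t) and takes the delay as the product R·C. The dimensions, the capacitance, and the function names are all hypothetical; the resistivity is an approximate value for copper.

```python
def wire_resistance(resistivity_ohm_m, length_m, width_m, thickness_m):
    """Resistance of a uniform wire: R = rho * L / (w * t)."""
    return resistivity_ohm_m * length_m / (width_m * thickness_m)


def rc_delay(resistance_ohm, capacitance_f):
    """Delay taken as the RC product, in seconds."""
    return resistance_ohm * capacitance_f


COPPER = 1.68e-8        # ohm-metres, approximate resistivity of copper
CAPACITANCE = 1e-15     # farads, hypothetical and held fixed here

wide = wire_resistance(COPPER, 1e-3, 100e-9, 100e-9)
thin = wire_resistance(COPPER, 1e-3, 50e-9, 100e-9)   # half the width

delay_wide = rc_delay(wide, CAPACITANCE)
delay_thin = rc_delay(thin, CAPACITANCE)
```

Halving the wire width doubles the resistance and therefore the delay at fixed capacitance. Packing wires closer together would raise the capacitance as well, which this sketch does not model.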
+
Processor Trends
Multicore
The use of multiple processors on the same chip provides the potential to increase performance without increasing the clock rate
Strategy is to use two simpler processors on the chip rather than one more complex processor
With two processors larger caches are justified
As caches became larger it made performance sense to create two and then three levels of cache on a chip
+
Many Integrated Core (MIC)
Graphics Processing Unit (GPU)
MIC
Leap in performance as well as the challenges in developing software to exploit such a large number of cores
The multicore and MIC strategy involves a homogeneous collection of general purpose processors on a single chip
GPU
Core designed to perform parallel operations on graphics data
Traditionally found on a plug-in graphics card, it is used to encode and render 2D and 3D graphics as well as process video
Used as vector processors for a variety of applications that require repetitive computations
+
Overview
Intel x86 Architecture (CISC)
Results of decades of design effort on complex instruction set computers (CISCs)
Excellent example of CISC design
Incorporates the sophisticated design principles once found only on mainframes and supercomputers
In terms of market share Intel is ranked as the number one maker of microprocessors for non-embedded systems
ARM (RISC)
An alternative approach to processor design is the reduced instruction set computer (RISC)
The ARM architecture is used in a wide variety of embedded systems and is one of the most powerful and best designed RISC based systems on the market
x86 Evolution
8080
First general purpose microprocessor
8-bit machine with an 8-bit data path to memory
Used in the first personal computer (Altair)
8086
16-bit machine
Used an instruction cache, or queue
First appearance of the x86 architecture
8088
Used in IBM’s first personal computer
+
80286
Enabled addressing a 16-MByte memory
instead of just 1 MByte
80386
Intel’s first 32-bit machine
First Intel processor to support multitasking
80486
More sophisticated cache technology and
instruction pipelining
Built-in math coprocessor
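The jump from 1 MByte to 16 MBytes follows from the number of address bits: each extra bit doubles the addressable memory. The sketch below assumes byte addressing, with 20 address bits for the 1-MByte case and 24 for the 80286. Those widths are inferred from the capacities on the slide rather than stated there, and the function name is hypothetical.

```python
MBYTE = 2 ** 20  # bytes in one MByte


def addressable_bytes(address_bits):
    """Bytes reachable with the given number of address bits."""
    return 2 ** address_bits


earlier_mbytes = addressable_bytes(20) // MBYTE   # assumed 20-bit addressing
i286_mbytes = addressable_bytes(24) // MBYTE      # assumed 24-bit addressing
```

Four additional address bits give 2^4 = 16 times the address space, which matches the 1-MByte to 16-MByte increase.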
x86 Evolution - Pentium
Pentium
• Superscalar
• Multiple instructions executed in parallel
Pentium Pro
• Increased superscalar organization
• Aggressive register renaming
• Branch prediction
• Data flow analysis
• Speculative execution
Pentium II
• MMX technology
• Designed specifically to process video, audio, and graphics data
Pentium III
• Additional floating-point instructions to support 3D graphics software
Pentium 4
• Includes additional floating-point and other enhancements for multimedia
x86 Evolution (continued)
Core
First Intel x86 microprocessor with a dual core, referring to the implementation of two processors on a single chip
Core 2
Extends the architecture to 64 bits
Recent Core offerings have up to 10 processors per chip
Instruction set architecture is backward compatible with earlier versions
x86 architecture continues to dominate the processor market outside of embedded systems