Week 6 A

The document discusses parallel computing approaches including interleaved execution, blocked execution, simultaneous multi-threading, chip multi-processing, and non-uniform memory access. It also covers definitions of fine-grain and coarse-grain parallelism as well as approaches to explicit multithreading.


Parallel Computing Landscape

(CS 526)

Muhammad Awais,

Department of Computer Science,

The University of Lahore
Approaches to Explicit Multithreading
1. Interleaved Execution:
– Fine-grained
– The processor manages two or more thread contexts
– Switches threads at each clock cycle
– If a thread is blocked, it is skipped

2. Blocked Execution:
– Coarse-grained
– A thread executes until an event causes a delay
• E.g., a cache miss
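The two policies above can be contrasted with a toy scheduler model. This is only a sketch, not how hardware implements thread contexts; the instruction traces and the `"MISS"` marker are assumptions invented for illustration (and the interleaved model omits the skipping of blocked threads for brevity).

```python
def interleaved(threads):
    """Fine-grained: issue one instruction per thread per round,
    switching threads every cycle."""
    issued, pos = [], [0] * len(threads)
    while any(p < len(t) for p, t in zip(pos, threads)):
        for i, t in enumerate(threads):
            if pos[i] < len(t):
                issued.append((i, t[pos[i]]))
                pos[i] += 1
    return issued

def blocked(threads):
    """Coarse-grained: run one thread until a delay event
    (a cache MISS here), then switch to the next thread."""
    issued, pending, i = [], [list(t) for t in threads], 0
    while any(pending):
        t = pending[i]
        while t:
            op = t.pop(0)
            issued.append((i, op))
            if op == "MISS":
                break
        i = (i + 1) % len(pending)
    return issued

traces = [["a1", "MISS", "a2"], ["b1", "b2"]]
print(interleaved(traces))  # switches thread on every cycle
print(blocked(traces))      # thread 0 runs until its MISS
```

Comparing the two issue orders makes the trade-off visible: interleaving alternates threads every cycle, while blocked execution only switches when thread 0 stalls.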
Definitions: Granularity
• Computation / Communication Ratio:
– In parallel computing, granularity is a qualitative
measure of the ratio of computation to
communication
– Periods of computation are typically separated from
periods of communication by synchronization events.

1. Fine-grain parallelism
2. Coarse-grain parallelism
Fine-grain Parallelism
• Relatively small amounts of computational work are done between
communication events
• Low computation to communication ratio
• Implies high communication overhead and less opportunity for
performance enhancement
• If granularity is too fine, the overhead required for
communication and synchronization between tasks can take
longer than the computation itself.
Coarse-grain Parallelism
• Relatively large amounts of computational work are done
between communication/synchronization events

• High computation to communication ratio

• Implies more opportunity for performance increase

• Harder to load balance efficiently
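A toy cost model makes the trade-off concrete. The unit costs below are invented for illustration, not measured: each unit of work costs `t_compute`, and every task pays one fixed communication/synchronization cost `t_comm` regardless of its size.

```python
def run_time(total_work, grain, t_compute=1.0, t_comm=50.0):
    """Serialized cost model: total_work units of computation plus
    one fixed communication cost per task of `grain` units."""
    n_tasks = total_work / grain
    return total_work * t_compute + n_tasks * t_comm

fine = run_time(10_000, grain=10)       # 1,000 tasks: overhead dominates
coarse = run_time(10_000, grain=1_000)  # 10 tasks: overhead amortized
print(fine, coarse)  # 60000.0 10500.0
```

In this model the per-task computation/communication ratio is `grain * t_compute / t_comm`, so shrinking the grain below the communication cost makes overhead, not useful work, the dominant term.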


Approaches to Explicit Multithreading
3. Simultaneous Multi-Threading (SMT)
– Instructions are issued simultaneously from multiple
threads to the execution units of a superscalar processor
(one having multiple units for decoding and execution)

• Example (SMT):
– Intel calls it hyper-threading
– SMT with support for two or more threads
– A single multithreaded processor logically appears
as two processors
SMT Examples
4. Chip Multi-Processing (CMP)
– A complete processor is replicated on a single
chip (e.g., Multi-core processor)
– Each processor handles separate threads
Taxonomy of Processor Architectures
Tightly Coupled - NUMA
• Non-Uniform Memory Access (NUMA)
– Access times to different regions of memory differ
[Figure: CPU topology of a SunFire X4600M2 NUMA machine]
Non-uniform Memory Access
(NUMA)
• Non-uniform memory access
– All processors have access to all parts of memory
– Access time of processor differs depending on
region of memory
– Different processors access different regions of
memory at different speeds

• Cache-coherent NUMA (cc-NUMA)
– Cache coherence is maintained among the caches
of the various processors
Motivation (Why NUMA)
• SMP has a practical limit on the number of processors
– Bus traffic limits it to between 16 and 64 processors

• In clusters, each node has its own memory:
– Applications do not see a large global memory
– Coherence is maintained by software, not hardware

• NUMA retains the SMP flavour while giving large-scale
multiprocessing
– e.g., Silicon Graphics Origin NUMA machines
CC-NUMA Organization
CC-NUMA Operation
• Each processor has own L1 and L2 cache
• Each node has own main memory
• Nodes connected by some networking facility
• Each processor sees single addressable memory
• Hardware support for read/write to non-local
memories, cache coherency

• Memory request order:
1. L1 cache → L2 cache (local to the processor)
2. Main memory (local to the node)
3. Remote memory
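The lookup order above can be sketched as a simple function. The latency values are illustrative assumptions, not figures from any real machine.

```python
# Hypothetical latencies in cycles (assumed for illustration only).
LATENCY = {"L1": 4, "L2": 12, "local_mem": 100, "remote_mem": 300}

def access_cost(addr, l1, l2, local_mem):
    """Resolve a memory request in CC-NUMA order:
    L1 cache -> L2 cache -> node-local main memory -> remote node."""
    if addr in l1:
        return LATENCY["L1"]
    if addr in l2:
        return LATENCY["L2"]
    if addr in local_mem:
        return LATENCY["local_mem"]
    return LATENCY["remote_mem"]

print(access_cost(0x10, l1={0x10}, l2=set(), local_mem=set()))  # 4
print(access_cost(0x20, l1=set(), l2=set(), local_mem={0x20}))  # 100
print(access_cost(0x30, l1=set(), l2=set(), local_mem=set()))   # 300
```

The widening gap between the local and remote cases is exactly the "non-uniform" part of NUMA, and it motivates the caution about remote-memory traffic in the next slide.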
NUMA Pros & Cons
• Effective performance at higher levels of parallelism
than SMP

• No major software changes required

• Performance can break down if there is too much access
to remote memory
Distributed Memory / Message Passing
• Each processor has access to its own memory only

• Data transfer between processors is explicit; the user calls
message-passing functions

• Common libraries for message passing: MPI, PVM, etc.

• The user has complete control of, and responsibility for,
data placement and management

[Figure: three CPU-memory pairs connected by an interconnection network]


Hybrid Systems
• Distributed memory system with multiprocessor shared-memory
nodes

• The most common architecture for the current generation of
parallel machines

[Figure: nodes, each with multiple CPUs sharing a memory and a network
interface, connected by an interconnection network]


Taxonomy of Processor
Architectures
Loosely Coupled - Clusters
• A collection of independent uni-processor systems or
SMPs

• Interconnected to form a cluster

• Communication via fixed paths or network connections

• No single shared memory

Introduction to Clusters
• Alternative to SMP
• High performance
• High availability
• A group of interconnected whole computers
• Working together as a unified resource
• Illusion of being one big machine
• Each computer called a node
Cluster Benefits
• Scalability
• Superior price/performance ratio
Cluster System Architecture
Cluster Middleware
• Unified image to user
– Single system image
• Single point of entry
• Single file hierarchy
• Single job management system
• Single user interface
• Single I/O space
Cluster vs. SMP
• Both provide multiprocessor support
• Both available commercially

• SMPs:
– Easier to manage and control
– Closer to single processor systems:
• Scheduling is main difference
• Less physical space required
• Lower power consumption
Cluster vs. SMP
• Clustering:
– Superior incremental scalability
– Superior availability
• Redundancy
Introduction to Grid Computing
What is a Grid?

• Many definitions exist in the literature:

• "A computational grid is a hardware and software infrastructure
that provides dependable, consistent, pervasive (large
infrastructure), and inexpensive access to high-end computational
facilities"
– Foster and Kesselman, 1998
3-point checklist (Foster 2002)
1. Coordinates resources that are not subject to
centralized control

2. Uses standard, open, general-purpose protocols
and interfaces

3. Delivers non-trivial qualities of service (QoS)
• e.g., response time, throughput, availability,
security
Grid Architecture

Autonomous, globally distributed computers/clusters


Some of the Major Grid Projects
• EuroGrid / Grid Interoperability (GRIP) (eurogrid.org; European Union):
create technology for remote access to supercomputing resources and
simulation codes; in GRIP, integrate with the Globus Toolkit™
• Globus Project™ (globus.org; DARPA, DOE, NSF, NASA, Microsoft):
research on Grid technologies; development and support of the Globus
Toolkit™; application and deployment
• GridLab (gridlab.org; European Union): Grid technologies and
applications
Grid Simulation tools
• GridSim: job scheduling
• SimGrid: single-client, multi-server scheduling
• Bricks: scheduling
• GangSim: Ganglia Virtual Organization (VO) simulation
• OptorSim: Data Grid simulations
• G3S (Grid Security Services Simulator): security services
