
Unit V Digital Signal Processors

This document outlines the course structure for 'Digital Signal Processing Systems' at R.M.K. Engineering College, detailing objectives, prerequisites, syllabus, and course outcomes. It includes a comprehensive lecture plan, assessment schedule, and mapping of course outcomes with program outcomes. Additionally, it covers essential topics such as digital signal processors, filter design, and various DSP architectures.

Uploaded by

sribalaji1608
Copyright © All Rights Reserved
Please read this disclaimer before proceeding:
This document is confidential and intended solely for the educational purpose of
RMK Group of Educational Institutions. If you have received this document
through email in error, please notify the system manager. This document
contains proprietary information and is intended only to the respective group /
learning community as intended. If you are not the addressee you should not
disseminate, distribute or copy through e-mail. Please notify the sender
immediately by e-mail if you have received this document by mistake and delete
this document from your system. If you are not the intended recipient you are
notified that disclosing, copying, distributing or taking any action in reliance on
the contents of this information is strictly prohibited.

R.M.K. ENGINEERING COLLEGE

22EC305 – Digital Signal Processing Systems

Department : Electronics Engineering (VLSI Design and Technology)

Batch/Year : 2023-2027 / II year

Created by : Dr. M.S.Kavitha/ASP

Date : 1.10.2024

Table of Contents

1 Course Objectives
2 Pre-Requisites
3 Syllabus
4 Course Outcomes
5 CO-PO/PSO Mapping
6 Unit 5 – Digital Signal Processors
  6.1 Lecture Plan
  6.2 Activity Based Learning
  6.3 Lecture Notes
      ➢ General DSP Architecture
      ➢ Fixed vs. Floating
      ➢ TMS320 Family Overview
      ➢ Addressing Modes
      ➢ Programming
      ➢ Circular Buffering
  6.4 Assignments
  6.5 Part A Q & A
  6.6 Part B Qs
  6.7 Supportive Online Certification Courses
  6.8 Real-Time Applications in Day-to-Day Life and to Industry
  6.9 Contents Beyond the Syllabus
7 Assessment Schedule
8 Prescribed Text Books & Reference Books
9 Mini Project Suggestions

1. COURSE OBJECTIVES

OBJECTIVES:
▪ To examine LTI systems using the Z Transform.

▪ To learn the Discrete Fourier Transform and the Fast Fourier Transform.

▪ To design and implement digital filters.

▪ To describe the characteristics of IIR filters and design IIR filters for given
specifications.

▪ To become familiar with the different design methods available for FIR filters and
their realization structures.

▪ To classify the characteristics and architectural features of Digital Signal Processors.

2. PRE-REQUISITES

22MA201 TRANSFORMS AND NUMERICAL METHODS

In this prerequisite course, the student learns to use the Laplace Transform and
the Z-Transform to solve differential and difference equations.

3. SYLLABUS

22EC305 DIGITAL SIGNAL PROCESSING SYSTEMS    L T P C  3 0 0 3

UNIT I LINEAR TIME INVARIANT DISCRETE TIME SYSTEMS 9

Difference Equations – Block Diagram Representation – Impulse Response –
Convolution – Linear and Circular Convolution – Z Transform analysis of Discrete
Time System – Properties of Z Transform.

UNIT II DISCRETE FOURIER TRANSFORM 9

Discrete Fourier transform (DFT) and its properties – periodicity, symmetry and
circular convolution. FFT Algorithm – Radix-2 DIT FFT, Radix-2 DIF FFT – overlap
save and overlap add method.

UNIT III INFINITE IMPULSE RESPONSE FILTERS 9

Analog filters – Butterworth filters, Chebyshev Type I filters (up to 2nd order) –
Transformation of analog filters into equivalent digital filters using Bilinear Z
Transform method – Realization structures for IIR filters – direct, cascade and
parallel forms.

UNIT IV FINITE IMPULSE RESPONSE FILTERS 9

Design of linear phase FIR filters using Fourier series and windowing method –
Rectangular, Hamming and Hanning window – Realization structures for FIR filters
– Transversal and linear phase structures – Comparison of FIR and IIR Filters.

UNIT V DIGITAL SIGNAL PROCESSORS 9

DSP Architectures: Harvard, Von Neumann, VLIW – Types of Digital Signal
Processors – Pipelining – Multiply and Accumulate Unit – TMS 320C5X DSP
Architecture and addressing modes.

TOTAL : 45 PERIODS

4. COURSE OUTCOMES

After successful completion of the course, the students should be able to

Course Outcome  Level in Bloom’s Taxonomy  Description
C305.1          K3                         Interpret Discrete Time LTI Systems using Z Transform
C305.2          K3                         Determine DFT for the given Sequence
C305.3          K3                         Apply the Fast Fourier Transform (FFT) for discrete time signals
C305.4          K3                         Realize IIR filters for given specification
C305.5          K3                         Realize FIR Filters using different methods
C305.6          K2                         Summarize the characteristics and architectural features of Digital Signal Processors

5. CO- PO/PSO Mapping

MAPPING OF COURSE OUTCOMES WITH PROGRAM OUTCOMES:

Program Outcomes: PO-1 to PO-12. Program Specific Outcomes: PSO-1 to PSO-3.

Course     Level
Outcome    of CO   PO-1  PO-2  PO-3  PO-4  PO-5      PO-6  PO-7  PO-8  PO-9  PO-10  PO-11  PO-12  PSO-1  PSO-2  PSO-3
PO level           K3    K4    K4    K5    K3,K5,K6  A3    A2    A3    A3    A3     A3     A2     K5     K5     K3
C305.1     K3      3     3     2     2     2         1     -     -     -     -      1      1      -      2      3
C305.2     K3      3     3     2     2     2         1     -     -     -     -      1      1      -      2      3
C305.3     K3      3     3     2     2     2         1     -     -     -     -      1      1      -      2      3
C305.4     K3      3     3     2     2     2         1     -     -     -     -      1      1      -      2      3
C305.5     K3      2     3     3     2     2         1     -     -     -     -      1      1      -      1      2
C305.6     K2      2     3     3     2     2         1     -     -     -     -      1      1      -      1      2
C305               3     3     2     2     2         1     -     -     -     -      1      1      -      2      3

UNIT-5
DIGITAL SIGNAL PROCESSORS

6.1 LECTURE PLAN
UNIT V – DIGITAL SIGNAL PROCESSORS

S.No  No. of Periods  Pertaining CO  Taxonomy level  Mode of Delivery  Topic
1     1               CO6            K2              PPT               Types of Digital Signal Processors and Applications
2     1               CO6            K2              PPT               Functionalities of Digital Signal Processors
3     1               CO6            K2              PPT               MAC & Circular Buffering
4     1               CO6            K2              PPT               Architecture of TMS 320C5X
5     1               CO6            K2              PPT               Architecture of TMS 320C54X
6     1               CO6            K2              PPT               VLIW Architecture
7     1               CO6            K2              PPT               Addressing Modes
8     1               CO6            K2              PPT               Instruction Sets - Introduction
9     1               CO6            K3              PPT               Programming - Introduction

Proposed Date, Actual Date, Reason for Deviation: left blank.

Total No. of Periods : 09

6.2 ACTIVITY BASED LEARNING TO EXPERIENCE
IMAGE PROCESSING – APPLICATION OF DSP
Take an 8½ x 11 inch sheet of paper, fold it and cut it as shown below.

Now have a partner hold the paper from the top. Keep your fingers apart around the
paper, about two inches below your partner’s fingers. Ask your partner to let go of
the paper at a random moment. Are you able to catch the paper with your fingers
before it passes them?

If the answer is no, here is why. Your brain uses image processing to determine
when your partner lets go of the paper and then sends signals to your hand for your
fingers to catch it. However, there is a delay between when your brain sends the
signal and when your muscles act. That delay is long enough to allow the paper to
slip past your fingers. The whole process, from “seeing” the paper fall to “acting”,
takes a minimum time that we cannot reduce. The event is faster than we can
handle, which is why a shorter processing time matters.

6.3 Lecture Notes
Unit – 5 Digital Signal Processors

DSP Architectures: Harvard, Von Neumann, VLIW – Types of Digital Signal
Processors – Pipelining – Multiply and Accumulate Unit – TMS 320C5X DSP
Architecture and addressing modes.

UNIT-5 DIGITAL SIGNAL PROCESSORS

What is a DSP Processor?

• It is a type of processor generally used to process real-time data.
• DSP applications such as convolution and correlation need array
multiplication.
• In such cases, each multiplication must be completed before the next input
sample in the array arrives.
• Most DSP algorithms involve repetitive arithmetic operations such as
multiply and add, multiple memory accesses, and heavy data flow through the CPU.
• Performing these functions in time requires an advanced DSP architecture.

What is a DSP?

A digital signal processor (DSP) is an integrated circuit designed for high-speed
data manipulation, and is used in audio, communications, image manipulation, and
other data-acquisition and data-control applications.

• A specialized microprocessor for real-time DSP applications
• Digital filtering (FIR and IIR)
• FFT
• Convolution, matrix multiplication, etc.

Hardware used in Digital Signal Processing

                   ASIC       FPGA    GPP     DSP
Performance        Very high  High    Medium  Medium-high
Flexibility        Very low   High    High    High
Power consumption  Very low   Low     Medium  Low-medium
Development time   Long       Medium  Short   Short

5.1 GENERAL DSP ARCHITECTURE

DSP Blocks
The internal hardware of a digital signal processor consists of many
blocks:
1. CPU
2. Arithmetic Logic Unit (ALU)
3. Accumulators
4. Barrel shifter
5. Multiplier unit
6. Compare Select and Store Unit ( CSSU )
7. Memory cache
8. DMA controller

DSP memory architecture


The DSP architecture is of three types:
Von Neumann Architecture
Harvard Architecture
Super Harvard Architecture (SHARC)

Von Neumann Architecture

• Von Neumann architecture contains a single memory and a single bus for
transferring data into and out of the central processing unit (CPU).
• Multiplying two numbers requires at least three clock cycles, because three
items must cross the single bus one after another: the two numbers to be
multiplied and the program instruction. We don't count the time to transfer the
result back to memory, because we assume that it remains in the CPU for
additional manipulation (such as the sum of products in an FIR filter).
• The Von Neumann design is quite satisfactory when you are content to
execute all of the required tasks in serial.
Harvard Architecture
• It has separate memories for data and program instructions, with
separate buses for each. Since the buses operate independently, program
instructions and data can be fetched at the same time, improving the
speed over the single bus design.
• This architecture increases the speed of computation as compared to
Von Neumann architecture.

Super Harvard Architecture (SHARC)


• SHARC® DSPs take their name from a contraction of the longer term, Super
Harvard ARChitecture.
• SHARC DSPs are optimized by the addition of an instruction cache and an I/O
controller.

INSTRUCTION CACHE
• DSP algorithms generally spend most of their execution time in loops.
• The same set of program instructions will continually pass from program
memory to the CPU.
• By including an instruction cache in the CPU we can speed up the execution.

I/O CONTROLLER
• The SHARC DSPs provide both serial and parallel communications ports.
• These are extremely high-speed connections.
• For example, at a 40 MHz clock speed, there are two serial ports that operate at
40 Mbits/second each.
• Thus the I/O ports help in faster execution.

Common DSP features

• Harvard architecture
• Dedicated single-cycle Multiply-Accumulate (MAC) instruction (hardware MAC
units)
• Single-Instruction Multiple-Data (SIMD) and Very Long Instruction Word (VLIW)
architectures
• Pipelining
• Saturation arithmetic
• Zero overhead looping
• Hardware circular addressing
• Cache
• DMA

MULTIPLIER-ACCUMULATOR UNIT ( MAC )

• DSP operations involve many time-consuming multiplications and additions.
• To make real-time operation fast enough, a multiplier-accumulator (MAC) unit
using fixed- or floating-point arithmetic is essential.
• The MAC unit consists of a multiplier with a pair of input registers that hold
the inputs to the multiplier and a 32-bit product register that holds the result
of a multiplication.
• The output of the product register is connected to a double-precision
accumulator where the products are accumulated.
• Floating-point MACs allow fast computation with minimal errors.

Single-Cycle MAC unit

[Figure: the multiplier forms each product $a_i x_i$; the adder adds it to the running sum held in the register, producing $\sum_{i=0}^{n} a_i x_i$.]

Can compute a sum of n products in n cycles.
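
The multiply-accumulate loop described above can be sketched in software. This is only an illustration of the data flow, assuming an idealised unit that completes one multiply-accumulate per clock cycle; the function name `mac_sum` and the cycle counter are hypothetical and do not model any particular DSP.

```python
def mac_sum(coeffs, samples):
    """Return (sum of coeffs[i] * samples[i], number of MAC steps used)."""
    if len(coeffs) != len(samples):
        raise ValueError("coeffs and samples must have the same length")
    acc = 0      # accumulator register, cleared before the loop
    cycles = 0   # assumption: one multiply-accumulate per clock cycle
    for a, x in zip(coeffs, samples):
        product = a * x   # multiplier output (product register)
        acc += product    # adder feeds the accumulator
        cycles += 1
    return acc, cycles


if __name__ == "__main__":
    # Three products are accumulated in three steps.
    print(mac_sum([1, 2, 3], [4, 5, 6]))  # (32, 3)
```

In an FIR filter, `coeffs` would hold the filter coefficients and `samples` the most recent input samples.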

Single Instruction - Multiple Data (SIMD)
A technique for data-level parallelism that employs a number of processing
elements working in parallel.

Very Long Instruction Word (VLIW)

A technique for instruction-level parallelism that executes instructions without
dependencies (known at compile time) in parallel.
Example of a single VLIW instruction:
F=a+b; c=e/g; d=x&y; w=z*h;

[Figure: the VLIW instruction is split into four operations (F=a+b, c=e/g, d=x&y, w=z*h), each issued to its own processing unit (PU).]

CISC vs. RISC vs. VLIW

Pipelining
An instruction cycle requires four phases:
1. Fetch phase, in which the instruction is fetched from the program memory.
2. Decode phase, in which the instruction is decoded.
3. Memory read phase, in which the operand required for the execution of the
instruction is read from the data memory.
4. Execution phase, in which the instruction is executed and the result is stored
in either one of the registers or memory.

[Figure: instruction cycles of a processor with no pipelining]

[Figure: instruction cycles of a processor with pipelining]
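
The benefit of overlapping the four phases can be estimated with a small calculation. The sketch below assumes that every phase takes exactly one clock cycle and that the pipeline never stalls; both are simplifying assumptions, and the function names are hypothetical.

```python
STAGES = 4  # fetch, decode, memory read, execute


def cycles_without_pipeline(n_instructions, stages=STAGES):
    """Each instruction finishes all its phases before the next one starts."""
    return n_instructions * stages


def cycles_with_pipeline(n_instructions, stages=STAGES):
    """The first instruction fills the pipeline; after that one completes per cycle."""
    if n_instructions == 0:
        return 0
    return stages + (n_instructions - 1)


if __name__ == "__main__":
    n = 5
    print(cycles_without_pipeline(n))  # 20
    print(cycles_with_pipeline(n))     # 8
```

For long instruction sequences the pipelined count approaches one cycle per instruction, which is where the speed-up comes from.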

Saturation Arithmetic

• Fixed range for operations like addition and multiplication
• Results that would overflow or underflow are clamped to the maximum and minimum
allowed value, respectively
• Associativity and distributivity no longer apply
• Examples of saturation arithmetic on one signed byte (range -128 to 127):
• 64 + 69 = 127
• -127 – 5 = -128
• (64 + 70) – 25 = 102 ≠ 64 + (70 – 25) = 109
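
The signed-byte examples above can be reproduced with a few lines of code. This is a minimal sketch of saturation on an 8-bit signed range; the helper names `sat8`, `sat_add` and `sat_sub` are hypothetical.

```python
INT8_MIN, INT8_MAX = -128, 127


def sat8(value):
    """Clamp value to the signed 8-bit range instead of wrapping around."""
    return max(INT8_MIN, min(INT8_MAX, value))


def sat_add(a, b):
    return sat8(a + b)


def sat_sub(a, b):
    return sat8(a - b)


if __name__ == "__main__":
    print(sat_add(64, 69))                # 127 (133 is clamped)
    print(sat_sub(-127, 5))               # -128 (-132 is clamped)
    # Grouping changes the result, so associativity is lost:
    print(sat_sub(sat_add(64, 70), 25))   # 102
    print(sat_add(64, sat_sub(70, 25)))   # 109
```

The last two lines differ because the intermediate sum 64 + 70 is clamped to 127 before 25 is subtracted.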

Zero Overhead Looping

Hardware support for loops with a constant number of iterations using hardware
loop counters and loop buffers
No branching
No loop overhead
No pipeline stalls or branch prediction
No need for loop unrolling

Hardware Circular Addressing
• A data structure implementing a fixed length queue of fixed size objects
where objects are added to the head of the queue while items are removed from
the tail of the queue.
• Requires at least 2 pointers (head and tail)
• Extensively used in digital filtering
[Figure: circular buffer holding X[n], X[n-1], X[n-2] and X[n-3] between the Head and Tail pointers, shown for Cycle 1 and Cycle 2.]

Cache memory

• Separate instruction and data L1 caches (Harvard architecture)


• Cache coherence protocols required, since most systems use DMA

Direct Memory Access (DMA)

The feature that allows peripherals to access main memory without the
intervention of the CPU
Typically, the CPU initiates DMA transfer, does other operations while the transfer
is in progress, and receives an interrupt from the DMA controller once the
operation is complete.
Can create cache coherency problems (the data in the cache may be different
from the data in the external memory after DMA)
Requires a DMA controller

DSP vs Microcontroller

DSP                                   Microcontroller
Harvard architecture                  Mostly von Neumann architecture
VLIW/SIMD (parallel execution units)  Single execution unit
No bit-level operations               Flexible bit-level operations
Hardware MACs                         No hardware MACs
DSP applications                      Control applications

5.2 Fixed vs. Floating Point


Digital Signal Processors can be divided into two categories, fixed
point and floating point. These refer to the format used to store and manipulate
numbers within the devices. Fixed point DSPs usually represent each number with a
minimum of 16 bits, although a different length can be used. For instance, Motorola
manufactures a family of fixed point DSPs that use 24 bits. With 16 bits, there are
four common ways that the $2^{16}$ = 65,536 possible bit patterns can represent a number.
In unsigned integer, the stored number can take on any integer value from 0 to
65,535. Similarly, signed integer uses two's complement to make the range include
negative numbers, from -32,768 to 32,767. With unsigned fraction notation, the
65,536 levels are spread uniformly between 0 and 1. Lastly, the signed
fraction format allows negative numbers, equally spaced between -1 and 1.
In comparison, floating point DSPs typically use a minimum of 32 bits to store each
value. This results in many more bit patterns than for fixed point, $2^{32}$ =
4,294,967,296 to be exact. A key feature of floating point notation is that the
represented numbers are not uniformly spaced. In the most common format
(ANSI/IEEE Std. 754-1985), the largest and smallest numbers are $\pm 3.4 \times 10^{38}$ and
$\pm 1.2 \times 10^{-38}$, respectively. The represented values are unequally spaced between these
two extremes, such that the gap between any two numbers is about ten-million
times smaller than the value of the numbers. This is important because it places
large gaps between large numbers, but small gaps between small numbers.
All floating point DSPs can also handle fixed point numbers, a necessity to
implement counters, loops, and signals coming from the ADC and going to the DAC.
However, this doesn't mean that fixed point math will be carried out as quickly as
the floating point operations; it depends on the internal architecture. For instance,
the SHARC DSPs are optimized for both floating point and fixed point operations,
and execute them with equal efficiency. For this reason, the SHARC devices are
often referred to as "32-bit DSPs," rather than just "Floating Point."
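
The difference between the uniformly spaced signed-fraction format and the unevenly spaced floating point format can be checked numerically. The sketch below is illustrative only: the helper names are hypothetical, the 16-bit signed fraction is taken to be a two's-complement word scaled by 1/32768, and `float32_gap` is intended for positive, finite values.

```python
import struct


def q15_to_float(raw):
    """Interpret a 16-bit two's-complement pattern as a signed fraction in [-1, 1)."""
    if not 0 <= raw <= 0xFFFF:
        raise ValueError("raw must be a 16-bit pattern")
    signed = raw - 0x10000 if raw & 0x8000 else raw
    return signed / 32768.0


def float_to_q15(x):
    """Quantise x to a 16-bit signed-fraction code, saturating at both ends."""
    code = max(-32768, min(32767, round(x * 32768)))
    return code & 0xFFFF


def float32_gap(x):
    """Distance from a positive 32-bit float x to the next larger 32-bit float."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    here = struct.unpack("<f", struct.pack("<I", bits))[0]
    nxt = struct.unpack("<f", struct.pack("<I", bits + 1))[0]
    return nxt - here


if __name__ == "__main__":
    print(q15_to_float(0x4000))      # 0.5
    print(hex(float_to_q15(1.0)))    # 0x7fff (+1.0 is not representable)
    print(float32_gap(1.0))          # 2**-23, about 1.19e-07
    print(float32_gap(1024.0))       # 2**-13, a larger gap for a larger number
```

The fixed point step is always 1/32768, while the floating point gap grows with the magnitude of the number and stays roughly ten million times smaller than the value itself.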

• Fixed point arithmetic is much faster than floating point in general purpose
computers. However, with DSPs the speed is about the same, a result of the
hardware being highly optimized for math operations. The internal architecture of
a floating point DSP is more complicated than for a fixed point device. All the
registers and data buses must be 32 bits wide instead of only 16; the multiplier
and ALU must be able to quickly perform floating point arithmetic, the instruction
set must be larger (so that they can handle both floating and fixed point
numbers), and so on. Floating point (32 bit) has better precision and a higher
dynamic range than fixed point (16 bit) . In addition, floating point programs
often have a shorter development cycle, since the programmer doesn't generally
need to worry about issues such as overflow, underflow, and round-off error.
• On the other hand, fixed point DSPs have traditionally been cheaper than floating
point devices. Nothing changes more rapidly than the price of electronics. Cost is
a key factor in understanding how DSPs are evolving.

• Fixed point – performs integer operations
• Floating point – performs both integer and floating point operations

• Fixed point – TMS320C1x, C2x, C5x …..


• Floating point – TMS320C3x, C4x, C67x ….

5.3 TMS320 Family Overview

• The TMS320 family consists of two types of single-chip DSPs: 16-bit
fixed-point and 32-bit floating-point. These DSPs possess the operational
flexibility of high-speed controllers and the numerical capability of array
processors.

• The following characteristics make this family the ideal choice for a wide range of
processing applications:
❑ Very flexible instruction set
❑ Inherent operational flexibility
❑ High-speed performance
❑ Innovative, parallel architectural design
❑ Cost-effectiveness
• The ’C5x generation consists of the ’C50, ’C51, ’C52, ’C53, ’C53S, ’C56, ’C57,
and ’C57S DSPs, which are fabricated by CMOS integrated-circuit technology.
• Their architectural design is based on the ’C25.
• The operational flexibility and speed of the ’C5x result from combining an
advanced Harvard architecture (which has separate buses for program memory and
data memory), a CPU with application-specific hardware logic, on-chip
peripherals, on-chip memory, and a highly specialized instruction set.
• The ’C5x is designed to execute up to 50 million instructions per second
(MIPS).

Evolution of the TMS320 family

Advantages of TMS320
The ’C5x devices offer these advantages:
➢ Enhanced TMS320 architectural design for increased performance
and versatility.
➢ Modular architectural design for fast development of spin-off devices.
➢ Advanced integrated-circuit processing technology for increased performance and
low power consumption.
➢ Source code compatibility with ’C1x, ’C2x, and ’C2xx DSPs for fast and easy
performance upgrades.
➢ Reduced power consumption and increased radiation hardness because of
new static design techniques.
➢ Enhanced instruction set for faster algorithms and for optimized high-level
language operation.

TMS320C5x Key Features

➢ Compatibility: Source-code compatible with ’C1x, ’C2x, and ’C2xx devices


➢ Speed: 20-/25-/35-/50-ns single-cycle fixed-point instruction execution time
(50/40/28.6/20 MIPS)
➢ Power
▪ 3.3-V and 5-V static CMOS technology with two power-down modes
▪ Power consumption control with IDLE1 and IDLE2 instructions for power-down
modes

ARCHITECTURE OF TMS320C5X

1. Architecture
2. Bus structure and memory
3. CPU
4. Addressing modes
5. Assembly language (AL) syntax

Bus Structure

The ’C5x architecture is built around four major buses:

• Program bus (PB): carries the instruction code and immediate operands from
program memory space to the CPU.
• Program address bus (PAB): provides addresses to program memory space for
both reads and writes.
• Data read bus (DB): interconnects various elements of the CPU to data memory
space.
• Data read address bus (DAB): provides the address to access the data
memory space.

Central Processing Unit (CPU)
The ’C5x CPU consists of these elements:
▪ Central arithmetic logic unit (CALU)
▪ Parallel logic unit (PLU)
▪ Auxiliary register arithmetic unit (ARAU)
▪ Memory-mapped registers
▪ Program controller

Central Arithmetic Logic Unit (CALU)


The CPU uses the CALU to perform 2s-complement arithmetic. The
CALU consists of these elements:
• 16-bit x 16 -bit multiplier
• 32-bit arithmetic logic unit (ALU)
• 32-bit accumulator (ACC)
• 32-bit accumulator buffer (ACCB)
• Additional shifters at the outputs of both the accumulator and the product
register (PREG)

Parallel Logic Unit (PLU)


• The PLU performs Boolean operations or the bit manipulations required of
high-speed controllers.
• The PLU can set, clear, test, or toggle bits in a status register, control register,
or any data memory location.
• The PLU provides a direct logic operation path to data memory values without
affecting the contents of the ACC or PREG.

Auxiliary Register Arithmetic Unit (ARAU)


• Eight 16-bit auxiliary registers (AR0–AR7)
• Auxiliary register pointer (ARP)
• 16-bit ALU
• Auxiliary register compare register (ARCR)
Memory-Mapped Registers
The memory-mapped registers are used for indirect data address pointers,
temporary storage, CPU status and control, or integer arithmetic processing through
the ARAU.

• The ’C5x has 96 registers mapped into page 0 of the data memory space.
• All ’C5x DSPs have 28 CPU registers and 16 input/output (I/O) port registers,
but have different numbers of peripheral and reserved registers.
Program Controller

The program controller consists of these elements:


❑ Program counter
❑ Status and control registers
❑ Hardware stack
❑ Address generation logic
❑ Instruction register

The program controller contains logic circuitry that decodes the operational
instructions, manages the CPU pipeline, stores the status of CPU operations, and
decodes the conditional operations.

Memory
On - Chip Memory
❑ Program read-only memory (PROM)
❑ Data/program dual-access RAM (DARAM)
❑ Data/program single-access RAM (SARAM)
Memory Space
❑ 64K-word program memory space,
❑ 64K-word local data memory space,
❑ 64K-word input/ output ports,
❑ 32K-word global data memory space.
Program ROM
This memory is used for booting program code from slower external ROM
or EPROM to fast on-chip or external RAM.
Data/Program Dual-Access RAM :

All ’C5x DSPs carry a 1056-word × 16-bit on-chip dual-access RAM (DARAM).
The DARAM is divided into three individually selectable memory blocks:
➢ 512-word data or program DARAM block B0,
➢ 512-word data DARAM block B1,
➢ 32-word data DARAM block B2.
The DARAM is primarily intended to store data values but, when needed, can be
used to store programs as well.
DARAM improves the operational speed of the ’C5x CPU, which operates with a
4-deep pipeline.
Data/Program Single-Access RAM :
➢ All ’C5x DSPs except the ’C52 carry a 16-bit on-chip single-access RAM (SARAM)
of various sizes.
➢ Code can be booted from an off-chip ROM and then executed at full speed, once
it is loaded into the on-chip SARAM.

The SARAM can be configured by software in one of three ways:


➢ All SARAM configured as data memory
➢ All SARAM configured as program memory
➢ SARAM configured as both data memory and program memory
On-Chip Peripherals :
All ’C5x DSPs have the same CPU structure; however, they have different on-chip
peripherals connected to their CPUs. The ’C5x DSP on-chip peripherals available
are:
❑Clock generator
❑Hardware timer
❑Software-programmable wait-state generators
❑Parallel I/O ports
❑Host port interface (HPI)
❑Serial port
❑Buffered serial port (BSP)
❑Time-division multiplexed (TDM) serial port
❑User-maskable interrupts
Peripherals
1. Serial Port: Three different kinds of serial ports are available:
▪ a general-purpose serial port,
▪ a time-division multiplexed (TDM) serial port,
▪ a buffered serial port (BSP).
➢ Each ’C5x contains at least one general-purpose, high-speed, synchronous,
full-duplex serial port interface that provides direct communication with
serial devices such as codecs, serial analog-to-digital (A/D) converters, and
other serial systems.
➢ The serial port is capable of operating at up to one-fourth the machine cycle
rate (CLKOUT1).
➢ The serial port transmitter and receiver are double-buffered and individually
controlled by maskable external interrupt signals. Data is framed either as
bytes or as words.

2. Buffered Serial Port (BSP):
▪ The BSP available on the ’C56 and ’C57 devices consists of a full-duplex,
double-buffered serial port and an auto buffering unit (ABU).
▪ The ABU supports high-speed data transfer and reduces interrupt latencies.

3. TDM Serial Port:
▪ The TDM serial port available on the ’C50, ’C51, and ’C53 devices is a
full-duplex serial port that can be configured by software either for synchronous
operations or for time-division multiplexed operations.
▪ The TDM serial port is commonly used in multiprocessor applications.

4. User-Maskable Interrupts:
▪ Four external interrupt lines (INT1–INT4) and five internal interrupts (a timer
interrupt and four serial port interrupts) are user maskable.

5. Test/Emulation:
On the ’C50, ’LC50, ’C51, ’LC51, ’C53, ’LC53, ’C57S and ’LC57S, an IEEE standard
1149.1 (JTAG) interface with boundary scan capability is used for emulation and
test.

6. Clock Generator:
The clock generator consists of an internal oscillator and a phase-locked loop
(PLL) circuit. The clock generator can be driven internally by a crystal resonator
circuit or driven externally by a clock source.

7. Hardware Timer:
A 16-bit hardware timer with a 4-bit pre-scaler is available. The timer can be
stopped, restarted, reset, or disabled by specific status bits.

8. Software-Programmable Wait-State Generators:
Software-programmable wait-state logic is incorporated in ’C5x DSPs, allowing
wait-state generation without any external hardware for interfacing with slower
off-chip memory and I/O devices.

9. Parallel I/O Ports:
A total of 64K I/O ports are available; sixteen of these ports are memory-mapped
in data memory space. Each of the I/O ports can be addressed by the IN or the OUT
instruction.

10. Host Port Interface (HPI):
The HPI available on the ’C57S and ’LC57 is an 8-bit parallel I/O port that
provides an interface to a host processor.

5.4 Addressing Modes
• Direct addressing
• Indirect addressing
• Immediate addressing
• Register addressing
• Dedicated-register addressing
• Memory-mapped register addressing
• Circular addressing

Direct addressing mode

• Operand is always in memory location mem
• Capability to reference data by giving its memory location directly

Instruction      Operation
ADD mem          mem + A → A

• mem: specified memory location provides operand (e.g. memory could hold an
input signal value)
• A: accumulator register

Indirect addressing mode

• Operand memory location is variable
• Operand address is given by the value of register addrreg
• Operand is accessed using pointer addrreg

Instruction      Operation
ADD *addrreg     *addrreg + A → A

• addrreg: must be loaded with the operand's memory address before use
• A: accumulator register

Immediate addressing mode

• Operand value is explicitly known
• Capability to include data as part of the instruction

Instruction      Operation
ADD #imm         #imm + A → A

• #imm: value represented by imm (a fixed number, such as a filter coefficient,
that is known ahead of time)
• A: accumulator register

Register addressing mode

• Operand is always in processor register reg
• Capability to reference data through its register

Instruction      Operation
ADD reg          reg + A → A

• reg: processor register provides operand
• A: accumulator register

Dedicated Register addressing mode


• The dedicated-register addressing mode operates like the long immediate
addressing mode, except that the address comes from one of two special-purpose
memory-mapped registers in the CPU:

➢ The block move address register (BMAR)
➢ The dynamic bit manipulation register (DBMR)
Memory mapped Register addressing mode
• With memory-mapped register addressing, you can modify the
memory-mapped registers without affecting the current data
page pointer value.

• In addition, you can modify any scratch pad RAM (DARAM B2)
location or data page 0.

• The memory-mapped register addressing mode operates like the direct
addressing mode, except that the 9 MSBs of the address are forced to 0
instead of being loaded with the contents of the DP.

• This allows you to address the memory-mapped registers of data
page 0 directly without the overhead of changing the DP or
auxiliary register.

The following instructions operate in the memory-mapped register
addressing mode. Using these instructions does not affect the contents of the DP:

➢ LAMM — Load accumulator with memory-mapped register

➢ LMMR — Load memory-mapped register

➢ SAMM — Store accumulator in memory-mapped register

➢ SMMR — Store memory-mapped register


Circular addressing mode
• Many algorithms such as convolution, correlation, and finite impulse
response (FIR) filters can use circular buffers in memory to
implement a sliding window, which contains the most recent data to
be processed.

• The ’C5x supports two concurrent circular buffers operating via the
ARs.

• The following five memory-mapped registers control the circular
buffer operation:
➢ CBSR1 — Circular buffer 1 start register
➢ CBSR2 — Circular buffer 2 start register
➢ CBER1 — Circular buffer 1 end register
➢ CBER2 — Circular buffer 2 end register
➢ CBCR — Circular buffer control register
1. To define circular buffers, first load the start and end addresses into
the corresponding buffer registers.

2. Next, load a value between the start and end registers for the circular
buffer into an AR.

3. Finally, load the proper AR value, and set the corresponding circular buffer
enable bit in the CBCR.
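
The wrap-around behaviour that the start and end registers provide can be sketched in software. This is a generic illustration under stated assumptions, not a model of the exact ’C5x update rule: the class `CircularPointer` and its method are hypothetical, the end address is treated as inclusive, and the pointer steps by one word.

```python
class CircularPointer:
    """Address pointer that wraps from `end` back to `start` (inclusive range)."""

    def __init__(self, start, end, current=None):
        if start > end:
            raise ValueError("start must not exceed end")
        self.start = start   # plays the role of a start register
        self.end = end       # plays the role of an end register
        self.ar = start if current is None else current  # auxiliary-register value
        if not start <= self.ar <= end:
            raise ValueError("pointer must lie between start and end")

    def post_increment(self):
        """Return the current address, then step the pointer with wrap-around."""
        address = self.ar
        self.ar = self.start if self.ar == self.end else self.ar + 1
        return address


if __name__ == "__main__":
    pointer = CircularPointer(0x8000, 0x8003)
    visited = [pointer.post_increment() for _ in range(6)]
    print([hex(a) for a in visited])
    # ['0x8000', '0x8001', '0x8002', '0x8003', '0x8000', '0x8001']
```

Because the wrap is done by the addressing hardware, a filter loop can walk through its sliding window of samples without any explicit bounds check.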

5.5 Programming

Assembly language program to add two numbers using the DSP processor:

START:
        LDP   #100H     ; Load the data page pointer to select the active data page.
        LACC  0H        ; Load the accumulator with the data at offset 0H of the data page.
        ADD   1H        ; Add the data at offset 1H of the data page to the accumulator.
        SACL  5H        ; Store the low word of the accumulator at offset 5H of the data page.
        .END            ; End of the program.

INPUT:
#SD 8000 - 0004
    8001 - 0004
#GO C000
Executing....
OUTPUT:
#SD 8005 - 0008

To write an assembly language program for the DSP processor to multiply
two numbers.

START:
LDP #100H ; Load the data page pointer; page 100H starts at address 8000H.
LT 01H ; Load the T register with the data at offset 1H (address 8001H).
MPY 02H ; Multiply the data at offset 2H (address 8002H) by the T register.
PAC ; Transfer the contents of the product register to the accumulator.
SACL 3H ; Store the low word of the accumulator at offset 3H (address 8003H).
.END ; End of the program.

INPUT:
#SD 8001 - 0005
8002 - 0002
#GO C000
Executing....
OUTPUT:
#SD 8003 - 000A
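
The two listings can be checked by hand with a very small simulation. The Python sketch below is a hypothetical, heavily simplified model of the seven instructions used above: it assumes unsigned 16-bit data, direct addressing with a 9-bit DP and a 7-bit offset, and it ignores shifts, sign extension, and overflow modes. It is not a model of the real instruction set.

```python
def run(program, memory):
    """Execute (mnemonic, operand) pairs on a dict that stands in for data memory."""
    dp = acc = t = p = 0
    for op, arg in program:
        if op == "LDP":
            dp = arg & 0x1FF                 # immediate operand selects the data page
            continue
        addr = (dp << 7) | (arg & 0x7F)      # direct addressing: DP plus 7-bit offset
        if op == "LACC":
            acc = memory.get(addr, 0)
        elif op == "ADD":
            acc = (acc + memory.get(addr, 0)) & 0xFFFFFFFF
        elif op == "LT":
            t = memory.get(addr, 0)
        elif op == "MPY":
            p = t * memory.get(addr, 0)
        elif op == "PAC":
            acc = p & 0xFFFFFFFF             # operand is unused for PAC
        elif op == "SACL":
            memory[addr] = acc & 0xFFFF      # store the low 16 bits only
        else:
            raise ValueError(f"unsupported mnemonic: {op}")
    return memory

add_program = [("LDP", 0x100), ("LACC", 0x0), ("ADD", 0x1), ("SACL", 0x5)]
mul_program = [("LDP", 0x100), ("LT", 0x1), ("MPY", 0x2), ("PAC", 0), ("SACL", 0x3)]

add_memory = run(add_program, {0x8000: 0x0004, 0x8001: 0x0004})
mul_memory = run(mul_program, {0x8001: 0x0005, 0x8002: 0x0002})
print(f"{add_memory[0x8005]:04X}", f"{mul_memory[0x8003]:04X}")
```

Running the model with the INPUT values listed above reproduces the OUTPUT values 0008 and 000A.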
5.6 Circular Buffering
Digital Signal Processors are designed to quickly carry out FIR filters and similar
techniques. To understand the hardware, we must first understand
the algorithms.
To start, we need to distinguish between off-line processing and real-time
processing. In off-line processing, the entire input signal resides in the computer
at the same time. For example, a geophysicist might use a seismometer to record
the ground movement during an earthquake. After the shaking is over, the
information may be read into a computer and analyzed in some way. Another
example of off-line processing is medical imaging, such as computed tomography
and MRI. The data set is acquired while the patient is inside the machine, but the
image reconstruction may be delayed until a later time. The key point is that all of
the information is simultaneously available to the processing program. This is
common in scientific research and engineering, but not in consumer products.
Off-line processing is the realm of personal computers and mainframes.
In real-time processing, the output signal is produced at the same time that the
input signal is being acquired. For example, this is needed in telephone
communication, hearing aids, and radar. These applications must have the
information immediately available, although it can be delayed by a short amount.
For instance, a 10 millisecond delay in a telephone call cannot be detected by the
speaker or listener. Likewise, it makes no difference if a radar signal is delayed by
a few seconds before being displayed to the operator. Real-time applications input
a sample, perform the algorithm, and output a sample, over-and-over.
Alternatively, they may input a group of samples, perform the algorithm, and
output a group of samples. This is the world of Digital Signal Processors.

Let us consider an FIR filter being implemented in real time. To calculate the output
sample, we must have access to a certain number of the most recent samples from
the input. For example, suppose we use eight coefficients in this filter, a0, a1, … a7.
This means we must know the value of the eight most recent samples from the
input signal, x[n], x[n-1], … x[n-7]. These eight samples must be stored in memory
and continually updated as new samples are acquired. What is the best way to
manage these stored samples? The answer is circular buffering.

Let us consider an eight sample circular buffer. This circular buffer is placed in
eight consecutive memory locations, 20041 to 20048. Figure (a) shows how the
eight samples from the input might be stored at one particular instant in time,
while (b) shows the changes after the next sample is acquired. The idea of
circular buffering is that the end of this linear array is connected to its beginning;
memory location 20041 is viewed as being next to 20048, just as 20044 is next to
20045. You keep track of the array by a pointer (a variable whose value is
an address) that indicates where the most recent sample resides. For instance, in
(a) the pointer contains the address 20044, while in (b) it contains 20045. When a
new sample is acquired, it replaces the oldest sample in the array, and the pointer
is moved one address ahead. Circular buffers are efficient because only one value
needs to be changed when a new sample is acquired.
Four parameters are needed to manage a circular buffer. First, there must be a
pointer that indicates the start of the circular buffer in memory (in this example,
20041). Second, there must be a pointer indicating the end of the array (e.g.,
20048), or a variable that holds its length (e.g., 8). Third, the step size of the
memory addressing must be specified. In Fig. 28-3 the step size is one, for
example: address 20043 contains one sample, address 20044 contains the next
sample, and so on. This is frequently not the case. For instance, the addressing
may refer to bytes, and each sample may require two or four bytes to hold its
value. In these cases, the step size would need to be two or four, respectively.
These three values define the size and configuration of the circular buffer, and will
not change during the program operation. The fourth value, the pointer to the
most recent sample, must be modified as each new sample is acquired. In other
words, there must be program logic that controls how this fourth value is updated
based on the value of the first three values. While this logic is quite simple, it
must be very fast. This is the whole point of this discussion: DSPs should be
optimized for managing circular buffers to achieve the highest possible execution
speed.
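
The update logic for the fourth parameter can be written out explicitly. The following Python sketch is one possible formulation, assuming the buffer is described by a start address, a length in samples, and a step size; the function name and the byte-addressed example values are illustrative only.

```python
def advance(pointer: int, start: int, length: int, step: int = 1) -> int:
    """Move the most-recent-sample pointer ahead by one sample, wrapping at the end.

    start  -- address of the first buffer element
    length -- number of samples the buffer holds
    step   -- address units per sample (1 here, 2 or 4 for byte addressing)
    """
    offset = (pointer - start + step) % (length * step)
    return start + offset

# Eight-sample buffer at addresses 20041..20048 with a step size of one.
print(advance(20044, start=20041, length=8))   # next address in the buffer
print(advance(20048, start=20041, length=8))   # wraps back to the start

# Assumed example: four 2-byte samples starting at byte address 1000.
print(advance(1006, start=1000, length=4, step=2))
```

On a general-purpose processor this costs a subtraction, an addition, and a modulo for every access, which is the overhead that dedicated circular addressing hardware is meant to remove.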
Circular buffering is also useful in off-line processing. Consider a program where
both the input and the output signals are completely contained in memory.
Circular buffering isn't needed for a convolution calculation, because every sample
can be immediately accessed. However, many algorithms are implemented
in stages, with an intermediate signal being created between each stage.

For instance, a recursive filter carried out as a series of biquads operates in
this way. The brute force method is to store the entire length of each
intermediate signal in memory. Circular buffering provides another option:
store only those intermediate samples needed for the calculation at hand.
This reduces the required amount of memory, at the expense of a more
complicated algorithm. The important idea is that circular buffers
are useful for off-line processing, but critical for real-time applications.
Now we can look at the steps needed to implement an FIR filter using circular
buffers for both the input signal and the coefficients. This list may
seem trivial and overexamined, but it is not! The efficient handling of these
individual tasks is what separates a DSP from a traditional microprocessor.
For each new sample, all the following steps need to be taken:

The goal is to make these steps execute quickly. Since steps 6-12 will be repeated
many times (once for each coefficient in the filter), special attention must be given
to these operations. Traditional microprocessors must generally carry out these 14
steps in serial (one after another), while DSPs are designed to perform them
in parallel. In some cases, all of the operations within the loop (steps 6-12) can be
completed in a single clock cycle.
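
The per-sample work described above can be sketched in software. The Python class below is a minimal illustration, assuming the input samples are held in a circular buffer and the coefficients in an ordinary list; the class and attribute names are hypothetical, and the loop runs serially, which is exactly the cost a DSP avoids by doing these operations in parallel.

```python
class CircularFIR:
    """FIR filter that keeps the most recent samples in a circular buffer."""

    def __init__(self, coefficients):
        self.coeffs = list(coefficients)          # a0, a1, ..., a(N-1)
        self.buffer = [0.0] * len(self.coeffs)    # most recent N input samples
        self.newest = 0                           # index of the most recent sample

    def process(self, sample):
        n = len(self.buffer)
        # Move the pointer one place ahead and overwrite the oldest sample.
        self.newest = (self.newest + 1) % n
        self.buffer[self.newest] = sample
        acc = 0.0                                 # zero the accumulator
        index = self.newest
        for a in self.coeffs:                     # once for each coefficient
            acc += a * self.buffer[index]         # multiply and accumulate
            index = (index - 1) % n               # step back to the next older sample
        return acc

# Feeding a unit impulse returns the coefficients in order.
fir = CircularFIR([1.0, 2.0, 3.0])
outputs = [fir.process(x) for x in (1.0, 0.0, 0.0, 0.0)]
print(outputs)
```

Only one buffer entry and one pointer change per input sample; everything else is the multiply-accumulate loop over the coefficients.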

LINK TO VIDEOS:

S.No  Topic                                         Link
1     Signal Processing in Home Assistants          https://youtu.be/LJ54btWttdo
2     Signal Processing in MRIs                     https://youtu.be/akuQWr8q9Qs
3     Signal Processing in Autonomous Vehicles      https://youtu.be/XASSD82BgYY
4     Fixed point and Floating point Architecture   https://www.youtube.com/watch?v=_kXyrIR6HJM

6.4 Assignments (For higher level learning and Evaluation -
Examples: Case study, Comprehensive design, etc.)
UNIT V – DIGITAL SIGNAL PROCESSORS

Q.No  Questions (CO Level, BT Level)

1. Distinguish between off-line processing and real-time processing with
an example. (CO5, K3)

2. Write an assembly language program to perform circular convolution of
two 8-point sequences using instructions of TMS320C54x processors. (CO5, K3)

3. Study the features of fixed point and floating point architecture. (CO6, K3)

4. How are DSPs different from other microprocessors? (CO5, K3)

6.5 Part A Q & A (with K level and CO)
UNIT V – DIGITAL SIGNAL PROCESSORS
PART - A

Q.No  Questions (CO Level, BT Level)

1. Mention the features of the DSP processor. (CO5, K2)

Important features of the DSP processor are:
• Parallel multiply and add,
• Multiple memory accesses (to fetch two operands and store the result),
• Lots of registers to hold data temporarily,
• Efficient address generation for array handling, and
• Special features such as delays or circular addressing.

2. What are the advantages of the VLIW architecture? (CO5, K2)

The main advantage of the VLIW architecture is the saving in hardware:
the compiler decides what can be executed in parallel, and the hardware
just does it. The hardware has no need to check for dependencies or decide
on scheduling, because the compiler has already resolved these issues.

3. What are the basic types of digital signal processors released by
Texas Instruments? (CO5, K2)

Texas Instruments has released four basic types of digital signal
processors, and they are:
• 16-bit fixed point processors
• 32-bit floating point processors
• VLIW architecture processors
• Multiprocessor digital signal processors
4. What are the different stages in pipelining? (CO5, K2)

Pipelining divides the instruction into 5 stages:
• Instruction fetch,
• Instruction decode,
• Operand fetch,
• Instruction execution, and
• Operand store.

5. What is pipelining in DSP? (CO5, K2)

Pipelining refers to overlapping the execution of the various phases of
different instructions so that a number of instructions can be executed
in parallel. In DSPs, the execution of each instruction is divided into
4 or 6 phases. In 4-phase pipelining, when the first instruction is in the
4th phase of execution, the second is in the 3rd phase, the third is in
the 2nd phase, and the fourth is in the 1st phase of execution.

6. What are the factors that influence the selection of a DSP? (CO5, K2)

Factors that influence the selection of a DSP are:
• Arithmetic Format
• Data Width
• Speed
• Memory Organization
• Ease of Development
• Multiprocessor Support
• Power Consumption and Management
• Cost

7. What are the applications of PDSPs? (CO5, K2)

PDSPs are designed mainly for embedded DSP applications. As such, the user
may never realize the existence of a PDSP in an information appliance.
Important applications of PDSPs include modems, hard drive controllers,
cellular phone data pumps, set-top boxes, etc.

8. How is fast data access achieved in digital signal processors? (CO5, K2)

In a digital signal processor, fast data access is achieved by a
high-bandwidth memory architecture such as the modified Harvard
architecture, specialized addressing modes such as circular and
bit-reversed addressing, and DMA.

9. What is the difference between Von Neumann and Harvard
architecture? (CO5, K2)

The Von Neumann architecture has a single block of memory to store both
code and data, and the memory is connected to the CPU by a single bus,
which permits the CPU to access memory for either code or data at any
one time.

The Harvard architecture has two memory blocks to store code and data
separately, and the two memory blocks are connected to the CPU by separate
buses for simultaneous access of code and data.

10. What are the functional units of the CPU of TMS320C5x
processors? (CO5, K2)

The functional units of the CPU of TMS320C5x processors are:
• Parallel Logic Unit
• Central ALU
• Memory mapped registers
• Auxiliary register
• Arithmetic Unit
• Program controller

11. Distinguish between fixed point and floating point
architecture. (CO6, K2)

S.No 1
Fixed point architecture: Fixed-point DSPs are designed to represent and
manipulate integers (positive and negative whole numbers) via a minimum of
16 bits, yielding up to 65,536 possible bit patterns (2^16).
Floating point architecture: Floating-point DSPs represent and manipulate
rational numbers via a minimum of 32 bits in a manner similar to scientific
notation, where a number is represented with a mantissa and an exponent,
yielding up to 4,294,967,296 possible bit patterns (2^32).

S.No 2
Fixed point architecture: Has lesser precision and low dynamic range.
Floating point architecture: Has better precision and high dynamic range.

S.No 3
Fixed point architecture: In fixed-point notation, the gaps between
adjacent numbers always equal a value of one.
Floating point architecture: In floating-point notation, gaps between
adjacent numbers are not uniformly spaced.

12. Mention some of the applications of DSP. (CO5, K2)

• Voice Processing
• Musical sound Processing
• Audio/Video Processing
• Communication

13. What are the addressing modes of TMS320C5x processors? (CO5, K2)

The TMS320C5x processors support the following six addressing modes:
1. Direct addressing
2. Memory-mapped register addressing
3. Indirect addressing
4. Immediate addressing
5. Dedicated-register addressing
6. Circular addressing

6.6 Part B Qs (with K level and CO)
PART – B UNIT-V

Q.No  Questions (CO Level, BT Level)

1. Write an ALP to perform circular convolution through MAC operation in
TMS320C5x. (CO6, K3)
2. Explain in detail about the types of the DSP processors. (CO6, K2)
3. Write in detail about TMS320C5x architecture. (CO6, K2)
4. Explain the various addressing modes of TMS320C5x processor with
example. (CO6, K2)
5. Compare fixed and floating point architecture. (CO6, K2)
6. Explain the various instruction sets of TMS320C5x processor with
example. (CO6, K2)
7. Explain in detail about VLIW architecture. (CO6, K2)

6.7 Supportive online Certification courses (NPTEL, Swayam, Coursera,
Udemy, etc.)

ONLINE COURSE NPTEL:

https://swayam.gov.in/nd1_noc19_ee50/

Digital Signal Processing

By Prof. C.S. Ramalingam | IIT Madras

This course will introduce you to the basics of discrete-time sequences,
z-transform, frequency response of discrete-time systems, sampling, and
the DFT.
INTENDED AUDIENCE: UG students in ECE/EEE
PREREQUISITES: Networks and Systems
INDUSTRY SUPPORT: Jasmine InfoTech

ONLINE COURSE COURSERA:

https://www.coursera.org/learn/dsp1

Digital Signal Processing 1: Basic Concepts and Algorithms

INSTRUCTOR: Paolo Prandoni

ONLINE COURSE STANFORD ONLINE:

https://online.stanford.edu/courses/ee264-digital-signal-processing

Digital Signal Processing

STANFORD SCHOOL OF ENGINEERING

6.8 Real time Applications in day to day life and to Industry

1. https://www.analog.com/media/en/technical-documentation/dsp-book/dsp_book_Ch9.pdf

APPLICATIONS OF DSP

6.9 Contents beyond the Syllabus (COE related Value added courses)

1. https://www.intechopen.com/books/applications-of-digital-signal-processing-through-practical-approach/application-of-dsp-in-power-conversion-systems-a-practical-approach-for-multiphase-drives

Application of DSP in Power Conversion Systems — A Practical Approach for
Multiphase Drives

7. Assessment Schedule

Assessment Proposed Date Actual Date


Internal Assessment 1

Internal Assessment 2

Revision Test 1

Model Exam

University Exam

8. Prescribed Text Books & Reference Books

TEXT BOOK:

1. John G. Proakis & Dimitris G. Manolakis, "Digital Signal Processing – Principles,
Algorithms & Applications", Fourth Edition, Pearson Education / Prentice Hall,
2007.

2. A. V. Oppenheim, R. W. Schafer and J. R. Buck, "Discrete-Time Signal Processing",
8th Indian Reprint, Pearson, 2010.

REFERENCES:

1. E. C. Ifeachor and B. W. Jervis, "Digital Signal Processing: A Practical Approach",
Pearson Education, Wiley & Sons, Singapore, 2002.

2. M. H. Hayes, "Digital Signal Processing", Schaum's Outlines, Tata McGraw Hill, 2007.

3. A. Nagoor Kani, "Digital Signal Processing", McGraw Hill Education, Second Edition, 2017.

4. S. Salivahanan, "Digital Signal Processing", McGraw Hill Education, Fourth Edition,
2019.

5. Andreas Antoniou, "Digital Signal Processing", Tata McGraw Hill, 2006.

9. Mini Project suggestions

DSP MINI PROJECTS LIST

S.No Name of The Project

1. Image classification by using algorithm k-means clustering

2. Color histogram features based image classification in CBIR systems

3. Robust adaptive kalman filtering based speech enhancement algorithm

4. Fingerprint enhancement using STFT

5. Audio Watermarking using DWT For Authentication Process

6. SVD based Blind Watermarking Algorithm for digital images

7. Video Watermarking using discrete wavelet transform

8. Image enhancement for improving face detection under nonuniform lighting conditions

9. EEG signal denoising for removing ocular artifacts using wavelets

10. Signal Adaptive Subband Decomposition for Adaptive Noise Cancellation

Thank you
