
Low Power Compressor Based MAC Architecture For DSP Applications

The document presents a low power compressor-based Multiply-Accumulate (MAC) architecture designed for Digital Signal Processing (DSP) applications, highlighting the importance of efficient arithmetic logic units in VLSI circuits. The proposed architecture utilizes larger fan-in gates and optimized circuit designs to significantly reduce power consumption and interconnect delays compared to conventional architectures. Results demonstrate that the proposed MAC unit architecture achieves better efficiency in both ASIC and FPGA domains, making it suitable for low power applications.



Yeturu Parvathi* et al.


(IJITR) INTERNATIONAL JOURNAL OF INNOVATIVE TECHNOLOGY AND RESEARCH
Volume No.6, Issue No.4, June - July 2018, 8428-8432.

Low Power Compressor Based MAC Architecture for DSP Applications
YETURU. PARVATHI
Pursuing M.Tech (VLSI&ESD) from SKR College of Engineering & Technology, Manubolu, SPSR Nellore, AP.
SIDDU. PENCHALAIAH
M.Tech, Assistant Professor in Department of ECE, SKR College of Engineering & Technology, Manubolu, SPSR Nellore, AP.
Abstract: This paper presents a low power compressor based Multiply-Accumulate (MAC) architecture for DSP applications. In VLSI, computation-intensive arithmetic cells such as adders and multipliers are the most abundantly used components. Efficient implementations of arithmetic logic units, floating point units and other dedicated functional components are used in most microprocessors and digital signal processors (DSPs). Hence in this brief, a compressor circuit is designed for low power applications and the impact of datapath circuits is demonstrated. The proposed low power compressor architecture was applied to a MAC unit and compared against the regular compressor based MAC units, and it was observed that the proposed architecture reduces leakage power significantly.
1.INTRODUCTION
Over the last decade the semiconductor industry has experienced an exponential growth in the integration of advanced multimedia applications into portable devices. The major concern of portable devices is battery life, which impacts real-time processing applications and the dynamic range of their input signals for additional features. It is high time to explore the challenging criteria of these emerging low power, low area and high performance digital signal processing chips [1].
In digital VLSI circuits, computation is the critical part and it decides the power consumption and operating speed of the designs. Arithmetic circuits used for computation include adders and multipliers, which are the most abundantly used components. Digital signal processors performing filtering, convolution, etc. depend on the efficient implementation of these adder, multiplier and MAC arithmetic units.
As the criticality of multipliers decides the power consumption and operating speed of digital circuits, there is potential at the circuit design level to optimize the power and delay constraints. Many researchers in the past have developed and demonstrated several architectures to improve the efficiency of multipliers. Booth encoders and their modifications were developed to reduce the delay by reducing the number of rows in the partial product generation stage. Compressors were used in the partial product reduction stage to increase the speed of the multiply operation [3 - 5]. A complementary pass transistor logic based adiabatic 8-bit multiplier is designed in [6] to reduce the delay and power consumption of the multiplier architecture. Vedic sutras were also used in the multiplier architecture to increase the speed of the MAC architectures [7]. To reduce the delay further in the MAC architectures, the carry propagate addition stage of the multiplier and the adder stage of the accumulator are merged using compressors in this work.
A low power compressor architecture is proposed in this brief to reduce the power consumption of the MAC architecture, since it contains a large number of compressors. The impact of circuit design level, or datapath, optimizations is addressed at the MAC level for DSP applications. In the MAC, the carry propagate additions involved in the multiplier and accumulate stages are also merged, which exploits and increases the number of compressors in the MAC architecture. Designs were demonstrated in the ASIC and FPGA domains as per the standard design methodology. The remaining sections of the paper are organized as follows. Compressor and MAC architectures are discussed and the limitations of existing architectures are described in Section II. Results are evaluated in Section III and the paper is concluded in Section IV. The last section gives the references.
2.ARCHITECTURE
A. Compressor
Compressors are digital circuits that can add five, six or seven bits at a time and are hence called column compressors. A typical five-input compressor is illustrated in this brief. It takes 4 regular inputs and 1 intermediate carry-in input and generates 1 sum bit, 1 carry-out bit and another intermediate carry bit. Intermediate carry bits are the carry-ins and carry-outs (called horizontal carry propagation) from the previous and to the next stage compressors. The carry-out (also called vertical carry) bit is the final carry generated along with the sum bit.
Since compressors form the basic and critical components of multipliers and large-input adders, several compressor architectures were developed in the past to address several constraints. Some of the compressor architectures described in the past are shown in Fig. 1 & Fig. 2 [8, 9].
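The input/output behaviour of the five-input compressor described above can be checked with a small bit-level model. The sketch below is only illustrative and is not the authors' Verilog: it assumes the common construction from two chained full adders, and the names full_adder, compressor_4_2 and check_all are hypothetical. It verifies that the five input bits always satisfy x1 + x2 + x3 + x4 + cin = sum + 2 * (carry + cout), where carry is the vertical carry-out and cout is the horizontal intermediate carry.

```python
from itertools import product


def full_adder(a, b, c):
    """Return (sum, carry) for three 1-bit inputs."""
    s = a ^ b ^ c
    carry = (a & b) | (b & c) | (a & c)
    return s, carry


def compressor_4_2(x1, x2, x3, x4, cin):
    """Behavioural model of a five-input compressor (assumed structure:
    two chained full adders). Returns (sum, carry, cout), where carry is
    the vertical carry-out and cout is the horizontal intermediate carry."""
    # First full adder: cout depends only on x1..x3, never on cin,
    # so the horizontal carry does not ripple along the row.
    s1, cout = full_adder(x1, x2, x3)
    # Second full adder folds in the fourth input and the carry-in.
    s, carry = full_adder(s1, x4, cin)
    return s, carry, cout


def check_all():
    """Exhaustively confirm the bit-count identity over all 32 inputs."""
    for bits in product((0, 1), repeat=5):
        s, carry, cout = compressor_4_2(*bits)
        if sum(bits) != s + 2 * (carry + cout):
            return False
    return True


if __name__ == "__main__":
    print("identity holds for all inputs:", check_all())
```

Because cout is computed without cin in this model, a row of such cells passes the horizontal carry only one position, which is the property that makes compressors attractive for partial product reduction.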

2320 –5547 @ 2013-2018 http://www.ijitr.com All rights Reserved. Page | 8428



Fig. 1: Full Adder based Compressor [8]
The compressor architecture shown in Fig. 1 is built using full adders. This architecture has only two cells and minimum interconnects, but each cell needs to generate both the sum and carry paths, and one of the paths is dependent on the other. This requires larger drive strength to drive the chain of compressors and hence the power consumption will be higher. The higher drive strength will significantly reduce the delay.
Fig. 2 shows the compressor architecture built using smaller fan-in gates. Logic implementation with smaller fan-in gates leads to a larger number of interconnects, which has a significant impact on glitch power and delay. In lower technology nodes the interconnect power dominates the gate power, hence the architecture of [9] leads to high power consumption.
Fig. 2: David Harris Compressor cell [9]
Fig. 3 shows the proposed compressor architecture. The proposed compressor architecture is built with larger fan-in gates and uses separate logic for the sum and carry paths. In the sum path four 2-input XOR cells are replaced by two 3-input XOR cells, and in the carry path two 2-input AND cells and one 2-input OR cell are replaced by one 6-input AND-OR (AO222) logic cell. Larger fan-in gates cover a large part of the logic and help in minimizing the number of gates required for implementation. Fewer gates lead to smaller area and minimum interconnect delays. Thus the proposed compressor architecture helps in reducing the power consumption.
Fig. 3: Proposed compressor Cell
Thus the proposed compressor architecture enables new features like design specific/constraint specific architectures and is suitable for low power applications. The optimizations provided in the proposed architecture are:
1. Minimum interconnect in the sum path reduces the interconnect delay and associated glitches
2. Reduced power consumption with minimum interconnects
3. Independent carry logic to reduce the horizontal carry delay
B.Multiply-Accumulate Unit
The MAC is the basic and most frequently used component in DSP to perform filtering, convolution, etc. and to accelerate FIR or FFT computations [2]. A regular MAC unit contains a multiplier, adders and registers as shown in Fig. 4, where the previous output of the MAC unit is added to the multiplier output and accumulated.
Fig. 4: Regular MAC architecture
Multipliers are implemented in three stages, namely partial product generation, partial product reduction and carry propagate addition. Regular architectures utilize half and full adders in the partial product stages, but due to their performance limitations compressor cells were utilized. Some past architectures reduced the number of reduction steps in the partial product reduction stage by introducing Booth encoding in the partial product generation stage, to reduce the overall delay [3 - 5].
Use of compressors in the multiplier reduces the number of gates for implementation, which in turn reduces the number of interconnects. This results in reduced interconnect delay and associated glitches, yielding a low power design. Thus an efficient multiplier improves the efficiency of the MAC unit.
A circuit level design developed for a particular constraint will be more efficient in ASIC designs. For example, the proposed low power compressor architecture improves the power efficiency and suits low power applications. To demonstrate the impact of the compressor architecture, a MAC unit architecture which contains a large number of compressors is chosen from [2].
In [2], the authors used compressors in the partial product reduction of the multipliers and in the


accumulation stage of the MAC unit, where the carry propagate stage of the multiplier is merged with the input of the accumulate add stage. Fig. 5 shows the state of the art MAC architecture. In total, 29 compressors were utilized to implement the MAC unit of Fig. 5. Other than compressors, half and full adders were also required for the implementation.
Fig. 5: State of the art MAC architecture [2]
Both the conventional and proposed compressor architectures were applied in the state of the art MAC architecture, to illustrate the impact of compressor architectures.
3.RESULTS & DISCUSSIONS
Both the regular and proposed architectures at the compressor and MAC unit level were designed and modeled using Verilog HDL. Designs were functionally verified using the Mentor Graphics ModelSim simulator with its waveform editor and were synthesized targeting TSMC's 65nm technology library using Cadence RTL Compiler. The designs were also synthesized in the FPGA domain by targeting the Virtex 7 device. Results of the compressor and MAC units were benchmarked as per the standard design methodology for both ASIC and FPGA domains.
Table 1: Comparison of the synthesis results of existing and proposed compressor architectures
Table 1 shows the results of the regular and proposed compressor architectures. It can be observed that the proposed compressor architecture is more efficient in all the design parameters against the architecture of [9]. As mentioned in the architecture section, using smaller fan-in gates requires a larger number of gates and the number of interconnects will be higher, due to which the area required is larger and the dynamic power consumption is also higher. The larger number of interconnects and the smaller fan-in logic gates have increased the delay and power consumption of the compressor architecture of [9].
Having only two cells in the compressor architecture of [8] reduces the interconnect delay, which is reflected as reduced delay. The dependency of one path on the other among the sum and carry paths of the full adder requires higher drive strength to drive the signal faster, resulting in higher power consumption than the proposed compressor architecture.
As the proposed compressor architecture utilizes larger fan-in gates, its transistor stack is higher, which causes higher resistance between the power supplies and results in reduced leakage power. Since the proposed architecture generates the sum and carries simultaneously, it does not require a higher drive strength signal.
Table 2 shows the results of the MAC units with conventional and proposed compressor architectures. Similar to Table 1, the results at the MAC level were also efficient. Here also the power consumption of the proposed MAC unit with the proposed compressor architecture has been reduced significantly. This suggests that the proposed architecture, designed specifically for the power constraint, behaves similarly at the cell and at the sub-system level. It also shows that optimizations at the circuit design level have an impact at the sub-system level. From this it can be inferred that optimizations at the circuit design level can be applied to any level of hierarchical abstraction. Further, the proposed architecture can be generalized for any bit-width and at any level of abstraction in the design hierarchy.
Table 2: Comparison of the synthesis results of MAC architectures using existing and proposed Compressor architectures in ASIC domain
The circuit level design optimization was also illustrated in the FPGA design and the synthesis results are tabulated in Table 3. In the FPGA domain the designs were targeted to the Virtex 7 family. It can be observed from Table 2 and Table 3 that designs behave differently in different domains due to different mapping logic, which suggests that the optimizations should be domain specific. In ASIC the logic is mapped to


standard cells of the libraries and in the FPGA domain the logic is mapped to look-up tables (LUTs). Table 3 shows that the proposed architecture has better results than the existing architectures in the FPGA domain.
The existing compressor architecture of [8] has one interconnect and three outputs, hence it requires four LUTs to implement one compressor cell. Since the logic of the proposed compressor architecture is implemented in parallel (parallel sum and carry logic), the interconnect is avoided, reducing the LUT requirement to 3 against the 4 of the existing compressor architecture.
Table 3: Comparison of the synthesis results of MAC architectures using existing and proposed Compressor architectures in FPGA domain
As a large number of compressors are required in the MAC architecture, the proposed MAC architecture requires fewer LUTs, which leads to fewer interconnects and results in reduced delay against the existing MAC architecture with the compressor architecture of [8]. Since the number of LUTs is higher in the existing MAC architecture, and larger area implies higher power consumption, the power consumption of the existing MAC architecture is higher than that of the proposed MAC architecture. A larger number of interconnects also contributes to power consumption. Thus the parallelism in the proposed architecture gives better efficiency than the existing architectures. Further improvements can be obtained by designing as per the FPGA architectures.
From the results of Table 2 and Table 3, it can be suggested that the proposed architecture holds good for both ASIC and FPGA domains. It can also be inferred from the above result tables and discussions that the proposed architecture can be generalized for an n-bit MAC and is independent of number representation (radix, base) and bit width. An increase in the MAC bit-width requires more compressors, and the impact of this optimization will be higher. When the bit-width increases from N bits to 2N bits, the number of compressors increases by approximately 5 times.
4.CONCLUSION
A design and domain specific low power compressor based MAC architecture has been demonstrated in this work. The importance of the circuit design level and its impact on DSP applications is addressed. The use of higher fan-in gates and its merits are discussed for low power applications. The proposed architectures have yielded better efficiencies in the ASIC and FPGA domains when modeled in Verilog HDL and synthesized with Cadence RTL Compiler and Xilinx ISE respectively. Designs were mapped to TSMC's 65nm technology node and the Virtex 7 FPGA family respectively.
5.REFERENCES
[1]. Chang, Chip-Hong, Jiangmin Gu, and Mingyan Zhang. "Ultra low-voltage low-power CMOS 4-2 and 5-2 compressors for fast arithmetic circuits." Circuits and Systems I: Regular Papers, IEEE Transactions on 51.10 (2004): 1985-1997.
[2]. Tung Thanh Hoang; Sjalander, M.; Larsson-Edefors, P., "A High-Speed, Energy-Efficient Two-Cycle Multiply-Accumulate (MAC) Architecture and Its Application to a Double-Throughput MAC Unit," Circuits and Systems I: Regular Papers, IEEE Transactions on, vol.57, no.12, pp.3073-3081, Dec. 2010.
[3]. Chen Ping-hua; Zhao Juan, "High-speed Parallel 32×32-b Multiplier Using a Radix-16 Booth Encoder," Intelligent Information Technology Application Workshops, 2009. IITAW '09. Third International Symposium on, pp.406-409, 21-22 Nov. 2009.
[4]. Kiwon Choi; Minkyu Song, "Design of a high performance 32×32-bit multiplier with a novel sign select Booth encoder," Circuits and Systems, 2001. ISCAS 2001. The 2001 IEEE International Symposium on, vol.2, pp.701-704, 6-9 May 2001.
[5]. Rajput, R.P.; Swamy, M.N.S., "High Speed Modified Booth Encoder Multiplier for Signed and Unsigned Numbers," Computer Modelling and Simulation (UKSim), 2012 UKSim 14th International Conference on, pp.649-654, 28-30 March 2012.
[6]. Yangbo Wu; Weijiang Zhang; Jianping Hu, "Adiabatic 4-2 compressors for low-power multiplier," Circuits and Systems, 2005. 48th Midwest Symposium on, vol.2, pp.1473-1476, 7-10 Aug. 2005.
[7]. Jaina, D.; Sethi, K.; Panda, R., "Vedic Mathematics Based Multiply Accumulate Unit," Computational Intelligence and Communication Networks (CICN), 2011 International Conference on, pp.754-757, 7-9 Oct. 2011.


[8]. Aliparast, Peiman, Ziaadin D. Koozehkanani, and Farhad Nazari. "An Ultra High Speed Digital 4-2 Compressor in 65-nm CMOS." International Journal of Computer Theory & Engineering 5.4 (2013).
[9]. N. Weste and David Harris, “CMOS VLSI Design - A Circuits & System Perspective”, Pearson Education, 2008.
[10]. ChandraMohan U, “Low Power Area Efficient Digital Counters”, Proceedings of the 7th VLSI Design and Test Workshops, VDAT, August 2003.
[11]. Narendra C P & Ravi K M Kumar, “Efficient Comparator based Sum of Absolute Differences Architecture for Digital Image Processing Applications”, Foundation of Computer Science, New York, USA, International Journal of Computer Applications, 96(4):17-24, June 2014.
AUTHOR'S PROFILE
Yeturu. Parvathi, Pursuing M.Tech (VLSI&ESD) from SKR College of Engineering & Technology, Manubolu, SPSR Nellore, AP.
Siddu. Penchalaiah, M.Tech, Assistant Professor in Department of ECE, SKR College of Engineering & Technology, Manubolu, SPSR Nellore, AP.
