

# A Review Articles of Booth Multiplier

Divya Rathore Asst. Prof. Priyanshu Pandey

Dept. of Electronics & Communication Engineering (VLSI stream)
Patel College of Science and Technology
Indore, MP, India

Abstract- This paper reviews different types of multiplier implementation. Multipliers play an important role in DSP, microprocessor and microcomputer applications. The Booth algorithm can be used to design a multiplier, but it suffers from limitations: as the number of partial products increases, the area, height and latency also increase. This review analyzes the modified Booth algorithm, which reduces the number of partial products. Different algorithms used for the addition operation of the multiplier are also explained. In the latest VLSI designs, power dissipation is a main concern, and reducing it is advantageous.

Keywords- Booth Multiplier, Modified Booth Multiplier, Adder, VLSI.

## I. INTRODUCTION

As the scale of integration keeps growing, more and more sophisticated signal processing systems are being implemented on a VLSI chip. These signal processing applications not only demand great computation capacity but also consume a considerable amount of energy. While performance and area remain the two major design goals, power consumption has become a critical concern in today's VLSI system design. The need for low-power VLSI systems arises from two main forces. First, with the steady growth of operating frequency and processing capacity per chip, large currents have to be delivered and the heat due to large power consumption must be removed by proper cooling techniques.

Second, battery life in portable electronic devices is limited. Low power design directly leads to prolonged operation time in these portable devices. Multiplication is a fundamental operation in most signal processing algorithms. Multipliers have large area, long latency and consume considerable power. Therefore low-power multiplier design has been an important part in low power VLSI system design. There has been extensive work on low-power multipliers at technology, physical, circuit and logic levels. A system's performance is generally determined by the performance of the multiplier because the multiplier is generally the slowest element in the system. Furthermore, it is generally the most area consuming. Hence, optimizing the speed and area of the multiplier is a major design issue.

However, area and speed are usually conflicting constraints, so improving speed mostly results in larger area. As a result, a whole spectrum of multipliers with different area-speed trade-offs has been designed, up to fully parallel implementations.

Booth's algorithm is an efficient way of multiplying signed numbers. It starts from the observation that, with the ability to both add and subtract, there are multiple ways to compute a product [5]. Booth's algorithm is a multiplication algorithm that uses the two's complement notation of signed binary numbers [9]. Earlier, multiplication was in general implemented via a sequence of addition, subtraction and shift operations. Multiplication can be thought of as a series of repeated additions. The number to be added is known as the multiplicand, the number of times it is added is known as the multiplier, and the result is the product. After each step of addition a partial product is generated. When the operands are integers, the product is in general twice the length of the operands in order to preserve the information content. The repeated-addition method suggested by the arithmetic definition is so slow that it is almost always replaced by an algorithm that makes use of positional representation.
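The add-and-subtract idea behind Booth's algorithm can be illustrated with a short behavioural sketch. This is not taken from any of the reviewed designs: the function name `booth_multiply` and the `bits` parameter are assumptions made for illustration, and the code models arithmetic only, not hardware timing. Each multiplier bit is compared with the bit to its right, and the shifted multiplicand is added, subtracted or skipped accordingly.

```python
def booth_multiply(multiplicand: int, multiplier: int, bits: int = 8) -> int:
    """Behavioural model of radix-2 Booth multiplication (illustrative only)."""
    low, high = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    if not (low <= multiplicand <= high and low <= multiplier <= high):
        raise ValueError("operands must fit in a signed word of the given width")

    product = 0
    previous_bit = 0  # implicit 0 to the right of the least significant bit
    for i in range(bits):
        # Python's >> on negative ints is arithmetic, so this reads the
        # two's complement bit pattern of the multiplier.
        current_bit = (multiplier >> i) & 1
        # Booth digit: +1 at the end of a run of ones, -1 at its start, else 0.
        digit = previous_bit - current_bit
        product += digit * (multiplicand << i)
        previous_bit = current_bit
    return product
```

For example, `booth_multiply(-3, 7)` evaluates the product of -3 and 7 without any special handling of the negative operand, which is the property the text attributes to two's complement Booth multiplication.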

We can decompose multipliers into two parts. The first part is dedicated to the generation of partial products, and the second part collects and adds them. The fundamental multiplication principle is thus twofold, i.e. evaluation of partial products and accumulation of the shifted partial products. It is performed by successive additions of the columns of the shifted partial product matrix. The multiplier is successively shifted and gates the appropriate bit of the multiplicand. The delayed, gated instances of the multiplicand must all be in the same column of the shifted partial product matrix. They are then added to form the product bit for that particular column. Multiplication is thus a multi-operand operation. To extend multiplication to both signed and unsigned numbers, a suitable number system is the representation of numbers in two's complement format.

Continuous advances in microelectronic technologies make better use of energy, encode data more effectively, transmit information more reliably, etc. In particular, many of these technologies address low power consumption to meet the requirements of various portable applications [5]. In these application systems, a multiplier is a fundamental arithmetic unit that is widely used in circuits. VHDL is one of the common tools in the digital system development process. The design is expressed as a program, and software performs simulation and analysis of the designed system. The designer only needs to describe the digital circuit in textual form, which can be changed without the effort of altering hardware. VHDL is preferred because it reduces cost and time, is easy to troubleshoot and portable, is supported by many software platforms and has plenty of available references. All the processes are run using Xilinx ISE 8.2i software, which means that the design is simulated only, without any hardware implementation. Because multipliers have large area, long latency and considerable power consumption, low-power multiplier design has been an important part of low-power VLSI system design [6]. Fast multipliers are essential parts of digital signal processing systems. The speed of the multiply operation is of great importance in digital signal processing as well as in today's general-purpose processors.

Multiplication is an important part of real-time digital signal processing (DSP) applications ranging from digital filtering to image processing. Lowering the power consumption and enhancing the processing performance of circuit designs are undoubtedly the two important design challenges of wireless multimedia and DSP applications, in which multiplications are frequently used for key computations such as FFT, DCT, quantization and filtering. All multiplication methods share the same basic procedure: addition of a number of partial products. A number of different methods can be used to add the partial products. The simple methods are easy to implement, but the more complex methods are needed to obtain the fastest possible speed.

The simplest method of adding a series of partial products is shown in Fig. 1. It is based upon an adder-accumulator, along with a partial product generator and a hard-wired shifter. This is relatively slow, because adding N partial products requires N clock cycles. The easiest clocking scheme is to make use of the system clock, if the multiplier is embedded in a larger system. The system clock is normally much slower than the maximum speed at which the simple iterative multiplier can be clocked, so if the delay is to be minimized, an expensive and tricky clock multiplier is needed, or the hardware must be self-clocking.
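The adder-accumulator scheme described above can be sketched behaviourally as follows. This is a minimal model under stated assumptions: the function name `shift_add_multiply`, the unsigned operands and the returned cycle count are illustrative choices, and one loop iteration stands in for one clock cycle.

```python
def shift_add_multiply(multiplicand: int, multiplier: int, bits: int = 8) -> tuple[int, int]:
    """Unsigned iterative multiplier model: one partial product per cycle."""
    limit = 1 << bits
    if not (0 <= multiplicand < limit and 0 <= multiplier < limit):
        raise ValueError("operands must be unsigned values of the given width")

    accumulator = 0
    cycles = 0
    for i in range(bits):
        # Partial product generator: gate the multiplicand with multiplier bit i.
        partial_product = multiplicand if (multiplier >> i) & 1 else 0
        # Hard-wired shift by i positions, then accumulate.
        accumulator += partial_product << i
        cycles += 1
    return accumulator, cycles
```

The second element of the returned tuple always equals `bits`, which mirrors the observation that adding N partial products takes N clock cycles in this scheme.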



Fig. 1 Tricky clock multiplier.

In this work, the Modified Booth Algorithm is used with detection logic, which suppresses needless power dissipation in the circuit and also increases speed. The Modified Booth Algorithm also reduces the hardware size of the circuit by halving the number of partial products.

## II. LOW POWER MULTIPLIERS

Developing low-power, high-speed and area-efficient portable electronic designs is a very challenging problem for hardware designers. Mobile phones, smart cards, hearing aids and PDAs are examples of portable consumer electronic products. The main concern in such products is not only the operating hours of the battery but also greater computational capacity. At the circuit level, voltage scaling, threshold voltage, transistor sizing, network restructuring, power-down strategies and logic style are used to achieve low power. In addition, these techniques also contribute to reducing propagation delay and area occupancy.

Digital Signal Processors (DSPs) are used to perform the most common operations such as video processing, filtering and fast Fourier transform (FFT). Such modules perform an extensive sequence of multiply and accumulate computations. Multiplication is the most fundamental operation of digital computer systems and digital signal processors. Multiplication consists of three steps: partial product generation (PPG), partial product reduction (PPR), and finally carry-propagate addition (CPA). In general there are sequential and combinational multiplier implementations. We only consider the combinational case here because the scale of integration is now large enough to accept parallel multiplier implementations in digital VLSI systems. Different multiplication algorithms vary in their approaches to PPG, PPR and CPA. For PPG, radix-2 is the easiest.
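The three steps named above can be sketched as a small behavioural model. Everything here is an illustrative assumption rather than a description of a reviewed design: the function names are hypothetical, operands are unsigned, PPG is radix-2, PPR uses 3:2 carry-save compression in an arbitrary grouping order, and the CPA is modelled by ordinary integer addition.

```python
def generate_partial_products(a: int, b: int, bits: int) -> list[int]:
    """PPG (radix-2): one shifted copy of a for every set bit of b."""
    return [(a << i) if (b >> i) & 1 else 0 for i in range(bits)]


def carry_save_add(x: int, y: int, z: int) -> tuple[int, int]:
    """3:2 compressor on whole words: x + y + z == sum_word + carry_word."""
    sum_word = x ^ y ^ z
    carry_word = ((x & y) | (y & z) | (x & z)) << 1
    return sum_word, carry_word


def reduce_partial_products(rows: list[int]) -> list[int]:
    """PPR: compress rows three at a time until at most two remain."""
    rows = list(rows)
    while len(rows) > 2:
        next_rows = []
        for k in range(0, len(rows) - 2, 3):
            next_rows.extend(carry_save_add(rows[k], rows[k + 1], rows[k + 2]))
        # Rows that did not fill a group of three pass through unchanged.
        next_rows.extend(rows[len(rows) - len(rows) % 3:])
        rows = next_rows
    return rows


def multiply(a: int, b: int, bits: int = 8) -> int:
    """Unsigned multiply as PPG, then PPR, then a final carry-propagate add."""
    rows = reduce_partial_products(generate_partial_products(a, b, bits))
    return sum(rows)  # CPA, modelled by ordinary integer addition
```

Only the last step propagates carries across the full word width; the reduction levels before it are carry-free, which is why the final adder is usually the part that needs a fast adder scheme.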

To reduce the number of PPs and consequently reduce the area/delay of PP reduction, one operand is usually recoded into high-radix digit sets. The most popular one is the radix-4 digit set {-2,-1, 0, 1, 2}. For PPR, two alternatives exist: reduction by rows, performed by an array of adders, and reduction by columns, performed by an array of counters. The final CPA requires a fast adder scheme because it is on the critical path. In some cases, final CPA is postponed if it is advantageous to keep redundant results from PPG for further arithmetic operations.
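As a rough illustration of radix-4 recoding, the sketch below converts a signed multiplier into digits from the set {-2, -1, 0, 1, 2} by examining overlapping groups of three bits. The function names and the `bits` parameter are hypothetical, and the word length is assumed to be even; the code models the arithmetic only.

```python
def radix4_booth_digits(multiplier: int, bits: int = 8) -> list[int]:
    """Recode a signed multiplier into bits/2 radix-4 Booth digits (LSD first)."""
    if bits % 2:
        raise ValueError("this sketch assumes an even word length")
    digits = []
    previous_bit = 0  # implicit 0 to the right of the least significant bit
    for j in range(0, bits, 2):
        b0 = (multiplier >> j) & 1
        b1 = (multiplier >> (j + 1)) & 1
        # Overlapping triplet (b1, b0, previous_bit) -> digit in {-2, ..., 2}.
        digits.append(previous_bit + b0 - 2 * b1)
        previous_bit = b1
    return digits


def radix4_booth_multiply(multiplicand: int, multiplier: int, bits: int = 8) -> int:
    """Sum one partial product per radix-4 digit."""
    digits = radix4_booth_digits(multiplier, bits)
    return sum(d * multiplicand * 4 ** j for j, d in enumerate(digits))
```

For an 8-bit multiplier the recoding yields four digits instead of eight bits, so only four partial products have to be reduced. Each digit needs at most a shift and a negation of the multiplicand, since the multiples involved are 0, ±1 and ±2.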

A large number of transistors with high switching activity is used to perform multiplication operations. In a 64-point radix-4 pipelined FFT processor, the multiplier consumes 30% of the power and occupies 46% of the chip area. The multiplier is the most critical, power-hungry arithmetic unit and requires the most area and computation time. Array-based multipliers consume less power than Wallace tree multipliers. Improving the performance of a tree-based multiplier requires additional hardware, at the cost of increased layout area and parasitics. The array multiplier, on the other hand, has a smaller and regular layout. The array multiplier is therefore a better choice, because it uses less hardware and its smaller area leads to fewer switching transitions. The adder is the fundamental unit of the multiplier and has a significant impact on the overall performance of the system in terms of power dissipation, delay and area occupancy. In this paper, the array multiplier is proposed to achieve low-power and high-speed multiplication.

## III. POWER OPTIMIZATION

Power refers to number of Joules dissipated over a certain amount of time whereas energy is the measure of the total number of Joules dissipated by a circuit. In digital CMOS design, the well-known power-delay product is commonly used to assess the merits of designs. In a sense, this can be shown as

$\text{power} \times \text{delay} = (\text{energy}/\text{delay}) \times \text{delay} = \text{energy}$, which implies that delay is irrelevant.
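The identity can be checked numerically. The sketch below uses the power (0.274 mW) and worst-case delay (423 ps) reported for the radix-4 8\*8 design reviewed in Section V; treating their product as the energy of one operation is an assumption made here purely for illustration, and the function name is hypothetical.

```python
def power_delay_product(power_watts: float, delay_seconds: float) -> float:
    """Power-delay product in joules (power x delay has the unit of energy)."""
    return power_watts * delay_seconds


# Figures quoted in Section V for the radix-4 8*8 Booth multiplier.
pdp_joules = power_delay_product(0.274e-3, 423e-12)
print(f"PDP = {pdp_joules * 1e15:.1f} fJ")
```

Because the result is an energy, two designs with very different delays can have the same power-delay product, which is the sense in which the metric makes delay irrelevant.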

## IV. MULTIPLIERS PARAMETERS

The multiplier is an essential component in almost every DSP system. It is one of the slowest and most area-demanding elements in DSP applications, and it determines the overall performance of the system. In today's era, faster devices with optimized power consumption are the requirement of every consumer [2]. Hence, improving the speed, power and area of the multiplier are the main design goals in fabricating a DSP system. Improving speed mainly results in larger area because of the trade-off between area and speed. If the speed and power consumption of the components of a device can be improved, the overall performance of the device will ultimately increase. In many digital circuits, the multiplier consumes most of the power and introduces delay [1]. There are various types of multipliers, e.g. the serial multiplier, parallel multiplier, array multiplier, Booth multiplier and Wallace tree multiplier, and each is described by a different algorithm and structure. These multipliers also have different performance parameters, and each of them can be further optimized to obtain better performance [1]. The objective of a good multiplier is to provide a physically compact, low-power and high-speed unit. In this paper, some techniques that result in low-power multipliers are discussed. In the last section, the SPST-based Wallace tree multiplier is described.

Parameters of a Multiplier

An efficient multiplier should have the following characteristics:

1. **Area-** A multiplier should occupy fewer slices and LUTs.
2. **Accuracy-** A good multiplier should give correct results.
3. **Speed-** A multiplier should perform an operation at high speed.
4. **Power-** A multiplier should consume less power.

The binary multiplication process has three main steps [2]:

- Generation of partial products (PP).
- Reduction of partial products.
- Final carry-propagate addition (CPA).

## V. ISSUES OF OLD ARTICLES

Sakthivel.B "Implementation of Booth Multiplier And Modified Booth Multiplier": This paper describes the implementation of an 8-bit Modified Booth Multiplier and compares it with a 4-bit Booth Multiplier. The Modified Booth algorithm employs both addition and subtraction and treats positive and negative operands uniformly. No special actions are required for negative numbers. The authors investigate how to implement a parallel MAC with the smallest possible delay. The parallel MAC is frequently used in digital signal processing and video/graphics applications. A new multiplier-and-accumulator (MAC) architecture for high-speed arithmetic is presented that combines multiplication with accumulation. The Modified Booth multiplication algorithm is implemented with a high-speed adder, which is used to speed up the multiplication operation.

Rishit Patel, "Implementation Of High Speed And Low Power Radix-4 8\*8 Booth Multiplier In Cmos 32nm Technology": According to Moore's law, the number of transistors integrated on a single chip doubles every 18 months, with a lot of new functionality embedded, which results in increased delay and power consumption of a chip. To improve the performance of more complicated digital circuit designs, faster and more power-efficient digital sub-components are urgently needed. Multipliers are key components in DSPs, GPUs and CPUs, which compute enormous amounts of binary data. A radix-4 8\*8 Booth multiplier is proposed and implemented in this thesis with the aim of reducing the power-delay product. Four stages with different architectures are used to implement this multiplier, in contrast to the traditional 8\*8 Booth multiplier. Instead of using an adder in stage 1, a binary-to-excess-one converter circuit and a 10-bit 2:1 MUX are used, which reduces power consumption by 23.76% and increases speed by 12.02% compared with stage 1 of the traditional 8\*8 Booth multiplier. The proposed design is implemented in CMOS 32nm technology at a 1.0 V supply. The worst-case delay of the proposed radix-4 8\*8 Booth multiplier at 2 Giga data rate is 423 picoseconds, with a power consumption of 0.274 milliwatt and a transistor count of 2860.
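A binary-to-excess-one converter adds one to its input using only XOR and AND gates, which is why it can stand in for a full adder when the second operand is the constant 1. The sketch below is a generic bit-level model of that idea together with a 2:1 multiplexer; it does not reproduce the wiring of the reviewed design. The function names, the 4-bit default width and the wrap-around on overflow are assumptions made for illustration.

```python
def binary_to_excess_one(value: int, bits: int = 4) -> int:
    """Bit-level model of a BEC: returns (value + 1) modulo 2**bits."""
    if not 0 <= value < (1 << bits):
        raise ValueError("value must be an unsigned number of the given width")
    result = 0
    all_lower_ones = 1  # AND of all lower input bits; 1 for bit position 0
    for i in range(bits):
        bit = (value >> i) & 1
        # Output bit i toggles exactly when every lower input bit is 1.
        result |= (bit ^ all_lower_ones) << i
        all_lower_ones &= bit
    return result


def mux2(select: int, input0: int, input1: int) -> int:
    """2:1 multiplexer: passes input1 when select is 1, otherwise input0."""
    return input1 if select else input0
```

A typical use of the pair, again as an assumption, is `mux2(select, value, binary_to_excess_one(value))`, which chooses between a word and its incremented version without a carry-propagate adder.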

Vasudeva G And Venkatesh S N, "Designing Of Modified Booth Encoder With Power Suppression Technique": The Booth multiplier algorithm provides a basis for designing a multiplier with greater efficiency and speed. Some processing is applied to the negative partial products. The algorithm gives a better level of encoding in the initial stage of radix-8 and radix-4 multiplication. The delay and LUT results are also improved by employing pipelining. As the data are passed through the pipeline stages, they are also processed in parallel. The results are thus improved by the use of pipelining.

The project titled "Designing of Modified Booth Encoder with power suppression technique" presents a useful technique for reducing area, delay and power consumption. Recent studies in the design of DSP systems have revealed a loss of performance due to large area utilization and delay. In the Radix-4 Modified Booth encoder, the number of partial products is reduced to half. Hence the number of adders is reduced, so the area consumption is reduced and the output delay is also reduced. The power suppression technique used along with the Modified Booth Encoder is SPST (Spurious Power Suppression Technique). This technique reduces glitches and avoids unwanted operations such as the addition of repeated zeros.

Therefore the power consumption will be less. Hence this project is useful in the field of signal processing and also very useful in portable systems. As integrated circuit technology has improved to allow more and more components on a chip, digital systems have continued to grow in complexity. As digital systems have become more complex, detailed design of the systems at the gate and flip-flop level has become very tedious and time consuming. For this reason, use of hardware description languages in the digital design process continues to grow in importance. A hardware description language allows a digital system to be designed and debugged at a higher level before conversion to the gate and flip-flop level. Digital signal processing is one of the core technologies in rapidly growing application areas such as wireless communications, audio and video processing, and industrial control. Digital signal processing (DSP) applications constitute the critical operations which usually involve many multiplications.

K. Srishylam, Prof. Syed Amjad Ali, M. Praveena, "Implementation of Hybrid CSA, Modified Booth Algorithm and Transient Power Minimization Techniques in DSP/Multimedia Applications": This work introduces the above three techniques to counter unnecessary power dissipation. Hybrid CSA is mostly adopted in multiplier circuits. Modified Booth encoding is adopted in the multiplier and VMFU. The transient power minimization method is applicable to the multiplier, VMFU and ETD (H.264). The method of hybrid CSA is best understood by applying it to multipliers [1]. Fast multipliers are essential parts of digital signal processing systems. The speed of the multiply operation is of great importance in digital signal processing as well as in today's general-purpose processors, especially since media processing took off. In the past, multiplication was generally implemented via a sequence of addition, subtraction and shift operations. Multiplication can be considered as a series of repeated additions. The number to be added is the multiplicand, the number of times that it is added is the multiplier, and the result is the product. Each step of addition generates a partial product. In most computers, the operands usually contain the same number of bits. When the operands are interpreted as integers, the product is generally twice the length of the operands in order to preserve the information content.

Minu Thomas, "Design and Simulation of Radix-8 Booth Encoder Multiplier for Signed and Unsigned Numbers": The multiplication operation is present in many parts of a digital system or digital computer, most notably in signal processing, graphics and scientific computation. With advances in technology, various techniques have been proposed to design multipliers that offer high speed, low power consumption and smaller area, making them suitable for various high-speed, low-power, compact VLSI implementations. These three parameters, i.e. power, area and speed, are always traded off. This thesis work is devoted to the design and simulation of a Radix-8 Booth Encoder multiplier for signed and unsigned numbers. The Radix-8 Booth Encoder circuit generates n/3 partial products in parallel. By extending the sign bit of the operands and generating an additional partial product, the signed-unsigned Radix-8 Booth Encoder multiplier is obtained. A Carry Save Adder (CSA) tree and a final Carry Look-Ahead (CLA) adder are used to speed up the multiplier operation. Since signed and unsigned multiplication is performed by the same multiplier unit, the required hardware and chip area are reduced, which in turn reduces the power dissipation and cost of the system. Verilog coding of the multiplier for signed and unsigned numbers using Radix-4 and Radix-8 Booth encoders for 8X8 bit multiplication, and FPGA implementation with the Xilinx Synthesis Tool on a Spartan 3 kit, have been done. The output has been displayed on the LEDs of the Spartan 3 kit.
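The n/3 figure for radix-8 encoding can be illustrated with a short recoding sketch. It is a generic model and not the reviewed Verilog design: the function name is hypothetical, only signed multipliers are handled, and the sign extension needed when the word length is not a multiple of three is obtained implicitly from Python's arithmetic right shift.

```python
def radix8_booth_digits(multiplier: int, bits: int = 8) -> list[int]:
    """Recode a signed multiplier into radix-8 Booth digits (LSD first)."""
    low, high = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    if not low <= multiplier <= high:
        raise ValueError("multiplier must fit in a signed word of the given width")
    digits = []
    previous_bit = 0  # implicit 0 to the right of the least significant bit
    for j in range(0, bits, 3):
        b0 = (multiplier >> j) & 1
        b1 = (multiplier >> (j + 1)) & 1
        b2 = (multiplier >> (j + 2)) & 1  # sign-extended beyond the top bit
        # Overlapping group of four bits -> digit in {-4, ..., 4}.
        digits.append(previous_bit + b0 + 2 * b1 - 4 * b2)
        previous_bit = b2
    return digits
```

An 8-bit multiplier yields three digits, so three partial products are generated instead of eight. The price is that the digits ±3 require three times the multiplicand, which cannot be formed by shifting alone.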

Our base paper is titled "High speed Modified Booth Encoder multiplier with signed and unsigned numbers". It addresses the multiplication of both signed and unsigned numbers. Many techniques exist for multiplying two numbers. The present Modified Booth Encoder multiplier and the Baugh-Wooley multiplier work only for signed-number multiplication, whereas the array multiplier and the Braun array multiplier [10] work only for unsigned numbers. A combined multiplier is therefore required in ALU design so that the speed of the microprocessor can be increased.

## VI. CONCLUSION

This paper reviews various high-speed multipliers built with different techniques and different types of adders. It also reviews different methods for high-accuracy fixed-width modified Booth multipliers. To reduce the chip area and delay, the partial product matrix of Booth multiplication is first slightly modified, and an effective area is then derived that reduces the delay and gives the fixed-width modified Booth multiplier very small mean errors.

## REFERENCES

- [1] Andrew D. Booth. A signed binary multiplication technique. The Quarterly Journal of Mechanics and Applied Mathematics, Volume IV, Pt. 2, 1951.
- [2] Divya Govekar, Ameeta Amonkar, "Design and implementation of High Speed Modified Booth Multiplier using Hybrid Adder", IEEE International Conference on Computing Methodologies and Communication (ICCMC), pp.978-1-5090-4890, 2017.
- [3] Jyoti Kalia et al. "A review of different methods of booth multiplier." International Journal of Engineering Research and Applications (IJERA), Volume 7, (2017): 2248-9622.
- [4] Luo, Tao, et al. "A racetrack memory based in-memory booth multiplier for cryptography application." Design Automation Conference (ASP-DAC), 2016 21st Asia and South Pacific. IEEE, 2016.
- [5] Elisardo Antelo et al. "Improved 64-bit radix-16 Booth Multiplier Based on Partial Product Array height Reduction." IEEE Transaction, 2016.
- [6] G. Haridas, David et al. "Area Efficient low Power Modified Booth Multiplier For FIR Filter." International Conference On Emerging Trends In Engineering, Science And Technology (ICETEST), Volume 24, (2016): 1163-1169.
- [7] R. Balakumaran, E. Prabhu, "Design of high speed multiplier using modified booth algorithm with hybrid carry look-ahead adder", IEEE International Conference on Circuit, Power and Computing Technologies (ICCPCT), pp. 1-7, 2016.
- [8] Liangyu Qian, Chenghua Wang, Weiqiang Liu, Fabrizio Lombardi, Jie Han, "Design and evaluation of an approximate Wallace-Booth multiplier", IEEE International Symposium on Circuits and Systems (ISCAS), pp. 1974-1977, 2016.
- [9] Honglan Jiang, Fei Qiao et al. "Approximate Radix-8 Booth Multipliers For Low-Power and high-Performance operation." IEEE Transaction, 2015.
- [10] Nagarjuna et al. "A Novel Architecture implementation Of Fir Filter Using Booth Multiplier." International Journal of Industrial Electronics And Electrical Engineering, Volume 2, (2014): 2347-6982.
- [11] B.S, Diana et al. "Modified Booth Multiplier With FIR Filter." International Journal Of Science And Research (IJSR), Volume 3, (2014): 2319-7064.