

## Design of High Speed ALU Using Adaptive Logic

T.Anjaiah<sup>1</sup> M.Tech & A.Kavitanya<sup>2</sup> B.Tech

<sup>1</sup> Assistant Professor Department of ECE, Aditya College of Engineering & Technology, Surampalem, Andhra Pradesh, India.

<sup>2</sup> Department of ECE, Aditya College of Engineering & Technology, Surampalem, Andhra Pradesh, India.

#### Abstract:

In this paper, we introduce a high-speed architecture for a 32-bit ALU using the adaptive logic technique. Adaptive logic is one of the fastest and most innovative logic styles implemented in digital circuits. It is implemented using CMOS technology and works effectively in both the threshold and sub-threshold regions. It uses timing-error-detection (TED) based systems, which have been shown to reduce power consumption or increase yield due to reduced margins. Reducing the supply voltage of a circuit slows its operation and incurs more delay, and this delay can result in errors depending on the operating conditions. The canary circuit was designed as an error detection and correction approach for reducing the power and voltage in a digital circuit. Adaptive logic is a modified canary circuit with add-on components, designed with a dual latch phase in each stage. A combination of an XOR gate and a flip-flop around each stage is added to verify correct operation. The entire architecture was modeled in Verilog HDL using the Xilinx ISE tool.

#### Keywords

TED, EDS, XOR.

#### **1. Introduction**

IC technology is becoming more complex day by day in terms of design and performance analysis. A fast design with low power consumption and small area is expected of modern electronic devices. In VLSI, energy efficiency has emerged as a critical design requirement. To obtain the maximum power savings, it is essential to scale the supply voltage down to the lowest value that still results in correct operation. If the supply voltage is insufficient, the circuit operates slowly; with sufficient voltage, correct operation is observed. Many energy-efficient design techniques have been proposed to reduce energy consumption. Reducing the voltage in the circuit results in slow operation that incurs more delay, which is the main drawback of reducing voltage. New approaches are continually being developed for low-power design at the technological, physical, circuit and logic levels. Several techniques, such as pipelining and parallel processing, have been proposed to allow a large reduction in voltage. However, these techniques add sequential elements to the circuit and divide a task into 'N' subtasks, which results in a large-area design. To reduce the area, timing margin reduction has been introduced to overcome the drawbacks of the previous methods. A major challenge in timing margin reduction methodologies is the increased probability of timing errors due to variations. In general, the variations can be categorized into two types:

1) spatial and

2) temporal variations.

Transistors on a die experience two types of spatial variations:

1. Global variation

2. Local variation

Global variation affects the electrical characteristics of the devices on a die in mostly the same way. Local variation, on the other hand, affects the transistor characteristics in a more unpredictable way due to randomness. With respect to time, variation also has two types:

1. Static variation

2. Temporal variation

The amount of static variation is decided during fabrication and does not change with time. Temporal variation, on the other hand, occurs due to environmental changes: temperature, supply voltage noise, and aging cause the transistors to experience time-dependent variability. To accommodate the potential increase in circuit delay caused by these variations, traditional design approaches add more timing margin.

When the performance of circuits is compared, it is always done in terms of circuit speed, size and power. A good estimate of a circuit's size is the total number of gates used. The actual chip size of a circuit also depends on how the gates are placed on the chip, that is, on the circuit's layout. Since we do not deal with layout in this paper, the only thing we can say is that regular circuits are usually smaller than non-regular ones (for the same number of gates), because regularity allows a more compact layout. The physical delay of a circuit originates from the small delays in single gates and from the wiring between them. The delay of a wire depends on its length, so wiring delay is difficult to model; it requires knowledge of the circuit's layout on the chip. The gate delay, however, can easily be modelled by saying that the output is delayed by a constant amount of time from the latest input. What we can say about wiring delay is that larger circuits have longer wires and hence more wiring delay. It follows that a circuit with a regular layout usually has shorter wires and hence less wiring delay than a non-regular circuit. Therefore, if circuit delay is estimated as the total gate delay, one should also keep in mind the circuit's size and regularity when comparing it to other circuits. "Delay" usually refers to the "worst-case delay". That is, if the delay of the output depends on the inputs given, it is always the largest possible output delay that sets the speed. Furthermore, if different bits of the output have different worst-case delays, it is always the slowest bit that sets the delay for the whole output. The slowest path between any input bit and any output bit is called the "critical path". If a circuit is to be sped up, it is always the critical path that should be attacked first.
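The constant-gate-delay model and the notion of a critical path can be illustrated with a minimal Python sketch. The netlist below (a one-bit full adder), the signal names and the unit gate delay are assumptions chosen for illustration; they are not taken from the ALU design in this paper, and wiring delay is ignored.

```python
# Hypothetical netlist: each gate output maps to the signals feeding it.
# Primary inputs (a, b, cin) are not keys, so they arrive at time 0.
NETLIST = {
    "x1": ["a", "b"],       # a XOR b
    "sum": ["x1", "cin"],   # x1 XOR cin
    "g1": ["a", "b"],       # a AND b
    "g2": ["x1", "cin"],    # x1 AND cin
    "cout": ["g1", "g2"],   # g1 OR g2
}
GATE_DELAY = 1.0  # assumed constant delay per gate, arbitrary units


def arrival_time(signal):
    """Worst-case arrival time: latest input plus one gate delay."""
    if signal not in NETLIST:
        return 0.0
    return GATE_DELAY + max(arrival_time(s) for s in NETLIST[signal])


def critical_path(signal):
    """Trace back through the latest-arriving input at every gate."""
    path = [signal]
    while signal in NETLIST:
        signal = max(NETLIST[signal], key=arrival_time)
        path.append(signal)
    return list(reversed(path))


outputs = ["sum", "cout"]
delays = {o: arrival_time(o) for o in outputs}
slowest = max(outputs, key=delays.get)   # the slowest output bit
worst_case_delay = delays[slowest]       # sets the delay of the whole circuit
print(slowest, worst_case_delay, critical_path(slowest))
```

In this sketch the slowest output bit determines the delay of the whole circuit, and `critical_path` returns the chain of signals that would have to be shortened to speed it up.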

#### 2. Block Diagram:

Figure 1: Block diagram of ALU

An ALU is a combinational logic circuit, meaning that its outputs will change asynchronously in response to input changes. In normal operation, stable signals are applied to all of the ALU inputs and, when enough time (known as the "propagation delay") has passed for the signals to propagate through the ALU circuitry, the result of the ALU operation appears at the ALU outputs. The external circuitry connected to the ALU is responsible for ensuring the stability of ALU input signals throughout the operation, and for allowing sufficient time for the signals to propagate through the ALU before sampling the ALU result.
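As a rough behavioral illustration of this combinational view, the following Python sketch models a 32-bit ALU as a pure function of its inputs. The operation set, the opcode names and the flag outputs are assumptions made for the example; the paper does not list the operations of its ALU.

```python
MASK32 = 0xFFFFFFFF  # keep every result within 32 bits

# Hypothetical opcode table; the paper does not specify its operation set.
OPS = {
    "ADD": lambda a, b: a + b,
    "SUB": lambda a, b: a - b,
    "AND": lambda a, b: a & b,
    "OR": lambda a, b: a | b,
    "XOR": lambda a, b: a ^ b,
}


def alu32(op, a, b):
    """Return (result, carry_out, zero) for 32-bit unsigned operands.

    The function holds no state: the outputs depend only on the
    current inputs, as in a combinational circuit.
    """
    raw = OPS[op](a & MASK32, b & MASK32)
    result = raw & MASK32                      # wrap to 32 bits
    carry_out = (raw >> 32) & 1 if op == "ADD" else 0
    zero = int(result == 0)
    return result, carry_out, zero


print(alu32("ADD", 0xFFFFFFFF, 1))  # wraps around and sets the carry
print(alu32("SUB", 0, 1))           # two's-complement wrap-around
```

The model has no notion of propagation delay; it only captures the input-to-output mapping that the surrounding circuitry samples once the signals have settled.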

#### 3. Working:

This section describes the architecture of the high-speed ALU circuit using adaptive logic. Delay is a common problem for every circuit. To address it, canary or replica circuits are used to target the delay of the real critical path with some added margin. A replica circuit is a collection of digital gates with tunable delays. Its drawback is that it is suitable only for a small number of pipeline stages: as the number of stages increases, local variations at threshold and sub-threshold voltages cause significant delay differences between the replica path and the actual path. To overcome this problem, the TED circuit is introduced.

A TED-based system is more effective at removing the timing margins incurred by variation. This technique has the benefit of tracking the real path delay, which is not possible with replica circuits. A TED system uses Error-Detection-Sequential (EDS) circuits, which generate error signals when the path setup time fails. In this system, TED is combined with Time Borrowing (TB) to form a Time Error Prevention (TEP) system that can tolerate late-arriving signals without requiring any additional circuitry. TED + TB = TEP, which is more effective than conventional TED. In the proposed concept, the EDS circuitry is composed of several pipeline stages embedded in parallel. A dual latch phase is applied to every stage, at both input and output, for the purpose of time borrowing. The output of any stage may be late, in which case time is borrowed from stage n+1.
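The effect of time borrowing across pipeline stages can be sketched with a simple timing model in Python. All numbers (stage delays, clock period and borrowing window, in picoseconds) and the function name `propagate` are hypothetical; the model only assumes that a late result may use part of the next stage's cycle, up to the width of the latch's transparent window.

```python
def propagate(stage_delays, period, borrow_window):
    """Return (lateness per stage, ok flag) for a latch-based pipeline.

    lateness[i] is how far past the clock edge the data of stage i
    settles; that much time is borrowed from stage i+1.
    ok is False if any stage is later than the borrowing window allows.
    """
    lateness = []
    borrowed = 0  # time already consumed by the previous stage
    ok = True
    for delay in stage_delays:
        late = max(0, borrowed + delay - period)
        if late > borrow_window:
            ok = False  # too late even for the transparent latch phase
        lateness.append(late)
        borrowed = late
    return lateness, ok


# Five stages; stage 2 is slow but the faster stage 3 absorbs the lateness.
print(propagate([900, 1200, 700, 1000, 800], period=1000, borrow_window=300))
# Back-to-back slow stages accumulate lateness beyond the window.
print(propagate([900, 1200, 1000, 1150, 800], period=1000, borrow_window=300))
```

In the first call the late stage is tolerated because the following stage has slack; in the second call lateness accumulates until it exceeds the window, which is the situation an error signal would have to flag.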



Figure 2: 5-stage circuit



Available at <a href="https://pen2print.org/index.php/ijr/">https://pen2print.org/index.php/ijr/</a>

e-ISSN: 2348-6848 p-ISSN: 2348-795X Volume 05 Issue 23 December 2018



Figure 3: Adaptive Logic Circuit

A combination of an XOR gate and a flip-flop around every stage is added to verify correct operation. Delay can be recovered using the TB technique: time is borrowed from stage n+1 and recovered at the output without any additional delay. If a particular stage suffers from delay, the circuit automatically detects the delay and reduces it. Because this operation is performed in every stage, the overall delay is reduced.
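One way to picture the XOR and flip-flop check is the following Python sketch of a single stage. It is an assumed behavioral model, not the circuit of Figure 3: the flip-flop samples the data at the clock edge, the latch passes whatever value is present at the end of its transparent window, and the XOR of the two flags a late transition. The function names and timing values are hypothetical.

```python
def sample(old, new, transition_time, t):
    """Value seen on the data line at time t."""
    return new if t >= transition_time else old


def eds_stage(old, new, transition_time, period, window):
    """Model one stage: edge sample, latch output and XOR error flag."""
    ff = sample(old, new, transition_time, period)              # flip-flop at the clock edge
    latch = sample(old, new, transition_time, period + window)  # end of transparent phase
    error = ff ^ latch  # 1 when the data changed after the clock edge
    return latch, error


print(eds_stage(0, 1, transition_time=900, period=1000, window=300))   # on time
print(eds_stage(0, 1, transition_time=1200, period=1000, window=300))  # late, borrowed
print(eds_stage(0, 1, transition_time=1400, period=1000, window=300))  # beyond the window
```

In this simplified model a transition inside the window still yields the correct latch output while raising the error flag, whereas a transition after the window is missed entirely, which is why the window must cover the expected lateness.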

#### 4. Results:



Figure 4: Simulation result of 32-bit ALU



Figure 5: Simulation result of 32-bit ALU connected to Adaptive circuit



Figure 6: RTL schematic of ALU

Figure 7: Technology Schematic View of ALU

Figure 8: RTL Schematic View of 32-bit ALU connected to Adaptive circuit






Figure 9: Technology Schematic View of 32-bit ALU connected to Adaptive circuit

### 4.1. Timing Details Of ALU:

Timing constraint: Default OFFSET OUT AFTER for Clock 'clk'

Total number of paths / destination ports: 512 / 512

| Field        | Value                          |
|--------------|--------------------------------|
| Offset       | 4.395ns (Levels of Logic = 1)  |
| Source       | z_511 (LATCH)                  |
| Destination  | z<511> (PAD)                   |
| Source Clock | clk falling                    |
| Data Path    | z_511 to z<511>                |

| Cell:in->out | Fanout | Gate Delay (ns) | Net Delay (ns) | Logical Name (Net Name) |
|--------------|--------|-----------------|----------------|-------------------------|
| LD:G->Q      | 2      | 0.676           | 0.447          | z_511 (z_511)           |
| OBUF:I->O    |        | 3.272           |                | z_511_OBUF (z<511>)     |

Total: 4.395ns (3.948ns logic, 0.447ns route) (89.8% logic, 10.2% route)

### 4.2. Timing Details Of Adaptive Connected ALU:

Timing constraint: Default OFFSET OUT AFTER for Clock 'clk'

Total number of paths / destination ports: 512 / 512

| Field        | Value                          |
|--------------|--------------------------------|
| Offset       | 0.811ns (Levels of Logic = 1)  |
| Source       | z_510_1 (LATCH)                |
| Destination  | z<511> (PAD)                   |
| Source Clock | clk falling                    |
| Data Path    | z_510_1 to z<511>              |

| Cell:in->out | Fanout | Gate Delay (ns) | Net Delay (ns) | Logical Name (Net Name) |
|--------------|--------|-----------------|----------------|-------------------------|
| LDE:G->Q     | 1      | 0.472           | 0.339          | z_510_1 (z_510_1)       |
| OBUF:I->O    |        | 0.000           |                | z_511_OBUF (z<511>)     |

Total: 0.811ns (0.472ns logic, 0.339ns route) (58.2% logic, 41.8% route)

#### Table 1: Comparison of Delay

| Structure                                  | Delay   |
|--------------------------------------------|---------|
| 32-bit ALU                                 | 4.395ns |
| 32-bit ALU connected to adaptive circuit   | 0.811ns |
The delay of the proposed design is lower than that of the plain 32-bit ALU, so the proposed method achieves a higher circuit speed.
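The size of the improvement can be worked out directly from the two reported offsets. The short Python calculation below uses only the figures from Table 1; the variable names are illustrative, and the comparison is between the two reported offset paths only.

```python
baseline_ns = 4.395  # 32-bit ALU (Section 4.1)
adaptive_ns = 0.811  # 32-bit ALU connected to adaptive circuit (Section 4.2)

reduction_pct = 100 * (baseline_ns - adaptive_ns) / baseline_ns
speedup = baseline_ns / adaptive_ns

print(f"delay reduction: {reduction_pct:.1f}%")  # delay reduction: 81.5%
print(f"speedup: {speedup:.2f}x")                # speedup: 5.42x
```

On these numbers the reported offset drops by roughly 81.5%, which corresponds to a path that is about 5.4 times faster.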

#### **5. Conclusion**

Insufficient supply voltage causes delay in the circuit and results in incorrect outputs. A switched-capacitor (SC) DC-DC converter is used to supply sufficient voltage. Delay generally results in errors; to reduce those errors, the canary circuit is designed to handle them in a single shot without the use of stages. Adding a dual latch phase to the circuit improves its performance over the previous method. Using a dual latch phase, the canary circuit, and a combination of an XOR gate and a flip-flop around every stage to verify correct operation results in improved output compared with the previous method. The delay of the 32-bit ALU is 4.395 ns. With the proposed adaptive logic, this delay is reduced to 0.811 ns with the same output.
