In a conventional ripple-carry adder, the carry input of each full adder has to wait for the carry output of the previous stage, so the propagation delay grows with the number of bits. Since all other arithmetic operations are implemented by successive additions, the time consumed by the addition process is critical. There are several ways to reduce the carry propagation time in a parallel adder. The most widely used employs the principle of CARRY LOOKAHEAD logic.

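Considering the circuit of the FULL ADDER, if we define two new binary variables, the carry propagate $P_i$ and the carry generate $G_i$,

$$P_i = A_i \oplus B_i, \qquad G_i = A_i B_i$$

The output sum and carry can be represented as

$$S_i = P_i \oplus C_i, \qquad C_{i+1} = G_i + P_i C_i$$

So the carry output of each stage is given by

$$C_1 = G_0 + P_0 C_0$$

$$C_2 = G_1 + P_1 C_1 = G_1 + P_1 G_0 + P_1 P_0 C_0$$

$$C_3 = G_2 + P_2 C_2 = G_2 + P_2 G_1 + P_2 P_1 G_0 + P_2 P_1 P_0 C_0$$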

Note that this circuit can add in less time, since C3 does not have to wait for C2 and C1 to propagate. In fact, C3 is propagated at the same time as C2 and C1, because every carry is computed directly from the input bits and the input carry C0.
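As a worked check of the lookahead idea, the sketch below evaluates the standard carry lookahead equations for one pair of operands. The operands A = 0111 and B = 0001 with C0 = 0 are chosen here for illustration and are not from the original text; "+" denotes logical OR and juxtaposition denotes AND.

```latex
\begin{align*}
A &= 0111,\quad B = 0001,\quad C_0 = 0 \\
P &= A \oplus B = 0110 \;\Rightarrow\; P_3 = 0,\ P_2 = 1,\ P_1 = 1,\ P_0 = 0 \\
G &= A \cdot B = 0001 \;\Rightarrow\; G_3 = 0,\ G_2 = 0,\ G_1 = 0,\ G_0 = 1 \\
C_1 &= G_0 + P_0 C_0 = 1 + 0 \cdot 0 = 1 \\
C_2 &= G_1 + P_1 G_0 + P_1 P_0 C_0 = 0 + 1 \cdot 1 + 0 = 1 \\
C_3 &= G_2 + P_2 G_1 + P_2 P_1 G_0 + P_2 P_1 P_0 C_0 = 0 + 0 + 1 \cdot 1 \cdot 1 + 0 = 1 \\
C_4 &= G_3 + P_3 C_3 = 0 + 0 \cdot 1 = 0 \\
S_i &= P_i \oplus C_i \;\Rightarrow\; S_3 S_2 S_1 S_0 = (0 \oplus 1)(1 \oplus 1)(1 \oplus 1)(0 \oplus 0) = 1000
\end{align*}
```

The result 1000 (8) with C4 = 0 matches 7 + 1. Note that C3 is 1 only because of the term P2 P1 G0, so every product term in the expanded carry expressions matters.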

So the circuit diagram for the CARRY LOOKAHEAD GENERATOR is

VERILOG CODE:

module CLA(
    input [3:0] a, b,   // INPUTS TO BE ADDED
    input c0,           // INPUT CARRY
    output [3:0] s,     // SUM OUTPUT
    output c4);         // OUTPUT CARRY

    wire c1, c2, c3;
    wire [3:0] p, g;    // carry propagate and carry generate

    // Using the above equations
    xor (p[0], a[0], b[0]);
    xor (p[1], a[1], b[1]);
    xor (p[2], a[2], b[2]);
    xor (p[3], a[3], b[3]);

    and (g[0], a[0], b[0]);
    and (g[1], a[1], b[1]);
    and (g[2], a[2], b[2]);
    and (g[3], a[3], b[3]);

    // Each carry depends only on p, g and c0, never on another carry
    assign c1 = g[0] | (p[0] & c0);
    assign c2 = g[1] | (p[1] & g[0]) | (p[1] & p[0] & c0);
    assign c3 = g[2] | (p[2] & g[1]) | (p[2] & p[1] & g[0])
                     | (p[2] & p[1] & p[0] & c0);
    assign c4 = g[3] | (p[3] & g[2]) | (p[3] & p[2] & g[1])
                     | (p[3] & p[2] & p[1] & g[0])
                     | (p[3] & p[2] & p[1] & p[0] & c0);

    xor (s[0], p[0], c0);
    xor (s[1], p[1], c1);
    xor (s[2], p[2], c2);
    xor (s[3], p[3], c3);

endmodule

So the VERILOG code is a direct translation of the expressions discussed in the theory part of the CARRY LOOKAHEAD GENERATOR.

RTL SCHEMATIC

As can be seen from the RTL schematic, the output c4 is a function of c0, a and b only, which are available at all times, i.e. it does not have to wait for c2 and c3 to propagate.

TEST BENCH:

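// Test bench wrapper; the module name CLA_tb is an assumption
module CLA_tb;

    // Inputs
    reg [3:0] a, b;
    reg c0;

    // Outputs
    wire [3:0] s;
    wire c4;

    // Instantiate the Unit Under Test (UUT)
    CLA uut (.a(a), .b(b), .c0(c0), .s(s), .c4(c4));

    initial begin
        // First test vector: 5 + 10
        a = 4'b0101;
        b = 4'b1010;
        c0 = 0;

        // Wait 100 ns before applying the next vector
        #100;

        // Second test vector: 5 + 2
        a = 4'b0101;
        b = 4'b0010;
        c0 = 0;

        // Wait 100 ns so the result can be observed
        #100;
    end

endmodule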
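The two test vectors have no overlapping 1 bits and c0 = 0, so no carry is ever generated and the lookahead logic is not really exercised. A more thorough check could sweep all 512 input combinations and compare against the built-in `+` operator. The following is a minimal sketch, not the original author's test bench: the module names `cla4_sketch` and `cla4_sketch_tb` are hypothetical, and a compact adder with the same ports as CLA is included only so that the block simulates on its own.

```verilog
// Compact 4-bit carry lookahead adder using vector operators.
// The name cla4_sketch is hypothetical; the ports mirror the CLA module.
module cla4_sketch(
    input  [3:0] a, b,
    input        c0,
    output [3:0] s,
    output       c4);

    wire [3:0] p = a ^ b;   // carry propagate
    wire [3:0] g = a & b;   // carry generate

    // Every carry is a two-level function of p, g and c0
    wire c1 = g[0] | (p[0] & c0);
    wire c2 = g[1] | (p[1] & g[0]) | (p[1] & p[0] & c0);
    wire c3 = g[2] | (p[2] & g[1]) | (p[2] & p[1] & g[0])
                   | (p[2] & p[1] & p[0] & c0);
    assign c4 = g[3] | (p[3] & g[2]) | (p[3] & p[2] & g[1])
                     | (p[3] & p[2] & p[1] & g[0])
                     | (p[3] & p[2] & p[1] & p[0] & c0);

    assign s = p ^ {c3, c2, c1, c0};
endmodule

// Exhaustive self-checking test bench: 16 x 16 x 2 = 512 combinations.
module cla4_sketch_tb;
    reg  [3:0] a, b;
    reg        c0;
    wire [3:0] s;
    wire       c4;
    reg  [4:0] expected;    // 5 bits so the carry out is kept
    integer    i, errors;

    cla4_sketch dut(.a(a), .b(b), .c0(c0), .s(s), .c4(c4));

    initial begin
        errors = 0;
        for (i = 0; i < 512; i = i + 1) begin
            {a, b, c0} = i[8:0];
            expected   = a + b + c0;
            #10;            // let the combinational outputs settle
            if ({c4, s} !== expected) begin
                errors = errors + 1;
                $display("MISMATCH a=%b b=%b c0=%b got=%b expected=%b",
                         a, b, c0, {c4, s}, expected);
            end
        end
        if (errors == 0)
            $display("PASS: all 512 combinations match");
        else
            $display("FAIL: %0d mismatches", errors);
        $finish;
    end
endmodule
```

Replacing `cla4_sketch` with `CLA` in the instantiation applies the same sweep to the module from the VERILOG CODE section. Vectors such as a = 0111, b = 0001 are the interesting ones, because the carry has to pass through several propagate stages.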

OUTPUT:

DELAY ANALYSIS:

From the synthesis report, it can be seen that the logic delay of the CLA is only 7.061 ns, which is far less than that of a conventional ripple-carry adder.
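To reproduce the comparison, a ripple-carry baseline with the same ports can be synthesized under the same settings. The following is a minimal sketch under that assumption; the module names `full_adder_sketch` and `rca4_sketch` are hypothetical and do not come from the original text. The measured difference depends on the synthesis tool and target device, since the tool may restructure either design.

```verilog
// One-bit full adder used as the building block of the ripple chain.
// Module names here are illustrative only.
module full_adder_sketch(
    input  a, b, cin,
    output s, cout);

    assign s    = a ^ b ^ cin;
    assign cout = (a & b) | (cin & (a ^ b));
endmodule

// 4-bit ripple-carry adder: each stage waits for the previous carry.
module rca4_sketch(
    input  [3:0] a, b,
    input        c0,
    output [3:0] s,
    output       c4);

    wire c1, c2, c3;   // carries ripple from stage to stage

    full_adder_sketch fa0(.a(a[0]), .b(b[0]), .cin(c0), .s(s[0]), .cout(c1));
    full_adder_sketch fa1(.a(a[1]), .b(b[1]), .cin(c1), .s(s[1]), .cout(c2));
    full_adder_sketch fa2(.a(a[2]), .b(b[2]), .cin(c2), .s(s[2]), .cout(c3));
    full_adder_sketch fa3(.a(a[3]), .b(b[3]), .cin(c3), .s(s[3]), .cout(c4));
endmodule
```

Here c4 depends on c3, which depends on c2, and so on, so the worst-case path passes through all four stages, whereas in the lookahead version every carry is computed directly from a, b and c0.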