Mixed PTL/Static Logic Synthesis
Project Page
Static CMOS logic has long been used to realize VLSI systems because of its ease of use and well-developed synthesis support. With power increasingly a limiting factor in
high-density and high-performance VLSI designs, a great deal of effort has been made to explore low-power design options without
sacrificing performance.
At the circuit level, mixing PTL with static CMOS has been proposed [1,2,3,4] as an alternative low-power circuit style. Unlike conventional PTL
design, where dedicated buffers are inserted in pass-transistor
trees to restore driving strength, mixed PTL allocates a certain number of static gates at optimal locations within pass-transistor
trees to boost driving strength as well as to perform logic
functions. Consequently, the performance and power consumption of mixed PTL circuits are further improved over those of conventional
PTL circuits.
Designing mixed PTL circuits for low power and high performance depends on two tasks: selection of PTL cells and a synthesis
technique to produce a mixed PTL structure. The choice of PTL
cells directly affects the Boolean matching process in the synthesis phase and has a significant impact on the overall quality of
the final synthesized results. In [2,3], nMOS-only and pMOS-only pass-transistor trees are used in PTL
cells to reduce cell size. The full rail-to-rail swing of the output signal is restored by extra level-restoring logic at
the output of a PTL gate. The level-restoring logic at the output of PTL gates not only slows down the PTL gates due
to potential drive-fights, but also increases their power consumption. A greedy search algorithm was proposed in
[2] to determine the best mappings for PTL cells and static cells. PTL cells are only used for implementing MUX and
XOR/XNOR type logic functions and static gates are used to implement all the remaining logic functions. The techniques
proposed in [3] use dynamic programming to map more complex Boolean sub-functions to PTL gates. The search for optimal
solutions in [3] was applied only within a sub-tree of a given circuit represented by a DAG. Sub-trees are separated by
multiple fanout points in the DAG. The cell interactions between sub-trees were not considered and no attempt was made to optimize
the overall circuit at the global level. In addition, their results did not specify the technology used to get the area and
performance data.
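The sub-tree decomposition described above can be illustrated with a minimal sketch. It assumes the circuit is given as a dictionary mapping each gate to its fanin list; the function name `split_into_subtrees` and the example netlist are hypothetical and are not taken from [3] or from the proposed tool.

```python
def split_into_subtrees(fanins, primary_outputs):
    """Partition a DAG into sub-trees separated by multiple-fanout points.

    fanins maps each gate to the list of signals feeding it; signals that
    are not keys of fanins are treated as primary inputs (an assumption).
    """
    # Count how many gates each signal drives.
    fanout = {}
    for inputs in fanins.values():
        for src in inputs:
            fanout[src] = fanout.get(src, 0) + 1

    # Sub-tree roots: primary outputs and gates that drive more than one gate.
    roots = set(primary_outputs)
    roots |= {n for n, k in fanout.items() if k > 1 and n in fanins}

    subtrees = {}
    for root in roots:
        members, stack = set(), [root]
        while stack:
            node = stack.pop()
            members.add(node)
            for src in fanins[node]:
                # Stop at primary inputs and at the roots of other sub-trees.
                if src in fanins and src not in roots:
                    stack.append(src)
        subtrees[root] = members
    return subtrees


# Hypothetical netlist: g1 feeds both g2 and g3, so it is a multiple-fanout point.
example_fanins = {
    "g1": ["a", "b"],
    "g2": ["g1", "c"],
    "g3": ["g1", "d"],
    "out": ["g2", "g3"],
}
example_subtrees = split_into_subtrees(example_fanins, ["out"])
```

In this example the DAG splits into two sub-trees, one rooted at `out` and one rooted at `g1`. A mapper that optimizes each sub-tree separately never sees a match that spans the boundary at `g1`.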
We present a single-rail and dual-rail mixed PTL/Static logic synthesis method and compare the results of the
synthesized single-rail and dual-rail mixed PTL circuits to those of conventional static CMOS circuits synthesized using a
commercial logic synthesis tool from Synopsys in an experimental 0.1 µm technology. For example, the experimental results of
single-rail and dual-rail mixed PTL/Static on ISCAS85 benchmark circuits using the proposed
method in 0.1 µm bulk CMOS technology are 73% and 50% better than their conventional static CMOS counterparts in power
consumption, with performance gains of 5% and 10%, respectively.
2. Mixed PTL/Static CMOS Logic Synthesis Flow

3. Covering Using Genetic Algorithms
The existing optimization techniques that use dynamic
programming [3,5] for covering are sub-optimal because the optimization is restricted to sub-trees
separated by multiple-fanout points of a given circuit. The cell interactions between sub-trees were not considered and no attempt
was made to optimize the overall circuit at the global level. In [7], a covering method that can be applied to a DAG was proposed. However, this method has a negative impact on
power consumption because of gate duplication during the DAG covering process. To overcome this drawback, we use a genetic
algorithm (GA) to select the best mapped circuit at the global level from the matches generated in the
matching stage.
3.1 Overview of the Proposed Covering Algorithm
3.2 Crossover Operation
The main purpose of the crossover operation is to introduce new mapped solutions that differ from the solutions in the current
sub-population so that the cost of the children moves in the desired direction.
In the proposed synthesis flow, the objective is to minimize the cost, defined as the total transistor area under delay constraints.
The crossover operation is applied to two parents
(sub-chromosomes), parent1 and parent2, of a sub-population. These parents are randomly selected from the sub-population in
such a way that the sub-chromosomes with lower cost are more
likely to be selected. Crossover is applied to sub-chromosomes rather than to whole chromosomes because
the entire circuit is decomposed at multiple-fanout points. The two new sub-chromosomes that are created
by applying crossover are called children, child1 and child2.
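The text states only that lower-cost sub-chromosomes are more likely to be selected; it does not give the selection rule. The sketch below assumes one common choice, roulette-wheel selection with weights proportional to the inverse of the cost. The function name `select_parents` and the two-member population are hypothetical.

```python
import random
from collections import Counter


def select_parents(sub_population, costs, rng=random):
    """Pick two parents at random, favoring sub-chromosomes with lower cost.

    Assumes every cost is positive. The weight 1/cost is an illustrative
    choice, not the rule used in the original work.
    """
    weights = [1.0 / cost for cost in costs]
    # Sampling is with replacement, so the same parent may be drawn twice.
    parent1, parent2 = rng.choices(sub_population, weights=weights, k=2)
    return parent1, parent2


# Hypothetical sub-population with two members of very different cost.
rng = random.Random(0)
population = ["cheap", "costly"]
costs = [10.0, 100.0]
picks = Counter()
for _ in range(1000):
    picks.update(select_parents(population, costs, rng))
```

With these weights the low-cost member is drawn roughly ten times as often as the high-cost one, while the high-cost member still has a chance to contribute genes.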
Figure 3 shows the concept of the crossover operation. Figures 3 (a) and (b) are the two parents, parent1 and parent2, respectively. A crossover point,
cp, must be selected for each parent before the crossover operation. The crossover point must
be the same in each parent; otherwise, the validity of each child after crossover cannot be guaranteed because nodes around
cp can be duplicated. Once the crossover point is determined, the crossover operation is
performed by exchanging the fanin cones rooted at cp of parent1 and
parent2, as shown in Figures 3 (c) and (d). Any child chromosome that is invalid according to the covering rules is
discarded. Before the validity of the children is decided, the merging (also called re-mapping) process is performed by checking
the two nodes around crossover point cp. For example, as shown in Figures 3 (c) and (d), the static (PTL) nodes s1 (p1) and s2
(p2) can be merged if the matches at node s1 (p1) contain a static (PTL) match that
covers both node s1 (p1) and node s2 (p2). Figures 3 (e) and (f) show the children after merging the nodes around the crossover point
cp (not shown in the figures). The merged nodes for static and PTL in the figure are
sm and pm, respectively. Merging two cells into one cell improves power and
performance because the effective area is reduced.
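The exchange of fanin cones can be sketched as follows. This is a simplified illustration that assumes a sub-chromosome is a dictionary mapping each node of the sub-tree to the kind of match chosen there, and that the cone includes cp itself. The names `fanin_cone` and `crossover` are hypothetical, and the validity check and the merging step described above are omitted.

```python
def fanin_cone(fanins, cp):
    """Return cp together with every node that transitively feeds it."""
    cone, stack = set(), [cp]
    while stack:
        node = stack.pop()
        if node in cone:
            continue
        cone.add(node)
        stack.extend(fanins.get(node, []))
    return cone


def crossover(parent1, parent2, fanins, cp):
    """Exchange the genes in the fanin cone rooted at the shared point cp."""
    cone = fanin_cone(fanins, cp)
    child1, child2 = dict(parent1), dict(parent2)
    for node in cone:
        # Primary inputs carry no gene, so only mapped nodes are swapped.
        if node in parent1 and node in parent2:
            child1[node], child2[node] = parent2[node], parent1[node]
    return child1, child2


# Hypothetical sub-tree: n3 is the root, n1 and n2 are its fanins.
tree_fanins = {"n3": ["n1", "n2"], "n1": ["a", "b"], "n2": ["c", "d"]}
parent1 = {"n1": "static", "n2": "static", "n3": "static"}
parent2 = {"n1": "PTL", "n2": "PTL", "n3": "PTL"}
child1, child2 = crossover(parent1, parent2, tree_fanins, "n1")
```

Because both parents use the same crossover point, each child still has exactly one gene per node, which is the property that guards against duplicated nodes around cp.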
3.3 Mutation Operation
Mutation is another way to introduce diversity into the population. After several generations, it is possible that crossover will drive the genes at a certain position on the chromosomes to the same kind of static or PTL match. When this happens, the population loses diversity at that position and alternative solutions need to be reintroduced. The mutation operator is therefore used to reintroduce alternative solutions, and it also acts as a hill-climbing mechanism.
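A minimal sketch of such a mutation operator is shown below. It assumes that each node has a list of candidate matches from the matching stage and that each gene is replaced by a different candidate with a fixed probability. The function name `mutate`, the `rate` parameter, and the example data are hypothetical.

```python
import random


def mutate(chromosome, matches, rate, rng=random):
    """Return a copy of chromosome with some genes replaced by other matches.

    matches[node] lists the candidate matches available at that node.
    """
    child = dict(chromosome)
    for node, current in chromosome.items():
        alternatives = [m for m in matches[node] if m != current]
        # Nodes with a single candidate match cannot be mutated.
        if alternatives and rng.random() < rate:
            child[node] = rng.choice(alternatives)
    return child


# Hypothetical population member in which every gene has converged to "static".
converged = {"n1": "static", "n2": "static", "n3": "static"}
candidates = {
    "n1": ["static", "PTL"],
    "n2": ["static", "PTL"],
    "n3": ["static"],
}
mutated = mutate(converged, candidates, rate=1.0, rng=random.Random(0))
unchanged = mutate(converged, candidates, rate=0.0, rng=random.Random(0))
```

With `rate=1.0` every node that has an alternative match is switched, and with `rate=0.0` the chromosome is returned unchanged. A practical rate would lie between these extremes.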
4. Experimental Results
4.1. Experimental Setup
The Boolean matching and GA mapping algorithms presented in this paper were implemented in C++ on a Pentium III Linux platform.
Static CMOS, single-rail, and dual-rail mixed PTL/CMOS circuits
were synthesized for all the ISCAS85 benchmark circuits and for additional benchmark circuits from a 64-bit
microprocessor design. All PTL cells were locally created using a
0.1 µm CMOS process. The static cells used in this work were from a commercial cell library
in all three processes. The supply voltage in our experiments was set to
1.1 V for the 0.1 µm CMOS technology. We used the CUDD
package to build BDDs for the matching stage. To determine the benefits of using mixed PTL circuits, we
compared our results to those of static circuits synthesized with a commercial logic synthesis tool from Synopsys in terms of delay, size, and
power consumption.
4.2. Experimental Results
Figure 5 (a) shows that the single-rail and dual-rail mixed PTL styles of the ISCAS85 benchmark circuits
using the proposed method are 73% and 50% better than their static counterparts in power consumption, with performance gains of
5% and 10%, respectively. The area and power overheads of
dual-rail circuits over single-rail circuits were 82% and 84%, respectively. However, the areas of single-rail and dual-rail
circuits using the proposed method were 75% and 55% better than those of their static counterparts, respectively.
Figure 5 (b) shows that the delays of the single-rail and dual-rail mixed PTL styles of the benchmark
circuits from the microprocessor are 11% and 35% better than those of their static counterparts, respectively. Power savings of the
single-rail and dual-rail mixed PTL circuits over their static counterparts were up to
70% and 64%, respectively. The power-delay product (PDP) of single-rail and dual-rail mixed PTL circuits using the proposed
synthesis method is also significantly better than that of their static counterparts.
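As a rough illustration of the PDP comparison, the sketch below combines the ISCAS85 averages quoted from Figure 5 (a). It assumes that "X% better" in power means the circuit consumes (1 - X) of the static power and that a performance gain of Y% means the delay is (1 - Y) of the static delay. Both interpretations, and the helper name `relative_pdp`, are assumptions rather than definitions from the source.

```python
def relative_pdp(power_saving, delay_gain):
    """PDP relative to static CMOS, where static CMOS corresponds to 1.0."""
    return (1.0 - power_saving) * (1.0 - delay_gain)


# ISCAS85 averages from the text: 73% / 5% (single-rail), 50% / 10% (dual-rail).
single_rail_pdp = relative_pdp(0.73, 0.05)
dual_rail_pdp = relative_pdp(0.50, 0.10)
```

Under these assumptions the relative PDP comes to about 0.26 for single-rail and 0.45 for dual-rail circuits. These values are derived only from the averages quoted above and are not results reported in the source.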
5. Conclusions
We presented a single-rail and dual-rail mixed PTL/CMOS synthesis method and compared the results of single-rail and dual-rail
circuits using the proposed synthesis method with their conventional static CMOS counterparts in
0.1 µm CMOS technology. The experimental results confirm that mixed
PTL/Static logic using the proposed synthesis method is a promising alternative to conventional static CMOS circuits for
future high-performance and low-power VLSI designs.
References
[1] S. Yamashita, K. Yano, et al., "Pass-Transistor/CMOS Collaborated Logic: The Best of Both Worlds," in Symposium on VLSI Circuits Digest of Technical Papers, pp. 31–32, 1997.
[2] C. Yang and M. Ciesielski, "Synthesis for Mixed CMOS/PTL Logic: Preliminary Results," in International Workshop on Logic Synthesis, (Lake Tahoe, CA), 1999.
[3] Y. Jiang, S. S. Sapatnekar, and C. Bamji, "Technology Mapping for High Performance Static CMOS and Pass Transistor Logic Designs," IEEE Trans. on Very Large Scale Integration (VLSI) Systems, vol. 9, pp. 577–589, Oct. 2001.
[4] G. R. Cho and T. Chen, "On Mixed PTL/Static Logic for Low-Power and High-Speed Circuits," VLSI Design: An International Journal of Custom-Chip Design, Simulation, and Testing, vol. 12, no. 3, pp. 399–406, 2001.
[5] R. L. Rudell, Logic Synthesis for VLSI Design. Ph.D. dissertation, University of California, Berkeley, CA 94720, 1989.
[6] G. De Micheli, Synthesis and Optimization of Digital Circuits. McGraw-Hill, Inc., 1994.
[7] Y. Kukimoto, R. K. Brayton, and P. Sawkar, "Delay-Optimal Technology Mapping by DAG Covering," in Design Automation Conference, (San Francisco, CA), pp. 348–351, 1998.