Walter Scott, Jr. College of Engineering

Graduate Exam Abstract

Mozammel Hossain
Ph.D. Final
Mar 23, 2017, 2:00 pm - 4:00 pm
Abstract: Recent changes in the technology market demand faster design turnaround, and as a result, designers struggle to meet performance requirements under prohibitively expensive non-recurring engineering (NRE) costs. Increasing costs for design and validation, and longer time to market, are some of the most pressing issues for next-generation microprocessor systems. Custom versus synthesis VLSI circuit design has been a lively debate for the last decade. However, the ever-increasing cost of design in large-scale projects is becoming a bottleneck for the industry. Thus, the wind is blowing strongly in favor of synthesis, especially as modern synthesis engines become more and more sophisticated in closing timing, completing routing, avoiding congestion, and interfacing with all the backend tools. Custom design advocates will argue that the last picosecond and milliwatt cannot be left on the table, but the reality is that they need to get over this mentality in favor of cost, time to market, and the ability to adopt last-minute design changes. As the industry moves into the next generation of microprocessor design, it faces growing complexity in device scaling, supporting algorithms, and time to market. The need for customization to address specific customer needs has never been greater. Building a complex microprocessor system is complicated and time consuming. Thus, circuit designers need to face reality and move toward more automation in the design cycle. While past generations of microprocessors relied more on custom circuit design to meet tight cycle-time targets, more startups and IC companies are moving toward a common synthesizable design methodology, in most cases sacrificing the desired speed in favor of new functionality.
Needless to say, improving the synthesis methodology as an alternative to custom circuit design is gaining momentum in the industry, because custom design most often needs hand-crafted schematics and layout, which are overwhelmingly time consuming and can require as much as 2-4 times the development time of synthesis. However, the key point is to stay within a similar area and power budget and yet cut significant time from the development cycle of the design. Studies show that, with the help of advanced Electronic Design Automation (EDA) tool capabilities, such as advanced algorithms for timing, power, and congestion mitigation, the use of synthesis-based design is increasing by about 30% relative to its predecessor.
With the goal of mitigating the rising cost of designing custom macros and eliminating the limitations of using smaller Random Logic Macros (RLMs), this dissertation works on methodology and algorithm development that will enable the industry to synthesize custom and RLM macros into a bigger macro through Multi-Million Logic Gate Synthesis (MMLGS), to improve Physical Design (PD) resource efficiency. Additionally, the MMLGS methodology most often requires embedded IP or custom components, which may require different power supply rails and/or synchronous-asynchronous (sync-async) clocking interfaces. One popular timing methodology is to share positive slack across the latches to ease the timing burden from one cycle to another, thereby closing timing-challenged and architecturally critical paths. Similarly, multiple power supplies are used in a design either to dynamically control part of the circuit or to statically isolate a portion of the logic into different power islands, in order to control the total power consumption of the chip. In a custom circuit design, designers carefully hand craft gates to support these multi-clock and multi-power domains, satisfying the timing and methodology requirements. However, in a high-speed synthesis design environment, designers struggle to make sure multi-clock and multi-power interfaces are designed, placed, connected, and timed correctly. Identifying and applying the proper timing constraints, such as “no cycle stealing” at sync-async domain interfaces, in synthesis, unit, and chip timing is essential. Even though some of these concepts are available now and were used in the past, their application has been very limited and confined to the custom design methodology.
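The slack sharing and “no cycle stealing” ideas described above can be illustrated with a minimal sketch. It is not the dissertation's algorithm; it assumes a simplified latch pipeline in which each capturing latch opens exactly one clock period after the previous one and may absorb late data up to a fixed borrow limit. The function name `borrowed_time`, its parameters, and the picosecond values are hypothetical and chosen only for illustration.

```python
from typing import FrozenSet, List


def borrowed_time(
    stage_delays_ps: List[int],
    period_ps: int,
    max_borrow_ps: int,
    no_steal: FrozenSet[int] = frozenset(),
) -> List[int]:
    """Return the time each capturing latch borrows from the next cycle.

    Assumed model (illustrative only): stage i ends at latch i, and
    latch i opens one period after latch i-1. Data that arrives after
    the opening edge "steals" time from the following stage, up to
    max_borrow_ps. Latches listed in no_steal (for example, a latch at
    a sync-async interface) are not allowed to borrow at all.
    """
    borrowed: List[int] = []
    carry = 0  # lateness inherited from the previous latch
    for i, delay in enumerate(stage_delays_ps):
        late = max(0, carry + delay - period_ps)
        limit = 0 if i in no_steal else max_borrow_ps
        if late > limit:
            raise ValueError(
                f"stage {i}: needs {late} ps of borrowing, limit is {limit} ps"
            )
        borrowed.append(late)
        carry = late
    return borrowed


# Stage 0 is 200 ps too slow, but stage 1 has 300 ps of positive slack,
# so the pipeline closes timing when borrowing is allowed.
print(borrowed_time([1200, 700, 1000], period_ps=1000, max_borrow_ps=300))
```

In this sketch, marking latch 0 as a no-steal boundary (`no_steal=frozenset({0})`) makes the same pipeline fail, which mirrors why such constraints must be identified and applied explicitly at sync-async interfaces.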
As the synthesis methodology for MMLGS is being developed, work in this dissertation will also include developing algorithms and design methodologies for multi-power and multi-clock domains to take advantage of slack sharing across all hierarchies of the chip design. The developed algorithms and methodology will: 1) improve physical design resource efficiency; 2) improve performance by sharing unused slack efficiently at appropriate design hierarchies; 3) improve logic optimization by collapsing smaller macro boundaries; and 4) enable correct placement and connectivity of logic gates in multi-power domains. Upon successful implementation of the MMLGS design methodology with multi-power and multi-clock domains, these designs need to be functionally verified for different input stimuli at different frequencies. At the same time, the design must comply with all backend checks such as noise, electromigration, and timing, and of course all PD verification tools such as DRC, LVS, and YLD, to demonstrate the electrical and physical feasibility of the design.
Overall, a synthesis-based physical design methodology using soft hierarchy, interior pin placement, pre-placement of critical logic, and post-routing techniques has been developed and proposed in this dissertation. The effectiveness of the proposed design methodology is illustrated using a very timing- and area-challenged unit, the level-2 cache (L2 cache) unit. Additional synthesis-based physical design algorithms and methodologies have been developed for level translators at multi-voltage domains and for slack sharing at sync-async interface paths at all levels of the PD hierarchy. The proposed design methodologies show how slack sharing can be used to ease timing and yet avoid potential metastability in the circuit. The developed methodology can save a significant number of physical design resources, because millions of gates can be packed into a synthesizable macro, even in multi-clock and multi-power domains, instead of designing those macros individually in a high-cost design flow. By adopting the proposed MMLGS methodology along with the multi-voltage and multi-clock approach in the synthesis and timing methodology, development costs can be cut by about 50%, which is a significant PD resource saving in high-cost VLSI circuit design.
Adviser: Prof. Tom Chen
Co-Adviser: N/A
Non-ECE Member: Prof. Yashwant Malaiya
Member 3: Dr. Sudeep Pasricha
Additional Members: Dr. Ali Pezeshki
Publications:
• Mozammel H., John B., Jack D., Tom C., “A Practical Automated Timing and Physical Design Implementation Methodology for the Synchronous Asynchronous Interface and Multi-Voltage Domain in High-Speed Synthesis”, Microprocessors and Microsystems, vol 45, pp. 241-252, August 2016.

• Mozammel H., Chirag D., Tom C., Vikas A., “Synthesis Based Design and Implementation Methodology of High Speed, High Performing Unit: L2 Cache Unit Design”, Integration, the VLSI Journal, vol 49, pp. 125-136, March 2015.

• J. Friedrich, R. Puri, U. Brandt, M. Buehler, J. DiLulo, J. Hopkins, M. Hossain, M. Kazda, J. Keinert, Z. Kurzum, D. Lamb, A. Lee, F. Musante, J. Noak, P. Osler, S. Posluszny, H. Qian, R. Ramji, V. Rao, L. Reddy, H. Ren, H. Ren, T. Rosser, B. Russell, C. Sze, G. Telleze, “Design methodology for the IBM Power7 microprocessor”, IBM Journal of Research and Development, vol 55, issue 3, pp. 294-307, 2011.

• Mozammel H., Eric F., Allen H., Vikas A., “Physical Design and Implementation of POWER8™ (P8) Server Class Processor”, Midwest Symposium on Circuits and Systems (MWSCAS), August 2-5, 2015.

• John B., Mozammel H., John I., “Wire Buffer Design for High Speed Microprocessors”, Austin Conference on Integrated Systems & Circuits (ACISC), May 16-18, 2007.

• Tony S., Ching L., Mozammel H., “A Practical Verification Strategy for ASIC Embedded Memory Behavioral Model”, The Fifth Asia Pacific Conference on Hardware Description Languages (APCHDL), July 8-10, 1998.

Program of Study:
EE 660