Low-power high-performance VLSI design

Description

This cluster of papers focuses on the design and optimization of low-power VLSI circuits, with an emphasis on techniques such as approximate computing, subthreshold operation, and leakage reduction. The papers also address challenges related to process variation, statistical timing analysis, and energy efficiency in CMOS technology at the nanometer scale.

Keywords

Low-Power; VLSI Circuits; Approximate Computing; Subthreshold Operation; Leakage Reduction; Process Variation; Statistical Timing Analysis; Energy Efficiency; CMOS Technology; Nanometer Scale

From the Publisher: This book covers the design of next-generation microprocessors in deep submicron CMOS technologies. The chapters in Design of High-Performance Microprocessor Circuits were written by some of the world's leading technologists, designers, and researchers. All levels of system abstraction are covered, but the emphasis rests squarely on circuit design. Examples are drawn from processors designed at AMD, Digital/Compaq, IBM, Intel, MIPS, Mitsubishi, and Motorola. Each topic of this invaluable reference stands alone so the chapters can be read in any order. The following topics are covered in depth: architectural constraints of CMOS VLSI design; technology scaling, low-power devices, SOI, and process variations; contemporary design styles including a survey of logic families, robust dynamic circuits, asynchronous logic, self-timed pipelines, and fast arithmetic units; latches, clocks and clock distribution, phase-locked and delay-locked loops; register file, cache memory, and embedded DRAM design; high-speed signaling techniques and I/O design; ESD, electromigration, and hot-carrier reliability; CAD tools, including timing verification and the analysis of power distribution schemes; test and testability. Design of High-Performance Microprocessor Circuits assumes a basic knowledge of digital circuit design and device operation, and covers a broad range of circuit styles and VLSI design techniques. Packed with practical know-how, it is an indispensable reference for practicing circuit designers, architects, system designers, CAD tool developers, process technologists, and researchers. It is also an essential text for VLSI design courses.
Motivated by emerging battery-operated applications that demand intensive computation in portable environments, techniques are investigated which reduce power consumption in CMOS digital circuits while maintaining computational throughput. Techniques for low-power operation are shown which use the lowest possible supply voltage coupled with architectural, logic style, circuit, and technology optimizations. An architecturally based scaling strategy is presented which indicates that the optimum voltage is much lower than that determined by other scaling considerations. This optimum is achieved by trading increased silicon area for reduced power consumption.
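The area-for-power trade described in this abstract can be illustrated with a rough sketch. It assumes a textbook first-order delay model, t_d proportional to Vdd / (Vdd - Vt)^2, and dynamic power proportional to C * Vdd^2 * f; the 5 V reference supply, the 0.7 V threshold, and the 15% capacitance overhead are illustrative values chosen here, not figures from the paper.

```python
def gate_delay(vdd, vt=0.7):
    """First-order CMOS gate delay in arbitrary units (assumed model)."""
    return vdd / (vdd - vt) ** 2


def scaled_supply(n_parallel, v_ref=5.0, vt=0.7):
    """Supply voltage at which each gate may be n_parallel times slower
    than at v_ref, found by bisection on the delay model above."""
    target = n_parallel * gate_delay(v_ref, vt)
    lo, hi = vt + 1e-6, v_ref  # delay falls monotonically as vdd rises above vt
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if gate_delay(mid, vt) > target:
            lo = mid  # still too slow: raise the supply
        else:
            hi = mid  # fast enough: try a lower supply
    return 0.5 * (lo + hi)


def relative_power(n_parallel, cap_overhead=1.15, v_ref=5.0, vt=0.7):
    """Power of an n-way parallel datapath relative to the single datapath.
    Capacitance grows by n * cap_overhead while each unit is clocked at f / n,
    so the ratio reduces to cap_overhead * (v_new / v_ref) ** 2."""
    v_new = scaled_supply(n_parallel, v_ref, vt)
    return cap_overhead * (v_new / v_ref) ** 2


if __name__ == "__main__":
    for n in (1, 2, 4):
        print(n, round(scaled_supply(n), 2), round(relative_power(n), 2))
```

Under these assumptions, duplicating the datapath keeps throughput constant while the supply drops, so power falls roughly with the square of the voltage ratio even after the capacitance overhead is paid for.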
The stability of both resistor-load (R-load) and full-CMOS SRAM cells is investigated analytically as well as by simulation. Explicit analytic expressions for the static-noise margin (SNM) as a function of device parameters and supply voltage are derived. The expressions are useful in predicting the effect of parameter changes on the stability as well as in optimizing the design of SRAM cells. An easy-to-use SNM simulation method is presented, the results of which are in good agreement with the results predicted by the analytic SNM expressions. It is further concluded that full-CMOS cells are much more stable than R-load cells at a low supply voltage.
In emerging embedded applications such as wireless sensor networks, the key metric is minimizing energy dissipation rather than processor speed. Minimum energy analysis of CMOS circuits estimates the optimal operating point of clock frequencies, supply voltage, and threshold voltage according to A. Chandrakasan et al. (see ibid., vol. 27, no. 4, pp. 473-84, Apr. 1992). The minimum energy analysis shows that the optimal power supply typically occurs in subthreshold (i.e., at supply voltages that are below device thresholds). New subthreshold logic and memory design methodologies are developed and demonstrated on a fast Fourier transform (FFT) processor. The FFT processor uses an energy-aware architecture that allows for variable FFT length (128-1024 point) and variable bit-precision (8 b and 16 b), and is designed to investigate the estimated minimum energy point. The FFT processor is fabricated using a standard 0.18-µm CMOS logic process and operates down to 180 mV. The minimum energy point for the 16-b 1024-point FFT processor occurs at 350-mV supply voltage, where it dissipates 155 nJ/FFT at a clock frequency of 10 kHz.
Approximate computing has recently emerged as a promising approach to energy-efficient design of digital systems. Approximate computing relies on the ability of many systems and applications to tolerate some loss of quality or optimality in the computed result. By relaxing the need for fully precise or completely deterministic operations, approximate computing techniques allow substantially improved energy efficiency. This paper reviews recent progress in the area, including design of approximate arithmetic blocks, pertinent error and quality measures, and algorithm-level techniques for approximate computing.
Multiple-valued logic, in which the number of discrete logic levels is not confined to two, has been the subject of much research over many years. The practical objective of this work has been to increase the information content of the digital signals in a system to a higher value than that provided by binary operation. In this tutorial/survey paper we will review the historical developments in this field, both in circuit realizations and in methods of handling multiple-valued design data, and consider the present state-of-the-art and future expectations.
This paper presents the many-core architecture, with hundreds to thousands of small cores, to deliver unprecedented compute performance in an affordable power envelope. We discuss fine-grain power management, memory bandwidth, on-die networks, and system resiliency for the many-core system.
The search for simple abstract techniques to be applied to the design of switching systems is still, despite some recent advances, in its early stages. The problem in this area which has been attacked most energetically is that of the synthesis of efficient combinational, that is, nonsequential, logic circuits.
A simple formula is derived for quick calculation of the maximum short-circuit dissipation of static CMOS circuits. A detailed discussion of this short-circuit dissipation is given based on the behavior of the inverter when loaded with different capacitances. It was found that if each inverter of a string is designed in such a way that the input and output rise and fall times are equal, the short-circuit dissipation will be much less than the dynamic dissipation (<20%). This result has been applied to a practical design of a CMOS driving circuit (buffer), which is commonly built up of a string of inverters. An expression has also been derived for a tapering factor between two successive inverters of such a string to minimize parasitic power dissipation. Finally, it is concluded that optimization in terms of power dissipation leads to a better overall performance (in terms of speed, power, and area) than is possible by minimization of the propagation delay.
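For orientation, the form in which this short-circuit result is commonly quoted for an unloaded, symmetric inverter is given below. The abstract itself does not state the formula, so the symbols are assumptions of this note: \beta is the transistor gain factor (taken equal for nMOS and pMOS), V_T the threshold voltage (taken equal in magnitude for both devices), \tau the input rise or fall time, and T the signal period.

```latex
P_{sc} \;=\; \frac{\beta}{12}\,\bigl(V_{DD} - 2V_T\bigr)^{3}\,\frac{\tau}{T}
```

The cubic dependence on V_{DD} - 2V_T shows why short-circuit dissipation vanishes once the supply drops to twice the threshold voltage, and the factor \tau/T shows why keeping input edges as fast as output edges limits it.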
An interconnection pattern of processing elements, the cube-connected cycles (CCC), is introduced which can be used as a general purpose parallel processor. Because its design complies with present technological constraints, the CCC can also be used in the layout of many specialized large scale integrated circuits (VLSI). By combining the principles of parallelism and pipelining, the CCC can emulate the cube-connected machine and the shuffle-exchange network with no significant degradation of performance but with a more compact structure. We describe in detail how to program the CCC for efficiently solving a large class of problems that include fast Fourier transform, sorting, permutations, and derived algorithms.
In this paper we present a new data structure for representing Boolean functions and an associated set of manipulation algorithms. Functions are represented by directed, acyclic graphs in a manner similar to the representations introduced by Lee [1] and Akers [2], but with further restrictions on the ordering of decision variables in the graph. Although a function requires, in the worst case, a graph of size exponential in the number of arguments, many of the functions encountered in typical applications have a more reasonable representation. Our algorithms have time complexity proportional to the sizes of the graphs being operated on, and hence are quite efficient as long as the graphs do not grow too large. We present experimental results from applying these algorithms to problems in logic design verification that demonstrate the practicality of our approach.
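A minimal sketch of an ordered, reduced decision graph of the kind this abstract describes is shown below. It is an illustration, not the paper's implementation: the class name `BDD`, the node layout, and the method names `mk`, `var`, and `apply` are choices made here. Variables are ordered by index, redundant tests are dropped, and identical nodes are shared, so equal functions end up with the same node id.

```python
import operator


class BDD:
    """Tiny reduced ordered decision graph with a fixed variable order."""

    def __init__(self, num_vars):
        # Node ids 0 and 1 are the terminals; they carry a variable index
        # larger than any real variable so that min() never selects them.
        self.nodes = [(num_vars, None, None), (num_vars, None, None)]
        self.unique = {}  # (var, low, high) -> node id, enforces sharing

    def mk(self, var, low, high):
        """Return the node testing `var`, applying both reduction rules."""
        if low == high:  # redundant test: both branches agree
            return low
        key = (var, low, high)
        if key not in self.unique:  # share structurally identical nodes
            self.nodes.append(key)
            self.unique[key] = len(self.nodes) - 1
        return self.unique[key]

    def var(self, i):
        """Graph for the single variable x_i."""
        return self.mk(i, 0, 1)

    def apply(self, op, u, v, memo=None):
        """Combine graphs u and v with a binary Boolean operator."""
        if memo is None:
            memo = {}
        if u < 2 and v < 2:  # both terminals: evaluate directly
            return int(op(bool(u), bool(v)))
        if (u, v) in memo:
            return memo[(u, v)]
        var_u, var_v = self.nodes[u][0], self.nodes[v][0]
        top = min(var_u, var_v)  # earliest variable in the order
        u0, u1 = self.nodes[u][1:] if var_u == top else (u, u)
        v0, v1 = self.nodes[v][1:] if var_v == top else (v, v)
        result = self.mk(top,
                         self.apply(op, u0, v0, memo),
                         self.apply(op, u1, v1, memo))
        memo[(u, v)] = result
        return result


if __name__ == "__main__":
    b = BDD(2)
    x0, x1 = b.var(0), b.var(1)
    nand = b.apply(operator.xor, b.apply(operator.and_, x0, x1), 1)
    de_morgan = b.apply(operator.or_,
                        b.apply(operator.xor, x0, 1),
                        b.apply(operator.xor, x1, 1))
    print(nand == de_morgan)  # equal functions share one node id
```

Because the representation is canonical for a given variable order, checking that two circuits compute the same function reduces to comparing two integers, which is what makes the structure attractive for logic design verification.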
With the advent of portable and high-density microelectronic devices, the power dissipation of very large scale integrated (VLSI) circuits is becoming a critical concern. Accurate and efficient power estimation during the design phase is required in order to meet the power specifications without a costly redesign process. In this paper, we present a review of the power estimation techniques that have recently been proposed.
An approach is presented for minimizing power consumption for digital systems implemented in CMOS which involves optimization at all levels of the design. This optimization includes the technology used to implement the digital circuits, the circuit style and topology, the architecture for implementing the circuits and, at the highest level, the algorithms that are being implemented. The most important technology consideration is the threshold voltage and its control, which allows the reduction of supply voltage without significant impact on logic speed. Even further supply reductions can be made by the use of an architecture-based voltage scaling strategy, which uses parallelism and pipelining to trade off silicon area for power reduction. Since energy is only consumed when capacitance is being switched, power can be reduced by minimizing this capacitance through operation reduction, choice of number representation, exploitation of signal correlations, resynchronization to minimize glitching, logic design, circuit design, and physical design. The low-power techniques that are presented have been applied to the design of a chipset for a portable multimedia terminal that supports pen input, speech I/O, and full-motion video. The entire chipset that performs protocol conversion, synchronization, error correction, packetization, buffering, video decompression, and D/A conversion operates from a 1.1 V supply and consumes less than 5 mW.
A closed-form formula for a waveform of the RC interconnection line with practical boundary conditions is derived. Expressions are also derived for the voltage slope and transition time of the RC interconnection and for coupling capacitance and crosstalk voltage height, which can be used in VLSI designs. Using the expressions, the optimum linewidth that minimizes RC delay and the trend of RC delay in the scaled-down VLSIs are discussed.
In this paper, we propose a set of rules for consistent estimation of the real performance and power features of the flip-flop and master-slave latch structures. A new simulation and optimization approach is presented, targeting both high-performance and power budget issues. The analysis approach reveals the sources of performance and power-consumption bottlenecks in different design styles. Certain misleading parameters have been properly modified and weighted to reflect the real properties of the compared structures. Furthermore, the results of the comparison of representative master-slave latches and flip-flops illustrate the advantages of our approach and the suitability of different design styles for high-performance and low-power applications.
Off-state leakage is static power, current that leaks through transistors even when they are turned off. The other source of power dissipation in today's microprocessors, dynamic power, arises from the repeated capacitance charge and discharge on the output of the hundreds of millions of gates in today's chips. Until recently, only dynamic power has been a significant source of power consumption, and Moore's law helped control it. However, power consumption has now become a primary microprocessor design constraint; one that researchers in both industry and academia will struggle to overcome in the next few years. Microprocessor design has traditionally focused on dynamic power consumption as a limiting factor in system integration. As feature sizes shrink below 0.1 micron, static power is posing new low-power design challenges.
An improved voltage multiplier technique has been developed for generating +40 V internally in p-channel MNOS integrated circuits to enable them to be operated from standard +5- and -12-V supply rails. With this technique, the multiplication efficiency and current driving capability are both independent of the number of multiplier stages. A mathematical model and simple equivalent circuit have been developed for the multiplier and the predicted performance agrees well with measured results. A multiplier has already been incorporated into a TTL compatible nonvolatile quad-latch, in which it occupies a chip area of 600 µm × 240 µm. It is operated with a clock frequency of 1 MHz and can supply a maximum load current of about 10 µA. The output impedance is 3.2 MΩ.
1-V power supply high-speed low-power digital circuit technology with 0.5-µm multithreshold-voltage CMOS (MTCMOS) is proposed. This technology features both low-threshold voltage and high-threshold voltage MOSFETs in a single LSI. The low-threshold voltage MOSFETs enhance speed performance at a low supply voltage of 1 V or less, while the high-threshold voltage MOSFETs suppress the stand-by leakage current during the sleep period. This technology has brought about logic gate characteristics of a 1.7-ns propagation delay time and 0.3-µW/MHz/gate power dissipation with a standard load. In addition, an MTCMOS standard cell library has been developed so that conventional CAD tools can be used to lay out low-voltage LSIs. To demonstrate MTCMOS's effectiveness, a PLL LSI based on standard cells was designed as a carrying vehicle. 18-MHz operation at 1 V was achieved using a 0.5-µm CMOS process.
Portable, embedded systems place ever-increasing demands on high-performance, low-power microprocessor design. Dynamic voltage and frequency scaling (DVFS) is a well-known technique to reduce energy in digital systems, but the effectiveness of DVFS is hampered by slow voltage transitions that occur on the order of tens of microseconds. In addition, the recent trend towards chip-multiprocessors (CMP) executing multi-threaded workloads with heterogeneous behavior motivates the need for per-core DVFS control mechanisms. Voltage regulators that are integrated onto the same chip as the microprocessor core provide the benefit of both nanosecond-scale voltage switching and per-core voltage control. We show that these characteristics provide significant energy-saving opportunities compared to traditional off-chip regulators. However, the implementation of on-chip regulators presents many challenges including regulator efficiency and output voltage transient characteristics, which are significantly impacted by the system-level application of the regulator. In this paper, we describe and model these costs, and perform a comprehensive analysis of a CMP system with on-chip integrated regulators. We conclude that on-chip regulators can significantly improve DVFS effectiveness and lead to overall system energy savings in a CMP, but architects must carefully account for overheads and costs when designing next-generation DVFS systems and algorithms.
This paper describes an algorithm for generating provably passive reduced-order N-port models for RLC interconnect circuits. It is demonstrated that, in addition to macromodel stability, macromodel passivity is needed to guarantee the overall circuit stability once the active and passive driver/load models are connected. The approach proposed here, PRIMA, is a general method for obtaining passive reduced-order macromodels for linear RLC systems. In this paper, PRIMA is demonstrated in terms of a simple implementation which extends the block Arnoldi technique to include guaranteed passivity while providing superior accuracy. While the same passivity extension is not possible for MPVL, comparable accuracy in the frequency domain for all examples is observed.
As technology scales, variability in transistor performance continues to increase, making transistors less and less reliable. This creates several challenges in building reliable systems, from the unpredictability of delay to increasing leakage current. Finding solutions to these challenges requires a concerted effort on the part of all the players in a system design. This article discusses these effects and proposes microarchitecture, circuit, and testing research that focuses on designing with many unreliable components (transistors) to yield reliable system designs.
In this paper, we introduce PVL, an algorithm for computing the Padé approximation of Laplace-domain transfer functions of large linear networks via a Lanczos process. The PVL algorithm has significantly superior numerical stability, while retaining the same efficiency as algorithms that compute the Padé approximation directly through moment matching, such as AWE and its derivatives. As a consequence, it produces more accurate and higher-order approximations, and it renders unnecessary many of the heuristics that AWE and its derivatives had to employ. The algorithm also computes an error bound that makes it possible to identify the true poles and zeros of the original network. We present results of numerical experiments with the PVL algorithm for several large examples.
A microprocessor system is presented in which the supply voltage and clock frequency can be dynamically varied so that the system can deliver high throughput when required while significantly extending battery life during the low speed periods. The system consists of a dc-dc switching regulator, an ARM V4 microprocessor with a 16-kB cache, a bank of 64-kB SRAM ICs, and an I/O interface IC. The four custom chips were fabricated in a standard 0.6-µm 3-metal CMOS process. The system can dynamically vary the supply voltage from 1.2 to 3.8 V in less than 70 µs. This provides a throughput range of 6-85 MIPS with an energy consumption of 0.54-5.6 mW/MIP yielding an effective energy efficiency as high as 26200 MIPS/W.
Recently reported logic style comparisons based on full-adder circuits claimed complementary pass-transistor logic (CPL) to be much more power-efficient than complementary CMOS. However, new comparisons performed on more efficient CMOS circuit realizations and a wider range of different logic cells, as well as the use of realistic circuit arrangements, demonstrate CMOS to be superior to CPL in most cases with respect to speed, area, power dissipation, and power-delay products. An implemented 32-b adder using complementary CMOS has a power-delay product of less than half that of the CPL version. Robustness with respect to voltage scaling and transistor sizing, as well as generality and ease-of-use, are additional advantages of CMOS logic gates, especially when cell-based design and logic synthesis are targeted. This paper shows that complementary CMOS is the logic style of choice for the implementation of arbitrary combinational circuits if low voltage, low power, and small power-delay products are of concern.
This paper investigates the effect of lowering the supply and threshold voltages on the energy efficiency of CMOS circuits. Using a first-order model of the energy and delay of a CMOS circuit, we show that lowering the supply and threshold voltage is generally advantageous, especially when the transistors are velocity saturated and the nodes have a high activity factor. In fact, for modern submicron technologies, this simple analysis suggests optimal energy efficiency at supply voltages under 0.5 V. Other process and circuit parameters have almost no effect on this optimal operating point. If there is some uncertainty in the value of the threshold or supply voltage, however, the power advantage of this very low voltage operation diminishes. Therefore, unless active feedback is used to control the uncertainty, in the future the supply and threshold voltage will not decrease drastically, but rather will continue to scale down to maintain constant electric fields.
This paper describes an analytical model for the access and cycle times of on-chip direct-mapped and set-associative caches. The inputs to the model are the cache size, block size, and associativity, as well as array organization and process parameters. The model gives estimates that are within 6% of Hspice results for the circuits we have chosen. This model extends previous models and fixes many of their major shortcomings. New features include models for the tag array, comparator, and multiplexor drivers, nonstep stage input slopes, rectangular stacking of memory subarrays, a transistor-level decoder model, column-multiplexed bitlines controlled by an additional array organizational parameter, load-dependent size transistors for wordline drivers, and output of cycle times as well as access times. Software implementing the model is available via ftp.
Bidirectional adaptive body bias (ABB) is used to compensate for die-to-die parameter variations by applying an optimum pMOS and nMOS body bias voltage to each die which maximizes the die frequency subject to a power constraint. Measurements on a 150 nm CMOS test chip which incorporates on-chip ABB show that ABB reduces variation in die frequency by a factor of seven, while improving the die acceptance rate. An enhancement of this technique that compensates for within-die parameter variations as well increases the number of dies accepted in the highest frequency bin. ABB is therefore shown to provide bin split improvement in the presence of increasing process parameter variations.
In MOS integrated circuits, signals may propagate between stages with fanout. The exact calculation of signal delay through such networks is difficult. However, upper and lower bounds for delay that are computationally simple are presented in this paper. The results can be used 1) to bound the delay, given the signal threshold, or 2) to bound the signal voltage, given a delay time, or 3) to certify that a circuit is "fast enough," given both the maximum delay and the voltage threshold.
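The bound expressions themselves are not given in this abstract and are not reproduced here. As a related illustration, the sketch below computes the first-moment (Elmore) delay of an RC tree, a computationally simple quantity of the kind such bounds are usually built from. The function name `elmore_delays` and the array-based tree encoding are assumptions made for this example.

```python
def elmore_delays(parent, r, c):
    """Elmore delay at every node of an RC tree.

    parent[i] is the index of node i's parent; node 0 is the root and has
    parent -1. Nodes must be numbered so that parent[i] < i.
    r[i] is the resistance between node i and its parent (for node 0, the
    driver resistance). c[i] is the capacitance from node i to ground.
    """
    n = len(parent)
    # Total capacitance at or below each node, accumulated leaves-first.
    downstream = list(c)
    for i in range(n - 1, 0, -1):
        downstream[parent[i]] += downstream[i]
    # Each resistor charges everything downstream of it; delays add
    # along the path from the driver to the node.
    delay = [0.0] * n
    delay[0] = r[0] * downstream[0]
    for i in range(1, n):
        delay[i] = delay[parent[i]] + r[i] * downstream[i]
    return delay


if __name__ == "__main__":
    # Driver feeding a stem (node 0) that fans out to two branches.
    print(elmore_delays(parent=[-1, 0, 0], r=[1.0, 2.0, 3.0], c=[1.0, 1.0, 1.0]))
```

Both passes are linear in the number of nodes, which is why this estimate is practical for checking fanout networks that would be expensive to simulate exactly.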
In this paper we investigate possible ways to improve the energy efficiency of a general purpose microprocessor. We show that the energy of a processor depends on its performance, so we chose the energy-delay product to compare different processors. To improve the energy-delay product we explore methods of reducing energy consumption that do not lead to performance loss (i.e. wasted energy), and explore methods to reduce delay by exploiting instruction level parallelism. We found that careful design reduced the energy dissipation by almost 25%. Pipelining can give approximately a 2× improvement in energy-delay product. Superscalar issue, however, does not improve the energy-delay product any further since the overhead required offsets the gains in performance. Further improvements will be hard to come by since a large fraction of the energy (50-80%) is dissipated in the clock network and the on-chip memories. Thus, the efficiency of processors will depend more on the technology being used and the algorithm chosen by the programmer than the micro-architecture.
The demand for low-voltage, low drop-out (LDO) regulators is increasing because of the growing demand for portable electronics, e.g., cellular phones, pagers, and laptops. LDOs are used in conjunction with dc-dc converters as well as standalone parts. In power supply systems, they are typically cascaded onto switching regulators to suppress noise and provide a low noise output. The need for low voltage is innate to portable low power devices and corroborated by lower breakdown voltages resulting from reductions in feature size. Low quiescent current in a battery-operated system is an intrinsic performance parameter because it partially determines battery life. This paper discusses some techniques that enable the practical realizations of low quiescent current LDOs at low voltages and in existing technologies. The proposed circuit exploits the frequency response dependence on load-current to minimize quiescent current flow. Moreover, the output current capabilities of MOS power transistors are enhanced and drop-out voltages are decreased for a given device size. Other applications, like dc-dc converters, can also reap the benefits of these enhanced MOS devices. An LDO prototype incorporating the aforementioned techniques was fabricated. The circuit was operable down to input voltages of 1 V with a zero-load quiescent current flow of 23 µA. Moreover, the regulator provided 18 and 50 mA of output current at input voltages of 1 and 1.2 V, respectively.
Very high-speed computers may be classified as follows: 1) Single Instruction Stream-Single Data Stream (SISD) 2) Single Instruction Stream-Multiple Data Stream (SIMD) 3) Multiple Instruction Stream-Single Data Stream (MISD) 4) Multiple Instruction Stream-Multiple Data Stream (MIMD). "Stream," as used here, refers to the sequence of data or instructions as seen by the machine during the execution of a program. The constituents of a system: storage, execution, and instruction handling (branching) are discussed with regard to recent developments and/or systems limitations. The constituents are discussed in terms of concurrent SISD systems (CDC 6600 series and, in particular, IBM Model 90 series), since multiple stream organizations usually do not require any more elaborate components. Representative organizations are selected from each class and the arrangement of the constituents is shown.
Variability in digital integrated circuits makes timing verification an extremely challenging task. In this paper, a canonical first order delay model is proposed that takes into account both correlated and independent randomness. A novel linear-time block-based statistical timing algorithm is employed to propagate timing quantities like arrival times and required arrival times through the timing graph in this canonical form. At the end of the statistical timing, the sensitivities of all timing quantities to each of the sources of variation are available. Excessive sensitivities can then be targeted by manual or automatic optimization methods to improve the robustness of the design. This paper also reports the first incremental statistical timer in the literature which is suitable for use in the inner loop of physical synthesis or other optimization programs. The third novel contribution of this paper is the computation of local and global criticality probabilities. For a very small cost in CPU time, the probability of each edge or node of the timing graph being critical is computed. Numerical results are presented on industrial ASIC chips with over two million logic gates.
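The canonical first-order form described above can be illustrated with a small sketch. It assumes each timing quantity is written as a mean, a tuple of sensitivities to shared (correlated) unit-variance sources, and one coefficient for an independent unit-variance random part. The class name `CanonicalDelay` and its fields are hypothetical names chosen for illustration, not the paper's notation or code, and only the sum operation is shown (the statistical max needed at merging paths is omitted).

```python
from dataclasses import dataclass
from math import sqrt


@dataclass(frozen=True)
class CanonicalDelay:
    """Hypothetical first-order delay: mean + sum(sens[i] * X_i) + rand * R.

    X_i are shared unit-variance sources; R is an independent unit-variance term.
    """
    mean: float
    sens: tuple
    rand: float

    def __add__(self, other):
        if len(self.sens) != len(other.sens):
            raise ValueError("both delays must use the same set of shared sources")
        return CanonicalDelay(
            self.mean + other.mean,
            # Sensitivities to shared sources add linearly.
            tuple(a + b for a, b in zip(self.sens, other.sens)),
            # Independent parts combine as a root-sum-square.
            sqrt(self.rand ** 2 + other.rand ** 2),
        )

    def variance(self):
        return sum(a * a for a in self.sens) + self.rand ** 2

    def covariance(self, other):
        # Only the shared sources contribute to correlation between two delays.
        return sum(a * b for a, b in zip(self.sens, other.sens))


gate = CanonicalDelay(10.0, (1.0, 0.5), 3.0)
wire = CanonicalDelay(20.0, (2.0, -0.5), 4.0)
arrival = gate + wire
```

Because the result of the sum is again in the same form, the sensitivities of `arrival` to each source remain directly readable after propagation, which is the property the abstract relies on.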
Asymptotic waveform evaluation (AWE) provides a generalized approach to linear RLC circuit response approximations. The RLC interconnect model may contain floating capacitors, grounded resistors, inductors, and even linear controlled sources. The transient portion of the response is approximated by matching the initial boundary conditions and the first 2q-1 moments of the exact response to a lower-order q-pole model. For the case of an RC tree model, a first-order AWE approximation reduces to the RC tree methods.
Technology trends and especially portable applications drive the quest for low-power VLSI design. Solutions that involve algorithmic, structural or physical transformations are sought. The focus is on developing low-power circuits without unduly affecting performance (area, latency, period). For CMOS circuits most power is dissipated as dynamic power for charging and discharging node capacitances. This is why many promising results in low-power design are obtained by minimizing the number of transitions inside the CMOS circuit. While it is generally accepted that, because of the large capacitances involved, much of the power dissipated by an IC is at the I/O, little has been specifically done to decrease the I/O power dissipation. We propose the bus-invert method of coding the I/O, which lowers the bus activity and thus decreases the I/O peak power dissipation by 50% and the I/O average power dissipation by up to 25%. The method is general but applies best to buses. This is fortunate because buses are indeed most likely to have very large capacitances associated with them and consequently dissipate a lot of power.
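A minimal sketch of the bus-invert idea follows, under simplifying assumptions: the bus starts at all zeros, one extra invert line accompanies the data lines, and the decision counts transitions on the data lines only (the toggle of the invert line itself is ignored here). The function names are hypothetical and chosen for illustration.

```python
def popcount(x):
    return bin(x).count("1")


def bus_invert_encode(words, width=8):
    """Return (bus_value, invert_flag) pairs for a sequence of data words."""
    mask = (1 << width) - 1
    prev_bus = 0  # assumed initial bus state
    encoded = []
    for word in words:
        word &= mask
        # Number of data lines that would toggle if the word were sent as-is.
        distance = popcount(prev_bus ^ word)
        if distance > width // 2:
            bus, invert = (~word) & mask, 1  # send the complement instead
        else:
            bus, invert = word, 0
        encoded.append((bus, invert))
        prev_bus = bus
    return encoded


def bus_invert_decode(encoded, width=8):
    mask = (1 << width) - 1
    return [(bus ^ mask) if invert else bus for bus, invert in encoded]


def data_line_transitions(encoded):
    """Count toggling data lines per transfer, starting from an all-zero bus."""
    prev, counts = 0, []
    for bus, _ in encoded:
        counts.append(popcount(prev ^ bus))
        prev = bus
    return counts


words = [0x00, 0xFF, 0x0F, 0xF0, 0xAA, 0x55]
encoded = bus_invert_encode(words)
```

With this rule no transfer toggles more than half of the data lines, which is where the halving of peak I/O switching activity comes from.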
Reductions in CMOS SRAM cell static noise margin (SNM) due to intrinsic threshold voltage fluctuations in uniformly doped minimum-geometry cell MOSFETs are investigated for the first time using compact physical and stochastic models. Six sigma deviations in SNM due to intrinsic fluctuations alone are projected to exceed the nominal SNM for sub-100-nm CMOS technology generations. These large deviations pose severe barriers to scaling of supply voltage, channel length, and transistor count for conventional 6T SRAM-dominated CMOS ASICs and microprocessors.
Adiabatic switching is an approach to low-power digital circuits that differs fundamentally from other practical low-power techniques. When adiabatic switching is used, the signal energies stored on circuit capacitances may be recycled instead of dissipated as heat. We describe the fundamental adiabatic amplifier circuit and analyze its performance. The dissipation of the adiabatic amplifier is compared to that of conventional switching circuits, both for the case of a fixed voltage swing and the case when the voltage swing can be scaled to reduce power dissipation. We show how combinational and sequential adiabatic-switching logic circuits may be constructed and describe the timing restrictions required for adiabatic operation. Small chip-building experiments have been performed to validate the techniques and to analyse the associated circuit overhead.
Traditional adaptive methods that compensate for PVT variations need safety margins and cannot respond to rapid environmental changes. In this paper, we present a design (RazorII) which implements a flip-flop with in situ detection and architectural correction of variation-induced delay errors. Error detection is based on flagging spurious transitions in the state-holding latch node. The RazorII flip-flop naturally detects logic and register SER. We implement a 64-bit processor in 0.13 µm technology which uses RazorII for SER tolerance and dynamic supply adaptation. RazorII based DVS allows elimination of safety margins and operation at the point of first failure of the processor. We tested and measured 32 different dies and obtained 33% energy savings over traditional DVS using RazorII for supply voltage control. We demonstrate SER tolerance on the RazorII processor through radiation experiments.
Approximate computing trades off computation quality with effort expended, and as rising performance demands confront plateauing resource budgets, approximate computing has become not merely attractive, but even imperative. In this article, we present a survey of techniques for approximate computing (AC). We discuss strategies for finding approximable program portions and monitoring output quality, techniques for using AC in different processing units (e.g., CPU, GPU, and FPGA), processor components, memory technologies, and so forth, as well as programming frameworks for AC. We classify these techniques based on several key characteristics to emphasize their similarities and differences. The aim of this article is to give researchers insight into the working of AC techniques and to inspire more efforts in this area to make AC the mainstream computing approach in future systems.
Consider a weighted undirected graph and its corresponding Laplacian matrix, possibly augmented with additional diagonal elements corresponding to self-loops. The Kron reduction of this graph is again a graph whose Laplacian matrix is obtained by the Schur complement of the original Laplacian matrix with respect to a specified subset of nodes. The Kron reduction process is ubiquitous in classic circuit theory and in related disciplines such as electrical impedance tomography, smart grid monitoring, transient stability assessment, and analysis of power electronics. Kron reduction is also relevant in other physical domains, in computational applications, and in the reduction of Markov chains. Related concepts have also been studied as purely theoretic problems in the literature on linear algebra. In this paper we analyze the Kron reduction process from the viewpoint of algebraic graph theory. Specifically, we provide a comprehensive and detailed graph-theoretic analysis of Kron reduction encompassing topological, algebraic, spectral, resistive, and sensitivity analyses. Throughout our theoretic elaborations we especially emphasize the practical applicability of our results to various problem setups arising in engineering, computation, and linear algebra. Our analysis of Kron reduction leads to novel insights both on the mathematical and the physical side.
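As a hedged illustration of the Schur complement described above, the sketch below eliminates interior nodes one at a time, which is equivalent to taking the Schur complement over the whole eliminated set. It uses plain Python lists; `kron_eliminate` and `kron_reduce` are hypothetical helper names, and the example graph (three nodes in a path with unit edge weights) is an assumption chosen for illustration.

```python
def kron_eliminate(laplacian, k):
    """Eliminate node k from a (possibly loopy) Laplacian via a rank-one update."""
    pivot = laplacian[k][k]
    if pivot == 0:
        raise ValueError("cannot eliminate a node with zero diagonal entry")
    keep = [i for i in range(len(laplacian)) if i != k]
    return [
        [laplacian[i][j] - laplacian[i][k] * laplacian[k][j] / pivot for j in keep]
        for i in keep
    ]


def kron_reduce(laplacian, eliminate):
    """Kron-reduce by removing the listed node indices one at a time."""
    # Removing the highest index first keeps the remaining indices valid.
    for k in sorted(set(eliminate), reverse=True):
        laplacian = kron_eliminate(laplacian, k)
    return laplacian


# Path graph 0 - 1 - 2 with unit edge weights (conductances).
path = [
    [1.0, -1.0, 0.0],
    [-1.0, 2.0, -1.0],
    [0.0, -1.0, 1.0],
]
reduced = kron_reduce(path, [1])
```

Eliminating the middle node leaves a single edge of weight 0.5 between the two end nodes, matching the series combination of two unit conductances in circuit terms.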
Approximate computing represents a computational paradigm that trades off a slight reduction in accuracy for significant performance improvements. One of the fundamental operations that can leverage approximate techniques is multiplication, which is used substantially in applications like image/video processing and machine learning. This work proposes an approximate 8-bit multiplier design for FPGA-based circuits. This multiplier, by exploiting the FPGA primitives, demonstrates excellent performance regarding error metrics, critical path delay, and power dissipation with minimal LUT utilization. More precisely, the proposed design reduces LUT usage by 43% and PDP by 59% compared to the exact multiplier while incurring a mean error distance of only 102.57. The proposed approximate multiplier is used in two image processing applications to assess the actual advantages in real-world applications. The proposed design achieves a reasonable PSNR in the image processing flow, demonstrating high-quality results with a low error rate.
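The mean error distance quoted above is the average absolute difference between exact and approximate products over all input pairs. The sketch below shows how such a figure can be computed exhaustively for a small multiplier. The approximation used here (clearing low operand bits) is a hypothetical stand-in for illustration only; it is not the design proposed in the abstract and will not reproduce its numbers.

```python
def truncated_multiply(a, b, drop_bits=2):
    """Hypothetical approximate multiplier: clear the low bits of each operand."""
    mask = ~((1 << drop_bits) - 1)
    return (a & mask) * (b & mask)


def mean_error_distance(approx, width=8):
    """Average |exact - approximate| over every pair of width-bit operands."""
    n = 1 << width
    total = 0
    for a in range(n):
        for b in range(n):
            total += abs(a * b - approx(a, b))
    return total / (n * n)


med_8bit = mean_error_distance(truncated_multiply, width=8)
```

For an 8-bit multiplier the exhaustive sweep covers 65,536 operand pairs, so no sampling is needed; wider operands would typically call for random sampling instead.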
As modern SRAM architectures face increasing demands for high performance and energy efficiency, the proposed 12T SRAM cell offers substantial improvements over conventional designs. This architecture integrates a dynamic word line boosting technique during write operations, significantly enhancing write speed while reducing energy consumption. A data dependent supply voltage scheme further optimizes power efficiency by adjusting the supply voltage based on stored data. To minimize static power and leakage, the design employs several low-power techniques, including high-threshold voltage (HVT) transistors, transistor stacking, and ground gating. A read buffer is incorporated to improve read stability without the overhead of an additional bit line. Simulation results demonstrate that the static power consumption of the considered SRAM cells (11T, 10T, 12T_SRL, 12T_ST, 8T, and 6T) is 1.27×, 3.38×, 1.87×, 2.96×, 5.81×, and 5.67× higher, respectively, than that of the proposed 12T design. Furthermore, the Write Static Noise Margin (WSNM) of these cells is found to be 1.53×, 1.57×, 2.23×, 2.23×, 1.96×, and 1.96× lower, respectively. Additionally, we propose a 12T cell with a boosted word line. This cell achieves the highest WSNM and the lowest read and write energy consumption among all considered cells, together with the least static power dissipation, making it highly suitable for low-power, high-reliability memory applications.
The energies absorbed by ideal switches in general capacitor‐switch circuits, both without and with topological degeneracy, are investigated through the so‐called asymptotic approach. The core of this approach (already used by the author to address the well‐known two‐capacitor paradox) is the embedding of each ideal switch into a family of non‐ideal switches with finite transition time, which converge to the ideal one when this parameter tends to zero. The conclusion of the investigation, performed through extensive use of graph theory and matrix calculus, is that the energy calculation is feasible, provided that each ideal switch is accompanied by information on its half‐rise resistance. This result is the main contribution and novelty of the paper. To illustrate the method, two examples are treated in detail: a series circuit with three capacitors and one switch, and a ladder circuit with three capacitors and two switches (without and with an additional bridge capacitor).
I.D. Jitaru | REVUE ROUMAINE DES SCIENCES TECHNIQUES — SÉRIE ÉLECTROTECHNIQUE ET ÉNERGÉTIQUE
In this paper, a novel solution is presented to enable zero-voltage switching (ZVS) under any operating condition, utilizing transformers with extremely low leakage inductance. The proposed methods ensure that the primary switches turn on at zero voltage while the secondary rectifiers turn off at zero current. While numerous ZVS techniques for single-ended forward topologies have been proposed over the years, most rely on transformer leakage inductance to delay the current flow to the secondary. Although these approaches achieve ZVS for the primary switches, they fail to achieve zero-current switching (ZCS) in the secondary rectifiers. Additionally, a larger leakage inductance reduces the effective duty cycle, often necessitating a turns ratio adjustment to maintain regulation at the minimum input voltage. This adjustment increases both the primary RMS current and the voltage stress in the secondary. In contrast, this paper presents a ZVS solution specifically designed for forward converters with low leakage inductance, making it particularly suitable for wide input voltage ranges, high-current applications, and high-frequency operation.
This paper presents the design and comprehensive analysis of a low-power 2×2 SRAM array using CMOS 90 nm technology. The proposed system targets energy constrained applications such as mobile and embedded systems. The implemented SRAM architecture incorporates a 10T cell with stacking, optimized word line and bit line schemes, and efficient row and column decoders. Robust read and write functionalities are ensured through the integration of precharge circuits, a differential sense amplifier, and write drivers. To minimize power dissipation, transistor stacking techniques are strategically employed within the memory array. The performance of the designed SRAM is rigorously evaluated through simulations using Cadence Virtuoso and Spectre, focusing on critical parameters including hold stability, read stability, write margin, and power consumption. The simulation results demonstrate significant enhancements in power efficiency, read access time, and noise immunity, highlighting the suitability of this SRAM design for energy-efficient systems.
Approximate circuits have become ubiquitous in error-resilient applications. These circuits provide large reductions in area, power, and delay at the cost of erroneous computations. The error-resilient applications produce acceptable output quality, even after the introduction of erroneous computations. However, we observed that the error resilience of an application varies widely with respect to the applied inputs. Since prior works have mostly focused on using samples from a uniform distribution while designing the approximate circuits, they are unable to exploit input aware properties to design optimal circuits. Hence, in this work, we bridge this gap and propose Formally Verified Library of Input Data Aware Approximate Circuits (FV-LIDAC). FV-LIDAC is the first formally verified library of input distribution aware approximate arithmetic circuits. We use three of the most widely occurring distributions, namely uniform, normal, and exponential distributions, to show that optimal design sets are heavily dependent on the input data. FV-LIDAC chooses the best designs among millions of functional approximated adder and multiplier circuits, depending upon the inputs. Since there are no existing input-aware approximate circuit libraries, we compared FV-LIDAC against state-of-the-art input-unaware EvoApproxLib, to further highlight the need for FV-LIDAC. Additionally, we perform case studies on real-world applications to further highlight the improvement over state-of-the-art. We aim to make the Pareto-optimal designs available as open source to stimulate further research.
Bathula Nagarjuna | International Journal for Research in Applied Science and Engineering Technology
This work presents the design and optimization of a RISC-based Arithmetic Logic Unit (ALU), a critical component in modern computing architectures that require efficient computation with minimal power consumption and reduced silicon area. Through a systematic approach that incorporates architectural modeling, RTL implementation, and synthesis using Cadence Genus, the study emphasizes the transformation of high-level designs into optimized silicon realizations capable of meeting stringent performance, area, and power constraints. The design leverages the principles of Reduced Instruction Set Computing (RISC), focusing on a simplified instruction set that allows for faster execution cycles and enhanced scalability. The ALU architecture is rigorously verified through extensive simulations to ensure functional correctness before synthesis, showcasing improvements in logic delays and a significant reduction in hardware footprint. The results demonstrate a well-balanced design that not only achieves high performance but also adheres to the efficiency demands of embedded systems and application-specific integrated circuits (ASICs). Additionally, the project highlights the critical role of synthesis-driven methodologies in facilitating the optimization of digital designs and lays the groundwork for future enhancements, including adaptive optimization strategies and the exploration of emerging technologies. This research contributes to the ongoing evolution of processor design by delivering a robust ALU framework tailored for high-performance applications, positioning it as an asset for both academic and industrial implementation in the ever-expanding landscape of digital technology.
The paper presents a novel architecture for an approximate adder and a novel generic error analysis method. The proposed architecture judiciously makes use of time-axis-based parallelism of components to improve delay while simultaneously improving error parameters through an adaptive increase in the group size of the carry generating blocks. Synthesis of architectures of the same bit length has shown area, power, and delay improvements of 4.91%, 5.59%, and 14.92%, respectively, over state-of-the-art architectures based on truncation of the carry chain when synthesized under the same constraints and operating conditions. In comparison to ETA-I, ETA-II and GeAr, the proposed method has shown improvement in delay by 9%, 17.9% and 21.3%, respectively. Error analysis is done for the proposed adder using the random probabilistic method and the generic analysis method. The error parameters obtained with the generic error analysis method are in close agreement with values found by applying binary random numbers, with a very large sample size, for all adder architectures with different size, group size and window size. The error rates calculated by the generic analysis method and the random probabilistic method differ by a minimum of 0.49% and a maximum of 13.85%.
Static Random-Access Memory (SRAM) used in cache memories faces significant power challenges due to increased leakage power. To minimize the overall power dissipation of memory units, SRAM cells should be designed to consume less power. The design of a low-power SRAM cell is proposed, utilizing the multi-threshold CMOS (MTCMOS) technique. The work builds upon, identifying a gap in the literature related to efficient power management in SRAM cells. The methodology employed in this work involves integrating the MTCMOS technique into the SRAM cells. Power gating is implemented by adding control signals and transistors to selectively activate or deactivate the power supply to the SRAM cells during idle states. The proposed cell is implemented using the Cadence Virtuoso tool, utilizing the GPDK 45 nm technology library. The simulation results were analyzed using Cadence Spectre. The power analysis performed with Spectre shows a significant power reduction of 78.54% and 28.86% for the MTCMOS-based 10T SRAM cells when compared to 6T SRAM and existing MTCMOS-based 8T SRAM cells, respectively. Power consumption during sleep time is effectively minimized, improving power efficiency and stability. The results are quantified and analyzed, highlighting the performance benefits of the proposed SRAM cell for low-power designs.
Majid Amini‐Valashani , Sattar Mirzakuchaki | Iranian Journal of Science and Technology Transactions of Electrical Engineering
This Special Issue presents research focused on the development of Ultra-Low-Power (ULP) Integrated Circuits (ICs) designed to operate within stringent power budgets, aiming to reduce reliance on batteries. These advancements are critical to enabling the Internet of Things (IoT), where interconnected devices exchange data to improve quality of life.[14] The increasing adoption of IoT devices has amplified the need for digital circuits that operate with minimal power consumption. As many IoT systems are battery-powered and deployed in remote environments, power efficiency has become a critical design requirement. This paper explores the design and development of power-efficient digital circuits suitable for IoT applications. Key techniques such as clock gating, power gating, and dynamic voltage scaling are discussed. A case study of a digital interface for a temperature sensor is presented, demonstrating notable reductions in both dynamic and leakage power. Simulation results validate the proposed strategies, showing that significant power savings can be achieved without adversely affecting performance. These methodologies offer scalable solutions for future energy-conscious IoT system designs.
The increasing demand for efficient and clean power distribution has made power quality a critical concern in modern electrical systems. Among the main challenges affecting power quality, harmonic pollution caused by nonlinear loads, particularly static converters such as rectifiers, plays a significant role in distorting voltage waveforms and degrading system performance. Traditional diode-based rectifiers, despite their simplicity and low cost, suffer from unidirectional power flow, poor power factor, and high current harmonic distortion. To address these limitations, Pulse Width Modulation (PWM) rectifiers have emerged as a superior alternative, offering bidirectional power transfer, near-unity power factor, and reduced harmonic distortion. However, their performance heavily depends on the control strategy employed. This paper investigates Voltage-Oriented Control (VOC), a high-performance control technique that ensures precise current regulation and enhanced dynamic response. The study evaluates the effectiveness of VOC in minimizing harmonic pollution and improving power quality in electrical distribution networks. Through simulation and analysis, the proposed control strategy demonstrates significant improvements in reducing total harmonic distortion (THD) and maintaining stable DC bus voltage. The results highlight VOC as a robust solution for enhancing the efficiency and reliability of PWM rectifiers in modern power systems.
With the evolving modern-day communication applications, there is a need for an effectively improved performance in multiplication operations. In today's scenario, multiplication operations based on Vedic mathematics have the primary advantage that the propagation delay due to a larger number of input bits is reduced compared to other multipliers. Higher speed Vedic multipliers, especially based on the Urdhva Tiryagbhyam (vertically and crosswise) sutra, perform multiplication in a way that allows parallel processing with reduced delay. Compared to conventional multipliers like array or Booth multipliers, Vedic multipliers may have less area and power, depending on implementation. In this work, a high-speed 64-bit reversible Vedic multiplier is proposed using five different adders, namely reversible ripple carry adder (RRCA), reversible carry look-ahead adder (RCLA), reversible carry save adder (RCSA), reversible carry bypass or carry skip adder (RCSKA), and reversible carry select adder (RCSLA). The main objective of utilizing logic optimization in reversible logic along with the Vedic multiplier is to develop low-power and high-speed digital circuits. The proposed n-bit reversible Vedic multiplier is simulated using Xilinx Vivado 2019.1 and synthesized in the Cadence EDA tool in 90 nm and 180 nm technology. The proposed 16-bit reversible Vedic multipliers using the proposed 2-bit reversible multiplier provide 24% and 28% less propagation delay than the related work Mohana Priya et al. (Int. J. Syst. Assur. Eng. Manag. 14:829-835, 2023). The 16-bit reversible Vedic multiplier proposed using the existing 2-bit reversible multiplier provides 53% less area and 52% less power than the reference work Deepa et al. (Sadhana 44:197, 2019). Similarly, the proposed 32-bit reversible Vedic multiplier offers 15% better delay than (Padma et al. in Comput. Electr. Eng. 92:107178, 2021), 53% less area, and 45% less power than (Deepa et al. in Sadhana 44:197, 2019). Using the proposed reversible Vedic multiplier, a 32-bit MAC unit is designed and implemented using Cadence 90 nm and 180 nm technology. Thus, the proposed work can be applied to the most promising fields, such as microprocessors (to design MAC units), convolution in digital signal processing applications, communication, RF sensing applications, etc.
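The vertically-and-crosswise decomposition mentioned above can be sketched in software. The example assumes the common construction in which a 2-bit multiplier is the base block and an n-bit multiplier is assembled from four n/2-bit multipliers plus adders; ordinary integer addition stands in for the adder stage, and nothing here models reversible gates. The function names are hypothetical.

```python
def vedic_2x2(a, b):
    """2-bit by 2-bit product using vertical and crosswise partial products."""
    a0, a1 = a & 1, (a >> 1) & 1
    b0, b1 = b & 1, (b >> 1) & 1
    p0 = a0 & b0                    # vertical: least significant bits
    cross = (a1 & b0) + (a0 & b1)   # crosswise: the two mixed partial products
    p1, carry = cross & 1, cross >> 1
    top = (a1 & b1) + carry         # vertical: most significant bits plus carry
    return p0 | (p1 << 1) | (top << 2)


def vedic_multiply(a, b, width):
    """Recursive composition; width is assumed to be a power of two >= 2."""
    if width == 2:
        return vedic_2x2(a, b)
    half = width // 2
    mask = (1 << half) - 1
    a_lo, a_hi = a & mask, a >> half
    b_lo, b_hi = b & mask, b >> half
    # Four half-width products can be formed in parallel in hardware.
    lo_lo = vedic_multiply(a_lo, b_lo, half)
    lo_hi = vedic_multiply(a_lo, b_hi, half)
    hi_lo = vedic_multiply(a_hi, b_lo, half)
    hi_hi = vedic_multiply(a_hi, b_hi, half)
    # Plain integer addition stands in for the adder stage.
    return lo_lo + ((lo_hi + hi_lo) << half) + (hi_hi << width)


product = vedic_multiply(0xBEEF, 0x1234, 16)
```

In a hardware version the choice of adder for the final summation stage is where the area, delay, and power trade-offs reported in the abstract would arise.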
Power efficiency has become a paramount concern in modern VLSI circuits, particularly for low-power applications. Flip-flops and clock distribution networks (CDNs) contribute significantly to dynamic power consumption, thus heavily impacting system performance. This paper proposes a novel Pulse-Triggered Flip-Flop (P-FF) architecture that incorporates an advanced pulse control mechanism combined with a NOR-based clock gating technique to address power inefficiencies. The proposed design minimizes the number of stacked NMOS transistors, employs conditional pulse enhancement during critical transitions, and strategically gates the clock to reduce unnecessary switching activity. Simulations conducted using Tanner EDA tools at a 250 nm CMOS technology node across a wide input voltage range (-1V to 5V) demonstrate that the proposed flip-flop achieves up to 89% reduction in power consumption compared to conventional flip-flop designs such as MHLFF, IMFF, and SECCER-FF. The results further highlight that the proposed P-FF provides superior dynamic power savings, reduced leakage, simplified structure, and robust performance across varying operating conditions. These characteristics make the proposed flip-flop an ideal candidate for next-generation low-power VLSI applications, including portable electronics, embedded systems, and high-performance computing platforms. Key Words: low power, CMOS, SECCER-FF, EDA, P-FF.
Aims: To design and implement an optimized non-linear DSP-based control strategy using one-cycle control (OCC) for a boost converter aimed at mitigating voltage fluctuations in renewable energy systems integrated with DC microgrids, and to evaluate its performance under dynamic load and input voltage conditions. Study Design: Experimental validation with simulation-based pre-testing. Place and Duration of Study: Department of Electrical Engineering, [Ankara Yildirim Beyazit University], simulations conducted using MATLAB/Simulink, and hardware implementation tested using the TMS320F28069M digital signal processor. The study was carried out over a 6-month period in 2024. Methodology: An OCC duty cycle pre-calculation method was implemented on a DSP-based controller (TMS320F28069M) to enable real-time adjustments to the boost converter’s duty cycle in response to variations in input voltage and load conditions. MATLAB/Simulink simulations were first used to evaluate performance, followed by experimental testing under input transitions from 13 V to 24 V and vice versa, and load shifts from half to full load and reverse. Results: The proposed controller achieved a steady-state voltage with minimal ripple and overshoot: 0.3 Vpp / 14.3% for 13–24 V and 0.5 Vpp / -12.31% for 24–13 V transitions, with settling times of 3.6 ms and 1.9 ms, respectively. Load transition tests (half to full and full to half) resulted in settling times of 1.2 ms and 1.6 ms, with voltage overshoots of 8.33% and -6.64%, respectively. The system reached a peak efficiency of 90.26%.
Conclusion: The study demonstrates that OCC-based non-linear DSP control methods offer fast, stable, and efficient regulation of boost converters in renewable energy applications, making them highly suitable for DC microgrid integration.
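The duty cycle pre-calculation mentioned in the methodology can be illustrated with the textbook relation for an ideal boost converter in continuous conduction, V_out = V_in / (1 - D), which gives D = 1 - V_in / V_out. The following is a minimal Python sketch of that relation only, not the authors' DSP implementation; the function name `boost_duty`, the 48 V output reference, and the 0.9 duty limit are assumptions chosen for illustration.

```python
def boost_duty(v_in: float, v_out_ref: float, d_max: float = 0.9) -> float:
    """Ideal boost-converter duty cycle for a given input voltage.

    Uses the lossless continuous-conduction relation D = 1 - V_in / V_out.
    d_max is a hypothetical safety clamp, not a value from the abstract.
    """
    if v_in <= 0 or v_out_ref <= 0:
        raise ValueError("voltages must be positive")
    duty = 1.0 - v_in / v_out_ref
    # A boost stage cannot step down, so the duty cycle is clamped to [0, d_max].
    return min(max(duty, 0.0), d_max)


# Hypothetical 48 V bus; the 13 V and 24 V inputs match the tested transition range.
V_OUT_REF = 48.0
duty_low_input = boost_duty(13.0, V_OUT_REF)
duty_high_input = boost_duty(24.0, V_OUT_REF)
```

Recomputing the duty cycle directly from the measured input voltage is what lets a controller of this kind react within a switching cycle instead of waiting for an output error to accumulate.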
Most rectifiers using AC grid voltage assume that the voltage is ideal and has no distortion. However, in high-power systems such as water electrolysis, the grid voltage can be distorted. This situation is called a weak grid. In weak grids, the switching of rectifiers causes voltage distortion. Distorted voltage causes phase errors during observation, so it is important to measure voltage without distortion. There are two common methods to reduce errors during observation. One is using a hardware Low-Pass Filter (LPF) to reduce high-frequency switching distortion. The other is using a Second-Order Generalized Integrator (SOGI) Phase-Locked Loop (PLL) to separate the distorted component. Both methods are commonly used, but their performance changes depending on how they are applied. This paper compares the distortion reduction of the hardware LPF and the error caused by the digital method of the SOGI-PLL. Simulation results show that the hardware LPF reduces distortion by about 75 %, and the SOGI-PLL can have up to 6.7 % error depending on the digital method. These results are verified through PSIM simulation.
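To make the dependence on the digital method concrete, here is a minimal Python sketch of a SOGI quadrature generator discretized with forward Euler. It is an illustration under stated assumptions, not the paper's implementation: the abstract does not say which discretization methods were compared, and the 50 Hz grid frequency, 10 kHz sampling rate, gain k = √2, and all names are hypothetical. The continuous-time structure is dv'/dt = ω(k(v - v') - qv') and dqv'/dt = ωv'.

```python
import math


def sogi_forward_euler(samples, omega, ts, k=math.sqrt(2)):
    """Run a forward-Euler SOGI over input samples.

    Returns a list of (v_alpha, v_beta) pairs: the filtered in-phase signal
    and its quadrature (90 degrees lagging) companion.
    """
    v_alpha = 0.0
    v_beta = 0.0
    out = []
    for v in samples:
        # Continuous-time derivatives evaluated at the current state.
        d_alpha = omega * (k * (v - v_alpha) - v_beta)
        d_beta = omega * v_alpha
        # Forward Euler: both states advance using the old values.
        v_alpha += ts * d_alpha
        v_beta += ts * d_beta
        out.append((v_alpha, v_beta))
    return out


F_GRID = 50.0      # assumed grid frequency in Hz
FS = 10_000.0      # assumed sampling rate in Hz
omega = 2.0 * math.pi * F_GRID
ts = 1.0 / FS

# Ten cycles of an undistorted unit-amplitude grid voltage.
n_samples = int(0.2 * FS)
grid = [math.sin(omega * i * ts) for i in range(n_samples)]

result = sogi_forward_euler(grid, omega, ts)
# Amplitude estimate over the last cycle, after the filter has settled.
amplitude = [math.hypot(a, b) for a, b in result[-200:]]
```

For an ideal continuous-time SOGI tuned to the grid frequency, the estimated amplitude would be exactly 1. Any residual deviation in `amplitude` comes from the discretization, and it shrinks as the sampling rate rises or when a more accurate integration rule is substituted.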
Sheetal Kaul | European Journal of Computer Science and Information Technology
This technical article explores various approaches for optimizing Power, Performance, and Area (PPA) in digital design, addressing the critical balancing act required in modern semiconductor development. The discussion spans multiple dimensions of optimization, beginning with architectural techniques like multi-voltage design and clock gating, followed by effective methods including Design Space Research and technology mapping. Physical design considerations involving FinFET technology and strategic floorplanning are examined, alongside Dynamic Voltage and Frequency Scaling for real-time power management. Advanced techniques leveraging machine learning and approximate computing complete the exploration, demonstrating how emerging technologies are reshaping traditional optimization paradigms. Through each dimension, the article highlights the essential interplay between competing metrics and presents strategies for achieving optimal trade-offs in contemporary chip design.