Grad Paper BIST

EE620 Power reduction methodologies for Built-in Self-Test (BIST) based applications
Power reduction methodologies for Built-in Self-Test (BIST) based applications.

Juni Khisha, Department of Electrical Engineering, RIT
Abstract: Test and validation of a design have become as important as the design itself. Therefore, modern circuit designs invariably incorporate some form of design for test (DFT) technique. Among their limitations, increased power dissipation during test is one of the foremost concerns for many DFT techniques in use today. In this write-up, I discuss two power reduction techniques applicable to the Built-In Self-Test (BIST) domain, taken from references [1] and [2]. Increasing the correlation among pseudo-random test vectors is one of the most effective and sought-after methods today because it reduces the switching activity during test. The work in [1] proposes a technique in which the generated test vectors have significantly lower transition rates. The work in [2] proposes a novel technique in which bits of the vectors are selectively swapped and the scan chains are rearranged to achieve a significant reduction in power consumption.
Index Terms: Built-in Self-Test (BIST), linear feedback shift register (LFSR), design for test (DFT), pseudo-random test generator, low power test, bit-swapping, scan-chain ordering.
I. INTRODUCTION
Power dissipation during the test mode of a chip rises sharply because of the increased switching activity caused by the insertion of test vectors. This is a concern from at least two perspectives: because of the volume of tests performed, it adds to the cost and energy footprint of the product and, more importantly, high power dissipation during the test cycle can actually damage the chip. If the peak dissipation exceeds the tolerance limit of the component under test, it will damage the circuit, and higher average power during test will create hotspots across the chip and may compromise the reliability of the chip or its components [5].
The goal of a test is to achieve the target test coverage or fault coverage (FC) with as few test vectors as possible. Ideally, a large number of completely random test patterns would detect all the faults in the circuit (i.e., 100% FC), but because of cost and feasibility constraints the target FC is almost always below 100%. In general, the more random the patterns are, the more effective they are in detecting faults. However, a finite pattern cannot be truly random, nor is it possible to generate truly random patterns from limited hardware resources. Given these limitations, the test patterns used in practice are called pseudo-random sequences because they exhibit statistical randomness but are produced by a completely deterministic process (i.e., limited hardware) and are therefore repeatable and reproducible.
One of the most commonly used hardware structures for producing pseudo-random sequences is the linear feedback shift register (LFSR). Every LFSR has an associated characteristic polynomial. A maximal-length n-stage LFSR is one implemented with a primitive polynomial; it cycles through $2^n - 1$ distinct states. LFSRs implemented with other (non-primitive) characteristic polynomials produce sequences shorter than the maximum length of $2^n - 1$. An n-bit LFSR is constructed from an n-bit shift register in which a subset of the current state bits (the taps) is fed back linearly (i.e., summed modulo 2, or XORed) according to the terms of its polynomial. LFSRs find widespread application because of their simple structure, fast operation, and minimal combinational logic.
BIST is a powerful DFT technique in which test hardware is incorporated inside the circuit to facilitate the testing process, often making external automatic test equipment (ATE) redundant. Another advantage of BIST is at-speed testing. In this scheme, a pseudo-random test generator (often an LFSR) generates the test vectors and a multiple-input signature register (MISR) compacts the responses on the observation side [1]. However, BIST is also vulnerable to high power dissipation during test. Apart from the heightened switching activity, the power consumed in the built-in test circuitry and the low correlation among the test vectors are among the reasons for it. One obvious way to reduce power during test would be to run the test at a lower frequency than in normal operation. This is not practical for most applications because it increases the test time, and it fails to reduce the peak power, which is independent of the test frequency.
This write-up was submitted on December 12, 2013, to fulfill the requirements of the course Design of Digital Systems. J. Khisha is a graduate student in the Department of Electrical Engineering, Rochester Institute of Technology (RIT), NY 14623.
Scan-chain ordering is an effective technique that may reduce the average power during the scan-in and scan-out periods, but it fails to improve the capture power during the actual test cycle. Other strategies for test power reduction include, but are not limited to, test partitioning or scheduling, dual-speed LFSRs, multiple scan chains, and various test compression techniques. In practice none of these techniques is used alone; practical low-power schemes invariably combine several of them. As mentioned earlier, effective test vectors require low correlation, or a high degree of randomness, which in turn means increased switching activity.
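To make the LFSR construction described in this section concrete, the following Python sketch models an n-stage LFSR as a shift register with XOR feedback and counts the states it visits before repeating. It is an illustration only and is not taken from [1] or [2]: the function name, the bit ordering, the seed, and the choice of the polynomials x^4 + x + 1 (primitive) and x^4 + x^2 + 1 (non-primitive) are assumptions made for the example.

```python
from typing import List, Sequence, Tuple

State = Tuple[int, ...]


def lfsr_cycle(seed: Sequence[int], taps: Sequence[int]) -> List[State]:
    """Return the states visited from `seed` until the seed recurs.

    The register holds (a_k, ..., a_{k+n-1}); each clock shifts it by one
    position and appends the XOR of the tapped cells. `taps` lists the
    exponents below n of the characteristic polynomial, so taps (0, 1) with
    n = 4 models x^4 + x + 1. Tap 0 must be present so that the update is
    invertible and the seed is guaranteed to recur.
    """
    start = tuple(seed)
    if not any(start):
        raise ValueError("the all-zero state is a fixed point; use a nonzero seed")
    if 0 not in taps:
        raise ValueError("tap 0 (the constant term of the polynomial) is required")
    states: List[State] = []
    state = start
    while True:
        states.append(state)
        feedback = 0
        for t in taps:
            feedback ^= state[t]          # modulo-2 sum of the tapped cells
        state = state[1:] + (feedback,)   # shift by one and insert the new bit
        if state == start:
            return states


SEED = (0, 0, 0, 1)                       # arbitrary nonzero seed (assumption)
maximal = lfsr_cycle(SEED, taps=(0, 1))   # x^4 + x + 1, primitive
shorter = lfsr_cycle(SEED, taps=(0, 2))   # x^4 + x^2 + 1, not primitive

print(len(maximal))  # 15 = 2^4 - 1 states
print(len(shorter))  # 6 states for this seed
```

With the primitive polynomial the register visits every nonzero 4-bit state once per period, whereas the non-primitive polynomial falls into a much shorter cycle whose length depends on the seed.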
Figure 1. Pattern insertion using the Bipartite method [1].
Figure 2. Pattern insertion based on RI strategy [1].
Building on these conflicting requirements, low-transition test patterns can be generated that minimize the number of transitions while retaining enough randomness to detect faults effectively (i.e., to reach the desired FC). While most low-transition techniques achieve a reduction in one dimension only (the vertical direction, among consecutive test vectors), the technique implemented in [1] attacks the problem in two dimensions: vertically and horizontally (between adjacent bits of a test vector). In [2], the authors implement a low-power BIST technique that combines a novel bit-swapping LFSR with a scan-chain reordering scheme. The following two sections are taken mostly from [1] and [2], respectively.
II. LOW-TRANSITION TEST PATTERN GENERATION FOR BIST BASED APPLICATIONS [1]
This work [1] proposes a low-transition linear feedback shift register (LT-LFSR) that reduces the average and peak power of a device under test (DUT) during test by reducing the number of both inter-pattern and intra-pattern transitions. In this scheme, a conventional LFSR is modified so that it automatically inserts intermediate patterns between its original pairs of patterns, which lowers the number of transitions from one applied pattern to the next. The two methods employed are called Random Injection (RI) and Bipartite LFSR.
A. Bipartite LFSR and RI Methods
For an n-bit conventional LFSR with consecutive test patterns $T_i$ and $T_{i+1}$, the maximum number of transitions is n, which occurs when the two patterns are complements of each other. To reduce the number of transitions, the Bipartite (half-fixed) method inserts an intermediate pattern $T_{i1}$, half of whose bits are identical to $T_i$ and the other half to $T_{i+1}$, thereby reducing the maximum number of transitions to n/2.
The downside of the Bipartite method is that the randomness of the patterns deteriorates to $H_{max} = n/2$ [1]. In the Random Injection (RI) method, a random bit R is inserted wherever the corresponding bits of consecutive patterns differ. Because such positions are randomly distributed and are filled with another random value, the overall randomness is preserved, so the RI method has no adverse effect on the randomness of the patterns. In brief, writing $t_i^j$ for bit $j$ of $T_i$ [1]:
$$t_{i1}^{j} = \begin{cases} t_i^{j}, & \text{if } t_i^{j} = t_{i+1}^{j} \\ R, & \text{if } t_i^{j} \neq t_{i+1}^{j} \end{cases}$$
Figures 1 and 2 illustrate the two methods graphically.
B. Hardware implementation of RI and Bipartite techniques
The RI technique inserts a new test pattern $T_{i1}$ between two regular test patterns such that the sum of the primary input (PI) switching activities between $T_i$ and $T_{i1}$ and between $T_{i1}$ and $T_{i+1}$ equals the activity between $T_i$ and $T_{i+1}$.
It therefore divides the total activity into two separate parts, which reduces the switching activity caused by each individual pattern. In general, for an n-bit vector with m transitions originally between $T_i$ and $T_{i+1}$ [1], the two parts together account for exactly m transitions, so in the best case each part sees m/2 transitions, a 50% reduction.
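The pattern-insertion rules of the Bipartite and RI methods, and the way they split the m transitions between $T_i$ and $T_{i+1}$ into two parts, can be sketched in a few lines of Python. This is a behavioral illustration under stated assumptions, not the hardware of [1]: the function names are hypothetical, the random bit R comes from Python's `random` module rather than from the LFSR, and the choice of which half of the Bipartite pattern is taken from $T_{i+1}$ is arbitrary.

```python
import random
from typing import Sequence, Tuple

Pattern = Tuple[int, ...]


def transitions(a: Sequence[int], b: Sequence[int]) -> int:
    """Number of bit positions in which two patterns differ."""
    return sum(x != y for x, y in zip(a, b))


def bipartite_intermediate(t_i: Pattern, t_next: Pattern) -> Pattern:
    """Half of the bits come from t_next, the other half stay as in t_i.

    Assumption: the first half is the one that is updated first.
    """
    half = len(t_i) // 2
    return t_next[:half] + t_i[half:]


def ri_intermediate(t_i: Pattern, t_next: Pattern, rng: random.Random) -> Pattern:
    """Keep bits that agree; inject a random bit R where the patterns differ."""
    return tuple(a if a == b else rng.randint(0, 1) for a, b in zip(t_i, t_next))


rng = random.Random(2013)            # fixed seed so the run is repeatable
t_i = (0, 0, 0, 0, 0, 0, 0, 0)
t_next = (1, 1, 1, 1, 1, 1, 1, 1)    # worst case: complement of t_i, m = n = 8

bp = bipartite_intermediate(t_i, t_next)
ri = ri_intermediate(t_i, t_next, rng)

m = transitions(t_i, t_next)
for name, mid in (("Bipartite", bp), ("RI", ri)):
    first, second = transitions(t_i, mid), transitions(mid, t_next)
    # The two steps always add up to the original m transitions.
    print(name, mid, first, second, first + second == m)
```

For the complementary pair used here, the Bipartite pattern always splits the 8 transitions into 4 and 4, while the RI pattern splits them at random; in both cases the two parts add up to m.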
Figure 3 shows an example RI generator. In this structure, R is a random bit that can be taken from the LFSR itself. In the Bipartite technique, the intermediate patterns are such that the number of transitions between $T_i$ and $T_{i1}$, and between $T_{i1}$ and $T_{i+1}$, is reduced. For that purpose, one half of $T_{i1}$ is taken from $T_i$ and the other half from $T_{i+1}$ [1]. In hardware, a conventional LFSR is partitioned into two halves that are controlled by two non-overlapping signals. An extra flip-flop is added between the two halves to ensure correct operation when the halves switch roles. Since only one half is active in any given period, power consumption is greatly reduced compared to a conventional LFSR.
The two techniques are combined in a single LFSR architecture called the low-transition LFSR (LT-LFSR). This architecture generates three intermediate patterns between each pair of original patterns, which would conventionally be expected to increase the test time fourfold, assuming the intermediate patterns detect no faults.
Figure 3. An RI intermediate test pattern generator [1].
However, owing to the high degree of randomness of these intermediate patterns, the authors report no observable impact on fault coverage. In other words, the intermediate patterns are themselves highly detecting.
C. LT-LFSR Experimental Results
The authors report extensive performance evaluation experiments comparing the conventional LFSR and the LT-LFSR. The results cover both combinational and sequential ISCAS (85 and 89) benchmarks. The circuits were optimized for a TSMC 0.18 µm target technology. Polynomials of different lengths (n) were used for both the LFSR and the LT-LFSR. Although the LT-LFSR generates four times as many test patterns as a regular LFSR (because of the intermediate patterns), the test time never actually quadruples: the intermediate patterns themselves detect faults, so the patterns needed to reach the target FC are generated much earlier. The additional circuit elements make the LT-LFSR operate at a slightly lower frequency, with negligible impact on performance and speed.
One of the great advantages of this architecture is its universality: it applies to both test-per-clock (BIST) and test-per-scan (scan BIST) setups, meaning it can be applied both to combinational circuits and to the scan chains of sequential blocks. Because of the reduced transitions within the patterns, it is particularly suited to low-power scan-chain testing. Because of the Bipartite mechanism, the power consumed by the LT-LFSR itself is also reduced significantly. While many power reduction architectures are circuit-dependent, i.e., suited to a specific circuit under test (CUT), the LT-LFSR is independent of the CUT and requires no preprocessing to obtain a seed. Table 1 shows the results of the LFSR and the LT-LFSR on the ISCAS benchmarks (the four largest from each).
It is evident that the number of test patterns the LT-LFSR requires to achieve a target fault coverage (FC*) is within ±10 percent of that required by the LFSR. This is because the inserted intermediate test patterns are effective in detecting faults. The authors report that this trend is independent of the seeds and polynomials used. As expected, the LT-LFSR also achieves a significant reduction in power compared to the LFSR. Instantaneous power (i.e., a power surge between consecutive patterns) can be extremely harmful to the circuit under test. The authors show that, for a given benchmark circuit, the LT-LFSR violates the instantaneous power limit far less frequently than an LFSR. The area overhead of the LT-LFSR is only a 13% increase
Figure 5. Pattern insertion technique of Bipartite LFSR [1].
Table 1. Fault coverage of LT-LFSR [1].
Table 2. Percentage Power reduction of LT-LFSR [1].
Table 3. Test overhead for LT-LFSR [1].
compared to that of a regular LFSR. The FSM that controls the test pattern generation takes only 46 equivalent NAND gates, a negligible addition to the area overhead.
III. BIT-SWAPPING LFSR AND SCAN CHAIN ORDERING FOR SCAN BASED BIST [2]
This work presents a new approach to reducing the number of transitions among test patterns. It combines a novel pattern generator, the bit-swapping LFSR (BS-LFSR), with a scan-chain re-ordering mechanism to reduce the number of transitions and thereby achieve low-power
Figure 4. A conventional LFSR and Bipartite LFSR [1].
Figure 6. Reduction of instantaneous power violations [1].
operation of scan-based BIST.
A. Bit-swapping LFSR
The bit-swapping LFSR is built from a conventional LFSR and 2x1 multiplexers (MUXes), which swap bits according to a selection rule. For an n-stage maximal-length LFSR, the values of two adjacent cells are swapped if a third cell currently holds 1 (or 0) and are left unchanged if it holds 0 (or 1). The outputs of the original cells are multiplexed and cross-linked with the adjacent cells so that they are swapped when the selection bit goes to 0 (or 1). The selection bit is itself an output of the LFSR. If bit n is the selection bit and it is 0, then bit 1 is swapped with bit 2, bit 3 with bit 4, and so on up to bit n-2 with bit n-1. If bit n is 1, no swapping takes place [2]. The generated test vectors are the same as those of the original LFSR but appear in a different order. Each swapped pair sees a 25% reduction in the number of transitions. The authors also report special configurations, in which the cell that produces the selection bit is linked to one of the swapped cells through an XOR gate, where the reduction can be as high as 50%.
However, the BS-LFSR achieves significant power savings only during test (average power) and while scanning in new test vectors (peak power). It fails to reduce the overall peak power during scan-out of the captured response or during the capture of a response in the test cycle. To address this limitation, the authors combine the BS-LFSR with a scan-chain re-ordering scheme.
B. Scan chain re-ordering mechanism
This cell re-ordering algorithm reduces the number of transitions in the scan chain while the captured responses are scanned out. Combined with the BS-LFSR, it addresses average and peak power violations during scan-out of captured responses.
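The swapping rule of Section III-A can be tried out with a short behavioral model. The sketch below is an illustration built on assumptions rather than the circuit of [2]: it uses a 5-stage LFSR with the primitive polynomial x^5 + x^2 + 1, takes the last cell as the selection bit, swaps the pairs (1,2) and (3,4) whenever that bit is 0, and counts the transitions on each output cell over one full period. All function names are hypothetical, and the numbers it prints apply to this particular configuration only.

```python
from typing import List, Sequence, Tuple

State = Tuple[int, ...]


def lfsr_period(seed: State, taps: Sequence[int]) -> List[State]:
    """States of a shift-and-XOR LFSR from `seed` until the seed recurs."""
    if not any(seed) or 0 not in taps:
        raise ValueError("need a nonzero seed and tap 0")
    states: List[State] = []
    state = seed
    while True:
        states.append(state)
        feedback = 0
        for t in taps:
            feedback ^= state[t]
        state = state[1:] + (feedback,)
        if state == seed:
            return states


def bit_swap(state: State) -> State:
    """Swap adjacent pairs (1,2), (3,4), ... when the last cell is 0.

    The last cell is the selection bit and is never swapped; n must be odd
    so that the remaining n - 1 cells pair up exactly.
    """
    if len(state) % 2 == 0:
        raise ValueError("this sketch assumes an odd number of cells")
    out = list(state)
    if state[-1] == 0:
        for k in range(0, len(state) - 1, 2):
            out[k], out[k + 1] = out[k + 1], out[k]
    return tuple(out)


def transitions_per_cell(vectors: Sequence[State]) -> List[int]:
    """Transitions seen on each output cell over one cyclic pass."""
    count = len(vectors)
    return [
        sum(vectors[t][k] != vectors[(t + 1) % count][k] for t in range(count))
        for k in range(len(vectors[0]))
    ]


plain = lfsr_period((0, 0, 0, 0, 1), taps=(0, 2))   # x^5 + x^2 + 1
swapped = [bit_swap(s) for s in plain]

plain_counts = transitions_per_cell(plain)
swapped_counts = transitions_per_cell(swapped)
print("conventional LFSR:", plain_counts, sum(plain_counts))
print("bit-swapping LFSR:", swapped_counts, sum(swapped_counts))
```

Because swapping only rearranges bits within a vector, each output vector keeps the same number of 1s as the LFSR state it came from, and the selection cell itself sees exactly the same transitions as before. How the saving is distributed over the swapped cells depends on the polynomial and on which cell drives the selection, which is why the percentages quoted from [2] should not be read off this small example.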
One of the major contributors to the total test power dissipation is the scan-in and scan-out activity. The first step of the algorithm is to determine the connections between scan cells that minimize the transitions in the scan chain during scan-in and scan-out. The re-ordering process starts with an initial order of scan elements (flip-flops) and a set of test vectors. For each test vector in the test set, the number of bit differences between each pair of flip-flops (which represents the number of transitions that might be generated) is calculated. Next, a complete undirected graph is built in which each vertex represents a flip-flop and each edge represents a possible connection between two flip-flops. Each edge is assigned a weight which
Figure 8. Example of weighted graph generation for 4 scan FF setup [4].
Figure 9. The oriented cyclic graph for the example above [4].
is the total number of bit differences between the two flip-flops over the complete test set. The problem amounts to finding a Hamiltonian cycle of minimum cost in the graph, which is computationally very demanding [4]. Many approximation algorithms can be used to find a solution. One of them is a greedy algorithm that trades off computation time against the quality of the solution [4]. Once the scan chain is derived, it can be depicted as a simple cyclic graph. The next step is to choose the input and output scan cells that minimize the propagation of transitions during shift operations. For an n-bit scan chain there are n possible solutions. All n are evaluated in terms of the weighted transitions generated during scan operations, and the one with the lowest number of weighted transitions is selected.
C. BS-LFSR and scan-chain reordering performance
The authors report an extensive full-scan validation on the ISCAS 89 benchmark circuits in terms of both fault coverage and power reduction. The results are reproduced in Tables 4 and 5. As is evident, the BS-LFSR (both with and without scan-chain ordering) achieves the target fault coverage on all the circuits without an increase in the required test length. The BS-LFSR scheme also shows superior performance in terms of power compared to a conventional LFSR. For all of the
circuits of the benchmark, the saving in power consumption is significant, as the authors' results in Table 5 show. The BS-LFSR scheme also outperforms other low-power BIST schemes, as reported in references [5] and [6], in terms of power saving.
Figure 7. Bit swapping arrangement for an external LFSR [4].
Table 4. FC of BS-LFSR and LFSR [2].
Table 5. Power saving of BS-LFSR scheme [2].
IV. DISCUSSION
Both works are closely related and address the problem of high power dissipation during test in BIST architectures. One of the most popular strategies for low-power testing is to modify the test vectors so that they produce less switching activity during test. Put another way, these techniques try to increase the correlation among the test vectors while preserving the randomness necessary for effective fault detection. While the technique of [1] is applicable to both combinational and sequential architectures, [2] focuses on scan-based (sequential) BIST architectures only.
The LT-LFSR [1] generates low-transition patterns by reducing transitions in two dimensions, between consecutive vectors and within each pattern, which makes it highly effective. To summarize its performance, the authors report more than 70% and 49% reductions in average power and peak power, respectively, compared to a regular LFSR. The hardware overhead required to achieve this is less than 13%. These gains come without affecting the test time or FC.
The low-power test generation technique of [2] also addresses the low-transition vector generation problem from at least two dimensions, namely bit-swapping and scan-chain ordering. The authors incorporate multi-dimensional optimization techniques to arrive at the structure that gives the lowest number of transitions. In summary, this scheme achieves up to 65% and 55% reductions in average and peak power, respectively, compared to a conventional LFSR.
Though these two techniques address the same problem, a generalized comparison may not be practical because they use completely different optimization algorithms. Both report results on a common standard benchmark suite, ISCAS 89. However, the measurements inevitably differ because of differences in the seeds, polynomials, and lengths of the LFSRs used. For the sake of comparison, from the data provided for the S13207 circuit (of the ISCAS 89 benchmark) it appears that the LT-LFSR has slightly better fault coverage than the BS-LFSR. This conclusion can only be drawn by linearly projecting the result of one of them, because the data given for the LT-LFSR is for a primary input (PI) length of 62 bits while that for the BS-LFSR is for 31 bits. However, the percentage of power reduction for the same circuit is reported to be essentially the same in both cases. The BS-LFSR with scan-chain ordering requires no hardware overhead, which can be very advantageous in many situations. Nevertheless, this should not prompt one to conclude that one method is better than the other.
V. CONCLUDING REMARKS
Like all DFT techniques, BIST is vulnerable to excessive power consumption during the test period. This problem can be approached from multiple angles, depending on the application context and feasibility. The designer can address it either by optimizing the test vectors or by modifying the architecture to avoid power violations during test [5]. The multitude of low-power test strategies makes it hard for an inexperienced designer to choose among them. However, there are some critical considerations to take into account when selecting a test scheme. First and foremost, the fault coverage (FC) and the test time must improve or at least remain unaltered. The hardware and implementation overhead must remain acceptable. It goes without saying that the test strategy must not affect the performance of the CUT. Both LFSR architectures discussed here are highly effective for low-power BIST operation, but caution must be exercised when selecting one over the other for a specific application.
REFERENCES
[1] M. Nourani, M. Tehranipoor, and N. Ahmed, "Low-Transition Test Pattern Generation for BIST-based Applications," IEEE Trans. Computers, vol. 57, no. 3, pp. 303-315, Mar. 2008.
[2] A. S. Abu-Issa and S. F. Quigley, "Bit-Swapping LFSR and Scan-Chain Ordering: A Novel Technique for Peak and Average-Power Reduction in Scan-Based BIST," IEEE Trans. on Computer-Aided Design of Integrated Circuits and Systems, vol. 28, no. 5, pp. 755-759, May 2009.
[3] A. Abu-Issa and S. Quigley, "Bit-swapping LFSR for low-power BIST," Electron. Lett., vol. 44, no. 6, pp. 401-402, Mar. 2008.
[4] Y. Bonhomme, P. Girard, C. Landrault, and S. Pravossoudovitch, "Power driven chaining of flip-flops in scan architectures," in Proc. Int. Test Conf., Oct. 2002, pp. 796-803.
[5] P. Girard, "Survey of Low-Power Testing of VLSI Circuits," IEEE Design and Test of Computers, vol. 19, no. 3, pp. 80-90, May-June 2002.
[6] S. Wang and W. Wei, "A technique to reduce peak current and average power dissipation in scan designs by limited capture," in Proc. Asia South Pacific Des. Autom. Conf., Jan. 2007, pp. 810-816.
[7] S. Wang and S. Gupta, "LT-RTPG: A new test-per-scan BIST TPG for low switching activity," IEEE Trans. Comput.-Aided Design Integr. Circuits Syst., vol. 25, no. 8, pp. 1565-1574, Aug. 2006.
Juni Khisha is currently pursuing his MS degree in Electrical Engineering at Rochester Institute of Technology (RIT). He received his BSc degree in Electrical Engineering from Bangladesh University of Engineering and Technology in 2008. Before starting his graduate career, he had a brief stint as a Telecom Engineer. His research interests include circuit design and Design for Test (DFT).
