Sei sulla pagina 1di 11

This article has been accepted for inclusion in a future issue of this journal.

Content is final as presented, with the exception of pagination.

IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS 1

Improving Error Correction Codes for Multiple-Cell


Upsets in Space Applications
Joaquín Gracia-Morán , Luis J. Saiz-Adalid, Daniel Gil-Tomás, and Pedro J. Gil-Vicente, Member, IEEE

Abstract— Currently, faults suffered by SRAM memory Traditionally, error correction codes (ECCs) have been used
systems have increased due to the aggressive CMOS integra- to protect memory systems. Common ECCs employed to
tion density. Thus, the probability of occurrence of single-cell protect standard memories are single-error correction (SEC)
upsets (SCUs) or multiple-cell upsets (MCUs) augments. One
of the main causes of MCUs in space applications is cosmic or single-error-correction–double-error-detection (SEC–DED)
radiation. A common solution is the use of error correction codes [11]–[13]. SEC codes are able to correct an error in one
codes (ECCs). Nevertheless, when using ECCs in space applica- single memory cell. SEC–DED codes can correct an error in
tions, they must achieve a good balance between error coverage one single memory cell, as well as they can detect two errors
and redundancy, and their encoding/decoding circuits must be in two independent cells.
efficient in terms of area, power, and delay. Different codes have
been proposed to tolerate MCUs. For instance, Matrix codes use In critical applications, such as space applications, more
Hamming codes and parity checks in a bi-dimensional layout to complex and sophisticated codes are used [14]–[17], [19]–[21].
correct and detect some patterns of MCUs. Recently presented, For instance, Matrix code [17] is a well-known code that
column–line–code (CLC) has been designed to tolerate MCUs combines Hamming codes with parity check in a matrix,
in space applications. CLC is a modified Matrix code, based allowing the correction of two bits in error. Recently presented,
on extended Hamming codes and parity checks. Nevertheless,
a common property of these codes is the high redundancy column–line–code (CLC) [19] follows a similar approach, that
introduced. In this paper, we present a series of new low- is, it uses extended Hamming codes and parity bits to correct
redundant ECCs able to correct MCUs with reduced area, power, up to two adjacent bits in error.
and delay overheads. Also, these new codes maintain, or even The main problem when memory systems employ an ECC
improve, memory error coverage with respect to Matrix and is the redundancy required. The extra bits added are used to
CLC codes.
detect and/or correct the possible errors occurred. Also, redun-
Index Terms— Error correction codes (ECCs), fault tolerance, dant bits must be added for each data word stored in memory.
multiple-cell upsets (MCUs), reliability. In this way, the amount of storage occupied for redundant bit
scales with the memory capacity. For example, if an ECC with
I. I NTRODUCTION
100% of redundancy is employed in a 2-GB memory, only 1

P RESENTLY, the continued physical feature size down-


scaling of CMOS technology provides memory systems
with a great storage capacity. Nevertheless, this size decreasing
GB is available to store the payload (the “clean” data); the
remaining 1 GB is required for code bits.
In addition, the usage of an ECC implies overheads in
has also caused an augment in the memory fault rate [1], [2]. the area, power, and delay employed by the encoder and
With the present aggressive scaling, the memory cell critical decoder circuits. These overheads must be maintained as low
charge and the energy needed to provoke a single-event as possible, especially in space applications.
upset (SEU) in storage have been reduced [3]. As shown In this paper, we present a series of ECCs that greatly
by different experiments, in addition to traditional single-cell reduces the redundancy introduced, while maintaining, or even
upsets (SCUs), this energy reduction can provoke multiple-cell improving, memory error coverage. In addition, area, power,
upsets (MCUs), that is, simultaneous errors in more than one and delay overheads are also reduced. These new codes have
memory cell induced by a single particle hit [4]–[8]. been designed using the flexible unequal error control (FUEC)
In the case of space applications, the MCU problem methodology, developed by Saiz-Adalid et al. [31], where an
must be taken into account for the design of the corre- algorithm (and a tool) to design FUEC codes is introduced.
sponding fault tolerance methods, as space is an aggressive FUEC codes are an improvement of the well-known unequal
environment subjected to the impact of high-energy cosmic error control (UEC) codes [11]. Nevertheless, the FUEC
particles [4], [7], [9]. methodology can also find other kinds of codes. In this paper,
Manuscript received December 26, 2017; revised March 31, 2018; it is employed to find low redundancy codes. These novel
accepted May 11, 2018. This work was supported by the Spanish Govern- codes are different than those presented in [31]. They only
ment under the research Project TIN2016-81075-R. (Corresponding author: share the design methodology, but with different parameters.
Joaquín Gracia-Morán.)
The authors are with the Instituto ITACA, Universitat Politèc- In this way, by using the tool, we can generate the parity-
nica de València, 46022 Valencia, Spain (e-mail: jgracia@itaca.upv.es; check matrix of an ECC in an automatic and efficient way,
ljsaiz@itaca.upv.es; dgil@itaca.upv.es; pgil@itaca.upv.es). just defining its error detection and/or correction capabilities.
Color versions of one or more of the figures in this paper are available
online at http://ieeexplore.ieee.org. This paper is organized as follows. Section II introduces
Digital Object Identifier 10.1109/TVLSI.2018.2837220 the design of ECCs. Section III summarizes the codes used in
1063-8210 © 2018 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission.
See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
This article has been accepted for inclusion in a future issue of this journal. Content is final as presented, with the exception of pagination.

2 IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS

this paper. Section IV describes the different results obtained


during the evaluation of the ECCs. Finally, Section V provides
the conclusion.
II. I NTRODUCTION TO THE D ESIGN
Fig. 1. Encoding, channel crossing, and decoding process.
OF E RROR C ORRECTION C ODES
A. Background on ECCs
for Space Applications In this paper, we have designed several new ECCs using the
Different ECCs have been traditionally applied to space FUEC methodology [31]. The main characteristic of these new
missions [20]. For instance, Berger code [22] or the well- codes is their low redundancy. In order to check the behavior
known parity code has been used for detection purposes. of our codes, we have compared them with two types of
On the other hand, when error correction is codes. On the one hand, with matrix codes based on Hamming
needed, more complex codes can be used, such as codes, as these last codes present a good relationship between
Hamming [12], [20], Hadamard [23], Repetition [24], redundancy and area, power, and delay overheads. On the
Golay [25], Bose–Chaudhuri–Hocquenghem (BCH) [24], other hand, with Hamming-based codes due to the very low
Reed–Solomon [24], Reed–Muller [21], multidimen- redundancy of these codes.
sional [26], or Matrix [17] codes.
Hamming codes [12] can be easily built for any word B. Basics on Coding Theory
length. Also, the encoding and decoding circuits are easy to An (n, k) binary ECC encodes a k-bit input word in an n-bit
implement. Their main drawback is that only one bit in error output word [29]. The input word u = (u 0 , u 1 , . . . , u k−1 ) is
can be corrected. Nevertheless, for common data word lengths a k-bit vector which represents the original data. The code
(8, 16, 32, and 64), Hamming codes can detect some double- word b = (b0 , b1 , . . . , bn−1 ) is a vector of n bits, where
error patterns, in addition to the SEC. Exploiting this feature, the (n−k) redundant bits added are called parity or code
it is possible to systematize the detection of 2-bit adjacent bits. b is transmitted across an unreliable channel which
errors with the same redundancy, as presented in [27] and [28]. delivers the received word r = (r0 , r1 , . . . , rn−1 ). The error
In these works, different ECCs based on Hamming codes vector e = (e0 , e1 , . . . , en−1 ) models the error induced by
are introduced. These ECCs allow the correction of single- the channel. If no error has occurred in the i th bit, ei = 0;
bit errors or the detection of 2-bit adjacent errors with the otherwise, ei = 1. In this way, r can be interpreted as
same redundancy. r = b ⊕ e. Fig. 1 synthesizes this encoding, channel crossing,
The main problem of Hadamard and Repetition and decoding process.
codes [23], [24] is that they introduce a great redundancy The parity-check matrix H(n−k)×n of a linear block code
for common data word lengths [20]. This great redundancy defines the code [11]. For the encoding process, b must
provokes the necessity of a great memory storage capacity, accomplish the requirement H · bT = 0. For syndrome
which is an inconvenient for space applications. decoding, the syndrome is defined as sT = H · rT , and it
Golay code [25] is able to correct up to 3-bit errors. Nev- exclusively depends on e
ertheless, Golay code presents a redundancy of almost 100%.
Also, this code presents a high time and power consuming sT = H · rT = H · (b ⊕ e)T = H · bT ⊕ H · eT = H · eT .
ratio, as it has to execute sequentially two complementary (1)
sequences.
Although BCH and Reed–Solomon codes [24], [38] can There must be a different s for each correctable e. If s = 0,
correct multiple errors, their main drawbacks are the great we can assume that e = 0. Therefore, r is correct. Otherwise,
complexity and difficulty to implement them, as well as an error has occurred. Syndrome decoding is performed by
their great latency and speed. These weaknesses can be very addressing a lookup table that relates each s with the decoded
problematic in space applications. error vector ê. The decoded code word b̂ is calculated as
Concerning Reed–Muller codes [21], although vastly used b̂ = r ⊕ ê. From b̂, it is easy to obtain û just discarding
in critical applications, they present a great complexity. In this the parity bits. If the fault hypothesis employed to design the
way, the overheads introduced are higher than the overheads ECC is consistent with the channel behavior, û and u must be
introduced by Matrix or CLC codes, as shown in [17] and [19]. equal with a very high probability.
Multidimensional codes [26] are a class of matrix codes
that uses parity bits to detect and correct errors. With a low C. Error Models
redundancy, these codes present several drawbacks. When In coding theory [11], the term random error commonly
more than two errors must be corrected, the code design is refers to one or more bits in error, distributed randomly in the
very complicated. Also, it is very difficult to adapt these codes encoded word (data bits plus code bits generated by the ECC).
to standard data word sizes (i.e., 16, 32, or 64 bits). Random errors can be single (only one bit affected) or multi-
A better alternative are the Matrix codes based on Hamming ple. Single errors are the simplest ones, as they only affect a
codes [17], [19]. These codes still present a great redundancy, single memory cell. They are commonly produced by SEUs
but they are more cost effective than previous multiple-error- (in random access memories) or single-event transients (in
correcting codes. combinational logic) [32].
This article has been accepted for inclusion in a future issue of this journal. Content is final as presented, with the exception of pagination.

GRACIA-MORÁN et al.: IMPROVING ECCs FOR MCUs IN SPACE APPLICATIONS 3

TABLE I TABLE III


E NCODING F ORMULAS FOR THE H AMMING (7, 4) C ODE E XTENDED H AMMING E RROR D ETECTION /C ORRECTION

TABLE II
S YNDROME B ITS FOR THE H AMMING (7, 4) C ODE

Hamming codes can be extended to correct single errors


and detect double random errors. These codes are known as
SEC–DED extended Hamming codes [12]. These codes need
an additional parity bit to achieve the double-error detection.
As it was commented in Section I, with the continuous It is calculated as the even parity for the whole encoded word.
increasing of the integration scale, multiple errors are becom- In this way, and just adding an extra bit b7 , the Hamming
ing more frequent [4]–[8]. Multiple errors mainly manifest as SEC code (7, 4) shown previously converts into an extended
bursts [34]. We can define a burst error as a multiple error Hamming SEC–DED code (8, 4). b7 can be obtained as
that spans l bits in a word [11], i.e., a group of contiguous follows:
bits where, at least, the first and the last bits are in error. b7 = b0 ⊕ b1 ⊕ b2 ⊕ b3 ⊕ b4 ⊕ b5 ⊕ b6 . (3)
The separation l is known as burst length. Notice that adjacent
errors are a particular type of burst errors where all the The decoding process checks two conditions: 1) the parity
erroneous bits are contiguous. The main physical causes of a of the whole received word and 2) the syndrome bits, which
burst error in the context of RAM memories are diverse: high are calculated as in the Hamming code. Table III shows the
energy cosmic particles that hit some neighbor cells, crosstalk possible results, the corresponding meaning and the actions to
between adjacent cells, etc. [1], [33]. be taken. As it can be seen, single-bit errors can be corrected
as in the Hamming code. In the case of a double-bit error,
a nonrecoverable error (NRE) is detected but it cannot be
D. Hamming Codes
corrected.
Hamming codes [12] are able to correct single-bit errors There exist also (13, 8), (22, 16), (39, 32), and (72, 64)
with the lowest redundancy. For example, the parity-check SEC–DED extended Hamming codes. As in the SEC codes,
matrix for the Hamming (7, 4), i.e., n = 7 and k = 4, is shown redundancy decreases with higher data word lengths.
in the following equation:
⎡ ⎤
1010101 E. Flexible Unequal Error Control Methodology
H = ⎣ 0110011 ⎦. (2)
The methodology employed to design the error control
0001111
codes introduced in this paper can be found in [31]. This
From (2), it is easy to deduce the encoding and decoding methodology was developed to obtain FUEC codes. However,
operations. The encoding formulas are shown in Table I. it can be generalized to find any kind of codes. Although a
In the same way, it is also possible to obtain the syndrome detailed explanation is out of the scope of this paper, it is
decoding formulas from the parity-check matrix (2). Table II briefly summarized in the following. This methodology is
shows the expressions obtained to calculate the syndrome bits based on formulating the problem as a Boolean Satisfiability
for the Hamming (7, 4) code. problem. An algorithm developed by the authors is employed
If an error occurs, the syndrome bits will locate the erro- to solve it and to obtain a parity-check matrix, which defines
neous bit. A lookup table (implemented, for example, using the code to be designed.
a binary decoder) selects the erroneous bit. Applying the After defining the values of n and k for the code, the first
“exclusive-or” operation, the output of the lookup table correct step is the selection of error patterns to be corrected and
the erroneous bit. detected. For instance, single errors are represented with error
For common word lengths, such as 8, 16, 32, and 64 bits, vectors (. . . 1 . . .), and error vectors for double random errors
there exist (12, 8), (21, 16), (38, 32), and (71, 64) Hamming show the pattern (. . . 1 . . . 1 . . .), where 1’s represent the bits
codes, respectively. As it can be seen, redundancy decreases in error, and the dots represent the correct bits.
with longer data words. For instance, the (12, 8) Hamming The next step is to find the parity-check matrix H that
code presents a 50% of redundancy, whereas the (71, 64) satisfies the conditions (4) and (5), where E + represents the
Hamming code introduces about 11% of redundancy. set of error vectors to be corrected, and E  is the set of error
This article has been accepted for inclusion in a future issue of this journal. Content is final as presented, with the exception of pagination.

4 IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS

Fig. 2. Layout of a 16-bit data word for the (32, 16) Matrix code [18].
Fig. 3. Layout of a 16-bit data word for the (40, 16) CLC [19].

vectors to be detected. That is, each correctable error must


generate a different syndrome
Fig. 4. Layout of a 16-bit data word for the (21, 16) SEC–DAED [27].
H · eiT = H · eTj ; ∀ei , e j ∈ E + |ei = e j . (4)
In addition, each detectable error must generate a syndrome
which is different to all the syndromes generated by the
correctable errors Fig. 5. Layout of a 16-bit data word for the (21, 16) SEC–DAED [28].

H · eiT = H · eTj ; ∀ei ∈ E  , e j ∈ E + . (5) code implemented in this paper presents better correction and
However, several detectable errors may have the same detection performance than an extended Hamming code, as it
syndrome. is able to correct all single errors and to correct or to detect
To find the matrix, a recursive backtracking algorithm is all 2-bit burst errors. Nevertheless, this Matrix code presents
used. It checks partial matrices and adds a new column only a higher redundancy than an extended Hamming code. In this
if the previous matrix satisfies the requirements. In this way, way, memory required for code bits is increased, and also,
the algorithm starts with an empty partial_H matrix. New area, energy, and delay overheads.
columns, with n–k rows, are added, and the new partial Recently presented, CLC code [19], [30] is another matrix
matrices are checked recursively. The added columns must be code proposed to be used in space applications. The layout we
nonzero, so there are 2n−k –1 combinations for each column. have employed for the implementation of this code is shown
The complete execution of the algorithm is commonly in Fig. 3 (extracted from [19]), where X i is the primary data
unfeasible. Nevertheless, the first solutions are usually found input, divided into groups of 4 bits. Each group is codified by
quickly, if the code exists. Once selected the H matrix, it is an SEC–DED (8, 4) extended Hamming code (Ci and Pai ).
easy to determine the logic equations to calculate each parity Finally, a set of vertical parity bits (Pi ) form the matrix.
and syndrome bit, as well as the syndrome lookup table. They As just commented, and unlike the Matrix code, CLC uses
are required for the encoder and decoder implementation. an Extended Hamming code, allowing the correction of all
In addition, we can apply two different optimization criteria. single and 2-bit burst errors. Nevertheless, CLC introduces a
If we want to decrease the delay of the encoders and decoders, higher number of redundant bits, provoking a greater area,
we have to reduce the number of 1’s in those rows with the power, and delay overheads, and reducing the available mem-
highest number of 1’s of the parity-check matrix. In the case ory for the payload.
of area reduction, the total number of 1’s in the parity-check On the other hand, Saiz-Adalid et al. [27] introduce an
matrix must be reduced. SEC–double-adjacent error detection (SEC–DAED) code with
A detailed explanation of this algorithm, as well as a code a very low redundancy. This code is able to correct all single
design example, can be found in [31]. errors or to detect all double adjacent errors in a 16-bit data
word with only five redundant bits. Fig. 4 shows the data word
III. E RROR C ORRECTION C ODES D ESCRIPTION layout of this code, where Ci are the code bits, and X i are the
A. Previous Proposals data bits.
As commented previously, Matrix and CLC codes have been On the contrary, Sanchez-Macian et al. [28] present a
designed to tolerate MCUs [17], [19], a critical concern in different approach to generate Hamming-based ECCs. In this
space applications. case, we have selected an SEC–DAED code with the same
Combining Hamming codes and parity checks [17], [18], coverage characteristics, as well as the same redundancy of
Matrix codes form a 2-D scheme for correcting and detecting the SEC–DAED code from [27]. The main difference is the
some patterns of MCUs. For instance, in this paper, we have code layout, as shown in Fig. 5.
used the bit layout shown in Fig. 2 (extracted from [18]), To obtain the value of the different Ci bits, the parity-check
where X i are the data bits, Ci are the horizontal check bits matrix of these two SEC–DAED codes can be seen in [27]
(calculated as a Hamming code), and Pi are the column parity and [28], respectively.
bits (even parity).
The basic behavior of this Matrix code is as follows. The B. Our Approach
primary data input (X i ) is divided into groups of several bits. By using the FUEC methodology [31], we have been able
In this paper, this division is in groups of 4 bits. Each group to design several codes that improve the coverage and/or
is codified by a (7, 4) Hamming code (Ci ). Last, a set of the redundancy of the different codes presented previously
vertical parity bits (Pi ) completes the matrix. The Matrix (Matrix, CLC, and both SEC–DAED codes). The layout of
This article has been accepted for inclusion in a future issue of this journal. Content is final as presented, with the exception of pagination.

GRACIA-MORÁN et al.: IMPROVING ECCs FOR MCUs IN SPACE APPLICATIONS 5

Fig. 6. Layout of a 16-bit data word for the FUEC codes.

Fig. 8. Parity-check matrix H for the (24, 16) FUEC–TAEC code.

Fig. 7. Parity-check matrix H for the (23, 16) FUEC–DAEC code.

our codes is presented in Fig. 6, where Ci are the code bits


and X i are the data bits.
Using our algorithm, we have found a code [we will call
it FUEC–double adjacent error correction (DAEC)] that can
correct an error in a single bit, or an error in two adjacent
bits, or it can detect one 3-bit burst error or one 4-bit burst
error.
FUEC–DAEC needs only seven code bits for a 16-bit data Fig. 9. Parity-check matrix H for the (25, 16) FUEC–QUAEC code.
word. Fig. 7 shows the parity-check matrix H for this code,
where Ci are the code bits and X i are the primary data bits. Finally, FUEC–quadruple adjacent error correction
Once H has been obtained, it is very easy to design (QUAEC) is the last code we have designed. This code is
the encoder/decoder circuitry. For example, the formulas to able to correct an error in a single bit, or an error in two
calculate the code bits for the FUEC–DAEC code are adjacent bits (2-bit burst errors) or a 3-bit burst error or a
4-bit burst error. This can be done by using only nine code
C0 = X 0 ⊕ X 4 ⊕ X 7 ⊕ X 8 ⊕ X 11 ⊕ X 12 ⊕ X 13 bits. The parity-check matrix H for this code is shown in
C1 = X 1 ⊕ X 3 ⊕ X 5 ⊕ X 7 ⊕ X 9 ⊕ X 10 ⊕ X 11 ⊕ X 14 Fig. 9. As in the previous codes, Ci are the code bits and X i
C2 = X 0 ⊕ X 2 ⊕ X 6 ⊕ X 7 ⊕ X 9 ⊕ X 12 ⊕ X 14 ⊕ X 15 are the primary data bits.
We have to remark that we have generated a parity-check
C3 = X 1 ⊕ X 4 ⊕ X 8 ⊕ X 9 ⊕ X 12
matrix optimized to achieve the lowest delay for our three
C4 = X 0 ⊕ X 3 ⊕ X 4 ⊕ X 7 ⊕ X 12 ⊕ X 13 ⊕ X 15 codes, that is, with a reduced number of 1’s in the rows with
C5 = X 1 ⊕ X 2 ⊕ X 5 ⊕ X 9 ⊕ X 10 ⊕ X 12 ⊕ X 14 the highest number of 1’s.
C6 = X 2 ⊕ X 3 ⊕ X 6 ⊕ X 8 ⊕ X 10 ⊕ X 11 ⊕ X 13 ⊕ X 15 . (6) It is also remarkable that the names FUEC-∗AEC only
indicate that they have been designed using the FUEC method-
On the other hand, the syndrome formulas that can be ology. The codes presented here must not been confused
obtained from H to check if an error has occurred are with the FUEC codes from [31], an improvement of UEC
codes [11].
S0 = C0 ⊕ X 0 ⊕ X 4 ⊕ X 7 ⊕ X 8 ⊕ X 11 ⊕ X 12 ⊕ X 13
With respect to the code bits needed, our three codes present
S1 = C1 ⊕ X 1 ⊕ X 3 ⊕ X 5 ⊕ X 7 ⊕ X 9 ⊕ X 10 ⊕ X 11 ⊕ X 14 a very low redundancy with respect to the Matrix and CLC
S2 = C2 ⊕ X 0 ⊕ X 2 ⊕ X 6 ⊕ X 7 ⊕ X 9 ⊕ X 12 ⊕ X 14 ⊕ X 15 codes. Nevertheless, the lowest redundancy corresponds to
S3 = C3 ⊕ X 1 ⊕ X 4 ⊕ X 8 ⊕ X 9 ⊕ X 12 both SEC–DAED codes, but this low redundancy only permits
the correction of single errors, as can be seen from Table IV.
S4 = C4 ⊕ X 0 ⊕ X 3 ⊕ X 4 ⊕ X 7 ⊕ X 12 ⊕ X 13 ⊕ X 15
Table IV shows the number of code bits introduced by
S5 = C5 ⊕ X 1 ⊕ X 2 ⊕ X 5 ⊕ X 9 ⊕ X 10 ⊕ X 12 ⊕ X 14 each ECC, as well as the redundancy introduced with respect
S6 = C6 ⊕ X 2 ⊕ X 3 ⊕ X 6 ⊕ X 8 ⊕ X 10 ⊕ X 11 ⊕ X 13 ⊕ X 15 . to a 16-bit data word. Notice that although in this paper we
(7) have worked with 16-bit data word, FUEC methodology and
algorithm can be applied to longer codeword sizes. Calculus
Our second proposal, called FUEC–triple adjacent error of the redundancy has been done with
correction (TAEC), is able to correct an error in a single
No. code bits
bit, or an error in two adjacent bits (2-bit burst errors) or a Redundancy = × 100. (8)
3-bit burst error, or it can detect a 4-bit burst error. This is No. data bits
possible by adding one more code bit. In this case, for a The importance of a low redundancy comes from the fact
16-bit data word, the FUEC–TAEC code needs eight code bits. that these extra bits must be also stored in memory in order
The parity-check matrix H for this code is presented in Fig. 8. to check if an error has occurred. In this way, a higher
As in the case of the FUEC–DAEC, Ci are the code bits and redundancy means a lower storage available for data bits.
X i are the primary data bits. Similarly, from H it is very easy As an example, if we use a 1-GB memory chip, only 512 MB
to design the encoder/decoder circuitry. are available to store data bits in the case of the Matrix code,
This article has been accepted for inclusion in a future issue of this journal. Content is final as presented, with the exception of pagination.

6 IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS

TABLE IV
N UMBER OF C ODE B ITS FOR A 16-bit D ATA W ORD

requiring the remaining 512 MB to store the code bits. In the


case of the CLC code, only about 410 MB are available
to store data bits. On the other hand, in the case of the
SEC–DAED codes, the available memory for data is about
780 MB.
Following with this example, our FUEC–DAEC code
allows storing about 712 MB of primary data. In contrast,
our FUEC–TAEC and FUEC–QUAEC codes allow storing
682 and 655 MB, respectively. As it can be seen, the increment Fig. 10. Block diagram of the fault injector simulator.
in the storage available is very significant, taking into account
the improvement in the coverage properties of our three codes. process for all errors of a given size and model (i.e., random
Last column of Table IV shows the burst error coverage or burst), it is possible to count the number of corrected
of the different ECCs, a concern to have into account in and/or detected errors with respect to the total number of
space applications [9], [10]. As shown in [9], MCUs mainly possible errors, that is, it is possible to calculate the coverage
provoke 2-bit adjacent errors in earth observation satellites; of each ECC. In this paper, we have injected single errors,
although a nonnegligible percentage of longer burst errors are as well as burst errors with a burst length varying from 2 to 8.
also presented. In deep space exploration, a higher impact of This is a representative range in space applications [6], [7].
longer MCUs is expected. Notice that burst errors can be adjacent or not, and affect to
the layout of the rows. In the case of the Matrix code, we have
IV. E RROR C ORRECTION C ODES E VALUATION considered that C3 is adjacent to X 5 , C6 to X 9 , etc. (see Fig. 2).
In this section, we present the different results obtained In the same way, for the CLC code, Pa1 is adjacent to X 5 ,
during the evaluation of the ECCs presented in Section III. Pa2 to X 9 , etc. (see Fig. 3).
We have carried out two different processes. During the We have to remark that we have not injected errors accord-
first one, we have injected faults in C models of the ECCs ing to their probability of occurrence, as our goal is to measure
for error coverage evaluation. In a second step, we have the correction and detection coverages that represent percent-
implemented the different ECCs in very high speed integrated ages. Specifically, we have injected each type of error (single
circuit hardware description language (VHDL) and we have errors or burst errors of different lengths) in all bits of the
synthesized them, in order to estimate area, power, and delay codeword to verify the error correction/detection capabilities
overheads. This section finishes with the analysis of the results of the different ECCs. Obviously, burst errors of length 8
obtained. will be much less frequent than burst errors of length 2,
as bibliography shows [6], [7].
A. Error Coverage Evaluation All blocks of the fault injection tool have been devel-
In order to study the error coverage of the different ECCs, oped in C, using the bitwise logic operators for an accurate
we have developed a simulator that allows injecting different simulation of the hardware behavior. Encoder and decoder
types of error. The basic scheme is shown in Fig. 10. circuits can be easily obtained from the parity-check matrix H,
This tool allows the injection of different error types. as stated in Section III. These circuits are implemented in C
By comparing the input and output words, the simulator can as encoding and decoding functions. Changing the simulator
check if the error injected leads to a right or wrong decoding. for a different ECC is as simple as adjusting the word lengths
Also, the decoder circuit can activate the NRE signal when and replacing the encoding and decoding functions for the
an error is detected but it cannot be corrected. Repeating the new ECC.
This article has been accepted for inclusion in a future issue of this journal. Content is final as presented, with the exception of pagination.

GRACIA-MORÁN et al.: IMPROVING ECCs FOR MCUs IN SPACE APPLICATIONS 7

Fig. 11. Correction coverage for burst errors. Fig. 12. Detection coverage for burst errors.

TABLE V
A REA O CCUPIED BY D IFFERENT ECC S ( IN nm2 )
Fig. 11 shows the results obtained for the correction cover-
age of each code. This coverage has been calculated as
Errors_Corrected
Ccorrec = × 100 (9)
Errors_Injected
where Errors_Corrected are the number of errors corrected by
the ECC, and Errors_Injected are the total amount of errors
injected for a given burst length.
As expected, FUEC–TAEC and FUEC–QUAEC codes
present better correction coverage than Matrix, CLC, and
both SEC–DAED codes, as they are able to correct up to
In conclusion, our codes are very efficient to tolerate burst
3- and 4-bit burst errors respectively. On the other hand, our
errors of from 2- to 4-bit length. Beyond 4-bit burst errors,
FUEC–DAEC code can correct up to 2-bit burst errors, such
the performance of our codes decreases notably due to their
as CLC.
low redundancy. If these errors are expected to occur, more
Nevertheless, for longer burst errors (5 bits or more), the
powerful codes (with a higher redundancy) must be employed.
correction capabilities of our codes degrade more deeply
Nevertheless, the probability of occurrence of burst errors
than Matrix and CLC ones. This is provoked by the lower
decreases significantly when increasing its length [6], [7].
redundancy of our codes, which causes a lower number of
available syndromes to be used to correct longer burst errors.
In other words, our codes employ the available syndromes B. Synthesis Results
in a more efficient way inside the expected error rank, but We have shown in Section III that our three codes,
degrading quickly outside this rank. This effect can be seen FUEC–DAEC, FUEC–TAEC, and FUEC–QUAEC, present a
also in both SEC–DAED codes. In fact, these codes present lower redundancy with respect to Matrix and CLC codes. This
the greatest degradation from multiple-bit errors (2-bit burst lower redundancy provokes a lower storage requirement for
errors or longer). As it happens with our codes, the very low code bits.
redundancy of both SEC–DAED codes provokes this result. Nevertheless, the question that arises now is that this lower
Finally, Fig. 12 shows the detection coverage for all codes. redundancy would translate into an improvement on the area,
This detection coverage is calculated as power, and delay overheads in the encoder and decoder circuits
with respect to matrix, CLC, and both SEC–DAED codes.
Errors_Corrected + Errors_Detected
Cdetec = × 100 (10) Although the area overhead of the encoders and decoders may
Errors_Injected be negligible in comparison with memory overhead, power and
where Errors_Detected corresponds to the number of errors delay overheads can be important, especially in deep space
not corrected but detected by the ECCs. systems.
As can be seen from Fig. 12, all our codes present a 100% To solve this question, we have synthesized the encoder and
detection of up to 4-bit burst errors, improving the behavior decoder circuits for all ECCs. To do this, we have implemented
of Matrix, CLC, and both SEC–DAED codes. them in VHDL, and using CADENCE software [35], we have
As in the correction coverage, the percentage of detected carried out a logic synthesis for 45-nm technology by using
errors in our FUEC–TAEC and FUEC–QUAEC codes the NanGate FreePDK45 Open Cell Library [36], [37].
degrades sharply for longer bursts. By contrast, FUEC–DAEC Table V (and Fig. 13) shows the area occupied by the
maintains a coverage over 60%, near the Matrix detection different circuits in nm2 (1 nm = 10−9 m). As it can be
capability. seen, our encoders present a slightly greater area than the
This article has been accepted for inclusion in a future issue of this journal. Content is final as presented, with the exception of pagination.

8 IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS

Fig. 13. Area occupied by the different ECCs (in nm2 ). Fig. 14. Power consumption (in μW).

TABLE VI
TABLE VII
P OWER C ONSUMPTION ( IN μW)
D ELAY OVERHEAD FOR D IFFERENT ECC S ( IN ps)

encoders of both SEC–DAED codes. The worst area numbers


for the encoders correspond to the CLC and Matrix codes. code presents a slightly higher power consumption than
These results are provoked by their higher redundancy, which both SEC–DAED codes, and much better numbers than
provokes complex encoders’ circuitry. Matrix and CLC codes. The worst score corresponds to the
Regarding decoders’ area, the SEC–DAED ones present the FUEC–QUAEC code. It is noticeable the low power consump-
lowest numbers. On the other hand, decoders for FUEC–TAEC tion of the FUEC–TAEC code with regard to its error coverage.
and FUEC–QUAEC codes present the biggest area overhead, Particularly, we can observe that power consumption of the
as they are able to correct and/or detect more errors. encoders presents a similar trend with respect to area overhead,
In general, both SEC–DAED codes present the lowest area that is, CLC codes present the higher numbers, while the rest
overhead. Nevertheless, their coverage capabilities are very of codes show similar numbers. On the other hand, as our
simple with respect to the other codes. Comparing codes with three codes present a better error coverage, the decoders power
similar coverage capabilities (i.e., correction of 2-bit burst overhead also increase.
errors done by Matrix, CLC, and FUEC–DAEC codes), the It is also interesting to remember that power consump-
FUEC–DAEC circuitry occupies the smallest area. In addi- tion of memory has not been taken into account in these
tion, its redundancy is much smaller (less than a half). This results. As our codes present a very low redundancy, they
low redundancy reduces also the silicon area needed by the need a lower storage capacity. In this way, our codes
memory to store the code bits. would imply an additional power reduction of the redundant
With respect to the FUEC–QUAEC code, it presents higher memory.
error coverage, and therefore, it introduces a higher hardware Finally, Table VII (and Fig. 15) shows the delay introduced
cost and a larger area, mainly in the decoders. Nevertheless, by the different codes (in ps, 1 ps = 10−12 s). We do not
as stated above, its low redundancy reduces the area needed represent the total delay because encoders and decoders work
by the memory. On the other hand, the area occupied by the in independent operations (writing and reading).
FUEC–TAEC code is lower than the area overhead of CLC, With respect to the encoders delay, CLC presents the highest
with better error coverage. delay, while Matrix shows the lowest delay. With regard to
Table VI (and Fig. 14) shows the power (static and our encoders, they present an intermediate delay, except the
dynamic) consumption overhead of the different ECCs FUEC–DAEC encoder, that introduces a delay slightly lower
(in μW, 1 μW = 10−6 W). In global terms, both SEC–DAED than the CLC one. It is noticeable the low delay introduced
codes present the lowest power consumption (encoder plus by the encoder of the FUEC–QUAEC code. Although this
decoder). With respect to our three codes, our FUEC–DAEC code presents the highest redundancy of our three codes, its
This article has been accepted for inclusion in a future issue of this journal. Content is final as presented, with the exception of pagination.

GRACIA-MORÁN et al.: IMPROVING ECCs FOR MCUs IN SPACE APPLICATIONS 9

Fig. 15. Delay overhead (in ps).


Fig. 16. Global evaluation applying M metric.

In any case, when multiple errors are present, our three


parity-check matrix presents a more balanced number of 1’s
codes present a best M score than Matrix and CLC codes.
in each row, reducing in this way the delay needed.
We can observe also that CLC performance is the worst from
In the case of the decoders, our FUEC–DAEC code presents
single errors up to 4-bit burst errors length. Due to its great
the fastest correction, while FUEC–TAEC and FUEC–QUAEC
redundancy, M metric for CLC is greatly penalized.
codes present the highest delay, but also the highest error
To sum up, our three codes (FUEC–DAEC, FUEC–TAEC,
coverage. The rest of the codes show similar results.
and FUEC–QUAEC) present the best M scores for 2-, 3-, and
4-bit burst errors. On the other hand, CLC code presents the
C. Global Evaluation of ECCs M Metric worst M numbers for these burst lengths. In this way, our three
With the results obtained in previous sections, it is not easy codes seems a good choice when multiple errors are present,
to provide a global evaluation of the ECCs. To solve this as it is expected in spatial applications.
question, several metrics have been proposed [17]–[19]. For
instance, the TCC metric [19] can be used to tradeoff area, V. C ONCLUSION
power, delay, and error coverage. We have modified this metric
to add the contribution of an important factor: the redundancy. In this paper, a series of new ECCs has been presented.
We have included this parameter because the storage size These new ECCs improve the behavior of the well-known
for the different ECCs is reduced with a lower redundancy. Matrix code, the recently introduced CLC code and different
Indirectly, area and power overheads can be affected. In this SEC–DAED codes.
way, we have presented a generic metric with a very simple A characteristic of our codes is the low redundancy they
expression. Thus, if we want to enhance a parameter in a introduce. This fact provokes a reduction in the storage needed
specific application, it is possible to weigh any of them with for the code bits. In addition, the detection and correction
different weights; or we can include some limitations, such as capabilities are maintained, or even increased.
reaching a certain correction or detection level. In this way, The insertion of an ECC in memory also provokes the
by including the redundancy, we can expect a metric more introduction of area, power, and delay overheads. Relating
complete and accurate. The new metric, that we have called M, to the area overhead, we have seen that FUEC–DAEC code
is presented in the following equation: introduces a much lower area overhead than the CLC and
Matrix codes, while FUEC–TAEC code presents a similar
Ccorrec × Cdetec
M= . (11) area overhead than Matrix and CLC codes. As expected,
Area × Power × Delay × Redundancy FUEC–QUAEC code exhibits the highest overhead, related to
With respect to the delay factor, we have analyzed the its high correction and detection capabilities. Also, we have
longest delay (the worst case), that corresponds to the decoder to take into account the reduction of memory area overhead
part. In the case of area and power, we have used the total due to the low redundancy of our codes.
values (encoder plus decoder). Concerning the power overhead, the trend is similar to
As can be seen from Fig. 16, both SEC–DAED codes the area overhead. FUEC–DAEC code introduces a much
present the best M score for single errors. This is an expected lower power overhead than the CLC and Matrix codes. Even,
result due to the simplicity of their encoders and decoders the power overhead of the FUEC–DAEC code is similar to the
and their low redundancy. When multiple errors are present, SEC–DAED codes ones. FUEC–TAEC and FUEC–QUAEC
the FUEC–DAEC code presents the best M score for 2-, 3-, codes present a power consumption similar or a little bigger
and 4-bit burst error length. In this way, FUEC–DAEC code than CLC code, but with much better error correction and
seems the most adequate for short burst errors, as it can correct detection capabilities. And also, we have to take into account
all single and 2-bit burst errors, as well as to detect all 3- and the reduction of memory power consumption due to the low
4-bit burst errors. redundancy of our codes.
This article has been accepted for inclusion in a future issue of this journal. Content is final as presented, with the exception of pagination.

10 IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS

With respect to the delay, FUEC–DAEC presents the fastest [15] S. Pontarelli, G. C. Cardarilli, M. Re, and A. Salsano, “Error correction
correction, even better than both SEC–DAED codes. On the codes for SEU and SEFI tolerant memory systems,” in Proc. 24th
IEEE Int. Symp. Defect Fault Tolerance VLSI Syst. (DFT), Oct. 2009,
contrary, FUEC–TAEC and FUEC–QUAEC codes introduce pp. 425–430.
the highest correction delay, because they are designed to [16] A. Sánchez-Macián, P. Reviriego, J. Tabero, A. Regadío, and
correct longer burst errors. In this way, decoder circuits are J. A. Maestro, “SEFI protection for nanosat 16-bit chip onboard
computer memories,” IEEE Trans. Device Mater. Rel., vol. 17, no. 4,
more complex. Nevertheless, encoder circuits are quite quick. pp. 698–707, Dec. 2017, doi: 10.1109/TDMR.2017.2750718.
In general, and according to the M metric, introduced [17] C. Argyrides, D. K. Pradhan, and T. Kocak, “Matrix codes
to evaluate the overall features of the analyzed codes, our for reliable and cost efficient memory chips,” IEEE Trans. Very
Large Scale Integr. (VLSI) Syst., vol. 19, no. 3, pp. 420–428,
FUEC–DAEC code presents the best performance to correct Mar. 2011.
single or 2-bit burst errors, or to detect 3- or 4-bit burst [18] C. Argyrides, H. R. Zarandi, and D. K. Pradhan, “Matrix codes:
errors. In addition, FUEC–TAEC and FUEC–QUAEC have Multiple bit upsets tolerant method for SRAM memories,” in Proc.
22nd IEEE Int. Symp. Defect Fault Tolerance VLSI Syst., Sep. 2007,
demonstrated good features to correct 3-bit and 4-bit burst pp. 340–348.
errors, respectively. Thus, our codes become an appropriate [19] H. S. de Castro, J. A. N. da Silveira, A. A. P. Coelho, F. G. A. e Silva,
option for critical applications in embedded systems. Beyond P. D. S. Magalhães, and O. A. de Lima, “A correction code for multiple
cells upsets in memory devices for space applications,” in Proc. 14th
4-bit burst errors, the performance of our codes decreases IEEE Int. New Circuits Syst. Conf. (NEWCAS), Jun. 2016, pp. 1–4.
notably due to their low redundancy. If these errors are [20] S. Ahmad, M. Zahra, S. Z. Farooq, and A. Zafar, “Comparison of EDAC
expected to occur, more powerful ECCs must be employed. schemes for DDR memory in space applications,” in Proc. Int. Conf.
Aerosp. Sci. Eng. (ICASE), Aug. 2013, pp. 1–5.
In a future work, we want to continue developing ECCs, [21] D. E. Müller, “Application of Boolean algebra to switching circuit design
decreasing area, power, and delay overheads while maintain- and to error detection,” IRE Trans. Electron. Comput., vol. 3, no. 3,
ing, or even increasing, the code coverage. We want focus on pp. 6–12, 1954.
[22] J. M. Berger, “A note on error detection codes for asymmetric channels,”
long burst errors, which are expected to have more and more Inf. Control, vol. 4, pp. 68–73, Mar. 1961.
impact on space systems. [23] O. Goldreich and L. A. Levin, “A hard-core predicate for all one-
way functions,” in Proc. 21st ACM Symp. Theory Comput., 1989,
pp. 25–32.
R EFERENCES [24] M. Bossert, Channel Coding for Telecommunications. New York, NY,
USA: Wiley. 1999.
[1] (2013). The International Technology Roadmap for Semiconductors. [25] M. J. E. Golay, “Notes on digital coding,” Proc. IRE, vol. 37, p. 657,
[Online]. Available: http://www.itrs2.net/2013-itrs.html Jun. 1949.
[2] S. K. Kurinec and K. Iniewski, Nanoscale Semiconductor Memories: [26] N. B. Anne, U. Thirunavukkarasu, and S. Latifi, “Three and four-
Technology and Application. Boca Raton, FL, USA: CRC Press, 2014. dimensional parity-check codes for correction and detection of multiple
[3] J. Barak, M. Murat, and A. Akkerman, “SEU due to electrons in silicon errors,” in Proc. Int. Conf. Inf. Technol. Coding Comput. (ITCC),
devices with nanometric sensitive volumes and small critical charge,” Apr. 2004, pp. 837–842.
Nucl. Instrum. Methods Phys. Res. B, Beam Interact. Mater. At., vol. 287, [27] L. J. Saiz-Adalid, P. Gil, J.-C. Baraza-Calvo, J.-C. Ruiz,
pp. 113–119, Sep. 2012. D. Gil-Tomas, and J. Gracia-Moran, “Modified Hamming codes
[4] E. Ibe, H. Taniguchi, Y. Yahagi, K. Shimbo, and T. Toba, “Impact of to enhance short burst error detection in semiconductor memories,” in
scaling on neutron-induced soft error in SRAMs from a 250 nm to Proc. IEEE Eur. Dependable Comput. Conf. (EDCC), Newcastle, U.K.,
a 22 nm design rule,” IEEE Trans. Electron Devices, vol. 57, no. 7, May 2014, pp. 62–65.
pp. 1527–1538, Jul. 2010. [28] A. Sánchez-Macián, P. Reviriego, and J. A. Maestro, “Hamming SEC-
[5] G. Tsiligiannis et al., “Multiple cell upset classification in commercial DAED and extended Hamming SEC-DED-TAED codes through selec-
SRAMs,” IEEE Trans. Nucl. Sci., vol. 61, no. 4, pp. 1747–1754, tive shortening and bit placement,” IEEE Trans. Device Mater. Rel.,
Aug. 2014. vol. 14, no. 1, pp. 574–576, Mar. 2014.
[6] G. I. Zebrev, K. S. Zemtsov, R. G. Useinov, M. S. Gorbunov, [29] A. Neubauer, J. Freudenberger, and V. Kühn, Coding Theory: Algo-
V. V. Emeliyanov, and A. I. Ozerov, “Multiple cell upset cross-section rithms, Architectures and Applications. New York, NY, USA: Wiley,
uncertainty in nanoscale memories: Microdosimetric approach,” in Proc. 2007.
15th Eur. Conf. Radiat. Effects Compon. Syst. (RADECS), Sep. 2015, [30] F. Silva et al., “Evaluation of multiple bit upset tolerant codes for
pp. 1–5. NoCs buffering,” in Proc. IEEE 8th Latin Amer. Symp. Circuits
[7] N. Chechenin and M. Sajid, “Multiple cell upsets rate estimation for Syst. (LASCAS), Feb. 2017, pp. 1–4.
65 nm SRAM bit-cell in space radiation environment,” in Proc. 3rd Int. [31] L. J. Saiz-Adalid et al., “Flexible unequal error control codes with
Conf. Exhib. Satell. Space Missions, May 2017, p. 77. selectable error detection and correction levels,” in Proc. 32th Int. Conf.
[8] N. N. Mahatme, B. L. Bhuva, Y.-P. Fang, and A. S. Oates, “Impact of Comput. Safety, Rel. Secur. (SAFECOMP), Sep. 2013, pp. 178–189.
strained-Si PMOS transistors on SRAM soft error rates,” IEEE Trans. [32] K. A. LaBel. (Aug. 2009). Proton Single Event Effects (SEE) Guideline.
Nucl. Sci., vol. 59, no. 4, pp. 845–850, Aug. 2012. NASA Electronic Parts and Packaging (NEPP). [Online]. Available:
[9] Y. Bentoutou, “Program memories error detection and correction on- https://nepp.nasa.gov/files/18365/Proton_RHAGuide_NASAAug09.pdf
board earth observation satellites,” Int. J. Elect. Comput. Eng., vol. 4, [33] M. Greenberg. (2014). Reliability, Availability, and Service-
no. 6, pp. 933–936, 2010. ability (RAS) for DDR Dram Interfaces. [Online]. Available:
[10] M. J. Gadlage, A. H. Roach, A. R. Duncan, A. M. Williams, http://www.memcon.com/pdfs/proceedings2014/NET105.pdf
D. P. Bossev, and M. J. Kay, “ Multiple-cell upsets induced by sin- [34] S. Shamshiri and K.-T. Cheng, “Error-locality-aware linear coding to
gle high-energy electrons,” IEEE Trans. Nucl. Sci., vol. 65, no. 1, correct multi-bit upsets in SRAMs,” in Proc. IEEE Int. Test Conf. (ITC),
pp. 211–216, Jan. 2018, doi: 10.1109/TNS.2017.2756441. Nov. 2010, pp. 1–10.
[11] E. Fujiwara, Code Design for Dependable Systems: Theory and Prac- [35] Cadence: EDA Tools and IP for System Design Enablement. Accessed:
tical Applications. Hoboken, NJ, USA: Wiley, 2006. May 23, 2018. [Online]. Available: https://www.cadence.com
[12] R. W. Hamming, “Error detecting and error correcting codes,” Bell Syst. [36] J. E. Stine et al., “FreePDK: An open-source variation-aware design kit,”
Tech. J., vol. 29, no. 2, pp. 147–160, Apr. 1950. in Proc. IEEE Int. Conf. Microelectron. Syst. Educ. (MSE), Jun. 2007,
[13] C. L. Chen and M. Y. Hsiao, “Error-correcting codes for semiconductor pp. 173–174.
memory applications: A state-of-the-art review,” IBM J. Res. Develop., [37] NanGate FreePDK45 Open Cell Library. Accessed: May 28, 2018.
vol. 58, no. 2, pp. 124–134, Mar. 1984. [Online]. Available: http://www.nangate.com/?page_id=2325
[14] G. C. Cardarilli, M. Ottavi, S. Pontarelli, M. Re, and A. Salsano, “Fault [38] A. Mukati, “Review: A survey of memory error correcting techniques
tolerant solid state mass memory for space applications,” IEEE Trans. for improved reliability,” J. Netw. Comput. Appl., vol. 34, no. 2,
Aerosp. Electron. Syst., vol. 41, no. 4, pp. 1353–1372, Oct. 2005. pp. 517–522, Mar. 2011.
This article has been accepted for inclusion in a future issue of this journal. Content is final as presented, with the exception of pagination.

GRACIA-MORÁN et al.: IMPROVING ECCs FOR MCUs IN SPACE APPLICATIONS 11

Joaquín Gracia-Morán received the B.Sc., M.Sc., Daniel Gil-Tomás received the B.Sc. degree in elec-
and Ph.D. degrees in computer engineering from the trical and electronic physics and the Ph.D. degree
Universitat Politècnica de València (UPV), Valencia, in computer engineering from the Universitat de
Spain, in 1995, 1997, and 2004, respectively. València (UPV), Valencia, Spain, in 1985 and 1999,
He is currently an Associate Professor at the respectively.
Department of Computer Engineering, UPV. He is He is currently an Associate Professor at the
a member of the Fault-Tolerant Systems (STF) Department of Computer Engineering, UPV. He is
research line within the Instituto ITACA. His current a member of STF-ITACA. His current research
research interests include the design and implemen- interests include the design and validation of fault-
tation of digital systems, the design and validation tolerant systems, reliability physics, and the reliabil-
of STF, and VHDL-based fault injection. ity of emerging nanotechnologies.

Pedro J. Gil-Vicente (M’93) has taught courses


on computer technology, digital design, computer
networks, and fault-tolerant systems. He is currently
a Professor at the Universitat Politècnica de Valèn-
cia, Valencia, Spain, where he has been the Head
of the Department of Computer Engineering. He is
also the Head of the Fault-Tolerant Systems research
Luis J. Saiz-Adalid received the M.Sc. and Ph.D. line, within ITACA. He has authored more than
degrees in computer engineering from the Universi- 100 research papers on these subjects. His current
tat Politècnica de València (UPV), Valencia, Spain, research interests include the design and valida-
in 1995 and 2015, respectively. tion of real-time fault-tolerant distributed systems,
He was with IBM Valencia, Spain, (1995- dependability validation using fault injection, the design and verification of
Celestica, 1998) for 15 years. He is currently an embedded systems, and dependability and security benchmarking.
Associate Professor at the Department of Computer Prof. Gil-Vicente has also served as a Program Committee member for
Engineering, UPV. He is also a member of STF- the IEEE International Conference on Dependable Systems and Networks,
ITACA. His current research interests include the the European Dependable Computing Conference (EDCC) and the Latin
design and implementation of digital systems, design American Symposium on Dependable Computing and as a reviewer for the
and validation of fault-tolerant systems, and design international magazines and congresses related to dependability and security.
of error correction codes. He was the General Chair of the EDCC-8 Conference, in Valencia, in 2010.

Potrebbero piacerti anche