A Chaos-Based Complex Micro-Instruction Set For Mitigating Instruction Reverse Engineering

Md Sakib Hasan∗1 · Md Badruddoja Majumder1 · Aysha S. Shanta1 ·
Garrett S. Rose1
Abstract Chaos computing provides a large number of functions from a single piece of hardware. Large-scale reconfigurability can be achieved flexibly by tuning only a few parameters of a chaos-based computing system. Implementing reconfigurable complex functions with a single chaos circuit can alleviate area and power concerns due to decreasing technology nodes. It is possible to build a multi-input multi-output complex instruction set from the chaos-generated functionalities, in which operations are more uniform than in conventional implementations. The lack of uniformity in the implementation of instructions in traditional computing systems gives attackers an opportunity to reverse engineer them through side channel power analysis. In this paper, it is proposed that a chaos-based implementation of a complex instruction set is immune to classification-based reverse engineering attacks. Cross obfuscation and self obfuscation schemes are proposed which leverage the reconfigurability of the chaotic system to obfuscate the power profile of the instruction set and make it immune to reverse engineering attacks. The design implements 3-input multi-output instructions using a single chaotic iterative map. We analyzed the immunity of this design against classification-based reverse engineering attacks for six different classification algorithms with five dimensionality reduction techniques.

Keywords chaos computing, side channel, power profile, obfuscation, instruction classification, hardware security

∗ Md Sakib Hasan, 343 Min H. Kao Building, 1520 Middle Drive, Knoxville, TN 37996-2250 USA. Tel.: +1-865-974-0229. E-mail: mhasan4@utk.edu
1 Department of Electrical Engineering and Computer Science, The University of Tennessee, Knoxville, TN, USA. E-mail: {mmajumde,ashanta1,garose}@utk.edu

1 Introduction

Since Lorenz's discovery of chaotic motion on a strange attractor in 1963 [27], chaos has attracted a lot of attention in areas such as chemistry, physics, biology, ecology and financial systems [35]. Many natural and engineering systems have been modeled using dynamical systems [10, 26]. Over the years, non-linear dynamics in chaotic systems has become an active field of research due to advancements in chaotic neural networks and chaos communications [1, 16, 25]. Applications of chaos are significant in the field of engineering, especially in cryptography, secure communication, plasma technologies and lasers [2, 13, 18].

Computing based on the chaotic progression of state in non-linear circuits, also known as chaos computing, has been an exciting research area. Researchers have worked on various aspects of chaos computing, exploring reconfigurability, flexibility and security in non-linear systems [5, 7, 22]. Chaos-based computation allows a single piece of hardware to perform a large number of operations by changing a small set of parameters in the circuit topology. A single operation can be performed in many ways by changing the threshold voltage, control bits, initial state, iteration number and bifurcation parameter.

Researchers have proposed different chaos-based implementations of basic logic gates. Functions with more inputs can also be implemented using chaos, in addition to basic 2-input logic functions [21]. Chaos computing demonstrates promise in the field of secure and confidential computing. It has been proposed as a means of obfuscating the power profile and hence mitigating power analysis based side channel attacks [28, 31].

In this work, a new chaos-based design is proposed for implementing a complex micro-instruction set comprising 3-input multi-output (1 to 8 outputs) digital operations. Since each instruction in a traditional instruction set has a distinguishing power signature, instructions can be classified with high accuracy using standard classification algorithms. In such an attack, training power signatures are collected from a reference machine and used to reverse engineer instructions on other machines. This work shows that by leveraging different configurations of the chaos operation, each machine can perform instructions with a unique power profile. Moreover, a uniform implementation of the instruction set is also proposed, in which each operation exhibits a very similar power signature and cannot be distinguished. It has been demonstrated that both of these methods can help mitigate side channel reverse engineering attacks performed using different classification and dimension reduction algorithms.

The paper is organized as follows. Section 2 provides background and necessary details about chaos computing and its application to reconfigurable and flexible logic operations. The design and working principles of a chaotic map circuit are described in Section 3. Section 4 describes the design of the complex micro-instruction set. Section 5 goes through the classification algorithms and dimensionality reduction techniques used to reverse engineer instructions from side channel power signatures. Classification results for a traditional CMOS-based implementation of the instruction set are provided in Section 6. The two proposed defense models, along with their corresponding results, are explained in Sections 7 and 8, respectively. Section 9 discusses limitations and possible future directions for this research work. Finally, the paper is concluded in Section 10.

2 Chaos Computing Preliminaries

Despite the tremendous success of digital systems, there are areas where they do not meet the demands and specifications of current applications. Traditional computer systems are built from Boolean circuits which contain switches (transistors). The transistors open and close based on the input applied to their gates. Different circuit topologies need to be designed in CMOS technology to implement different logic functions [21]. Hence, digital systems require millions of transistors, which may cause problems such as excessive power consumption and heat production in a chip. As a solution to this problem, non-linear dynamical systems are studied which can generate multiple logic functions using the same design.

In chaotic systems, the non-linear dynamics of CMOS circuits and their intrinsic computational capability are being explored. The circuit maps the initial state of the circuit to future states. The dynamical system can evolve in continuous time or in discrete time. Continuous-time chaotic systems have high complexity and low efficiency compared to discrete-time chaotic systems. In order to design a discrete-time chaotic map, the output of the map is connected to the input of the circuit, creating a feedback path [23].

A disadvantage of chaotic logic gates is that they require more hardware than standard CMOS logic gates. In order to overcome this limitation, each chaotic gate should be able to generate increased functionality. The number of functions that a single chaotic circuit can implement increases exponentially with the number of iterations. The functions generated by the chaotic circuit can be dynamically chosen to implement different logic functions in each clock cycle. The chaotic system is able to exhibit different behaviors by changing the initial state of the circuit or by changing the circuit parameters. In practice, not all of the functions may be accessible or usable due to noise or instability [22].

Chua's circuit is one of the most popular chaotic oscillators. It uses an element called the "Chua diode" (piecewise-linear) to implement the non-linearity of dynamical systems. Several approaches such as "cubic-like" non-linearity or cubic non-linearity have been discovered to replace the Chua diode [33]. Previously, arithmetic operations were performed using arrays of chaotic elements. Chaotic systems are now used to implement logic gates which are capable of implementing $AB$, $A + B$, $\overline{AB}$, $\overline{A + B}$, $A \oplus B$, $\overline{A \oplus B}$, ON and OFF. The one-dimensional system is able to implement only eight of the 16 possible functions because the initial state is a function of the sum of the inputs [7]. This problem can be resolved by assigning each of the inputs to its own state variable.

Chaotic circuits are made reconfigurable and flexible by changing parameters in the circuit topology. In 1998, Sinha et al. proposed that chaotic systems can be used to build computers [32]. Murali et al. used a thresholding mechanism to implement a NOR gate with a continuous-time chaotic system [30]. Rizk et al. also applied a threshold technique to Chua's circuit in order to obtain functions such as AND, OR, NAND, NOR and NOT. Chua's circuit has been used to design a flip-flop, which is a building block for memory devices [4]. Rose constructed a chaos-based arithmetic logic unit where different functions are selected by altering the control input and iteration number [31]. Bohl
et al. proposed the use of chaos in the field of security for logic obfuscation and mitigation of power analysis based side channel attacks. The chaotic power profile of the logic computations helps to eliminate side channel attacks [3]. Chaos computing is still a nascent field with plenty of opportunities for innovative ideas and exciting research. This work proposes the use of 3-input multi-output logic gates to mitigate power analysis based side channel attacks on micro-instructions, and evaluates the design against various classification algorithms.

3 Chaotic Map Circuit

If a non-linear function $f$ transforms any point $x_n$ from a closed interval $l = [a, b]$ into some point $x_{n+1}$ in the same interval, i.e. $f : l \to l$, then

$$x_{n+1} = f(x_n)$$

is called a discrete map of the interval $l$ for $n = 1, 2, \ldots$. A one-dimensional non-linear map is the easiest way to generate chaotic signals in discrete time. Generally, map circuits are designed to imitate one of the widely known chaos maps such as a logistic map or a tent map. However, any differentiable unimodal function can portray the same qualitative behavior. Feigenbaum has studied this universality of the iterated map's route from periodicity towards chaos in great detail [11, 12]. Chaos maps are also achievable with V-shaped functions similar to an inverted tent map, where the branches are defined by different equations.

In this paper, the three-transistor map circuit shown in Fig. 1 is used to generate chaotic signals in discrete time. This paper uses a scaled-down implementation of a well-studied chaotic map circuit topology [9, 20, 22] in a 65 nm process. The map characteristics can be altered by changing the bias voltage, $V_c$, which is also the bifurcation parameter of the map circuit. The transistors in the map circuit were sized as follows: top n-MOS = 1.2 µm/60 nm, bottom n-MOS = 0.12 µm/60 nm and p-MOS = 0.12 µm/60 nm. The transistor sizes were chosen after experimenting with several different geometries and comparing their performance with regard to area, power, delay and the width of the chaotic region in the bifurcation diagram. The DC characteristics of the map circuit for different values of $V_c$ are shown in Fig. 2. As can be seen from the figure, the output is a V-shaped curve, which is the property of the map circuit that is desired for generating chaos.

Fig. 1: Chaos circuit for implementing 3-input logic operations.

Fig. 2: Transfer curve of map circuit.

The bifurcation diagram in Fig. 3 displays a thousand iterations of the steady state output values against the bifurcation parameter, $V_c$. For values of $V_c$ < 520 mV, the output has a stable period of 2, i.e. it oscillates between two values. At $V_c$ = 520 mV, the first bifurcation occurs, resulting in period doubling. The next period doubling occurs at 637 mV, resulting in a stable orbit of 8 cycles. Period doublings then occur at progressively shorter intervals and, as $V_c$ approaches 642 mV, the circuit enters the chaotic region and starts oscillating aperiodically. Three distinct chaotic regions are clearly visible in this figure, with periodic windows in between. The evolution of the output voltage for four values of the bifurcation parameter is shown in Fig. 4. Here, the iteration number, $n$, is shown on the x-axis and the corresponding output voltage, $v_n$, is shown on the y-axis for 60 iterations. The iterated outputs are discrete points indicated by red dots, which are joined by straight lines to make the sequence clearly visible.

Fig. 3: Bifurcation diagram of the chaotic oscillator.

Fig. 4: Evolution through successive iterations for different values of bifurcation parameter.

In order to implement a chaos generator, two sample-and-hold (S/H) circuits are required along with the map circuit, as shown in Fig. 1. A two-phase clock is used, where the output of the map circuit is sampled onto capacitor $C_1$ in the first phase, $\phi_1$. In the second phase, $\phi_2$, the output is transferred to the second capacitor, $C_2$, where it is held at the input of the map circuit for the next iteration [9]. A 9-bit digital-to-analog converter (DAC) is used to convert the binary input signal, made of 3 data (D) bits and 6 control (C) bits, into the analog input of the map. After a certain number of iterations ($\phi_s$), the sampled output is compared against a threshold voltage, $\delta$, using a comparator, and a binary output, O, is generated. In addition to $V_c$ and C, the functionality also depends on the choice of $\phi_s$ and $\delta$.

The worst-case delay occurs when the input goes from VDD (1.2 V) to GND (0 V). It can be seen in the semi-logarithmic plot in Fig. 5 that the worst-case delay increases very fast as the value of $V_c$ increases. So, it is prudent not to use the third chaotic region above $V_c$ = 890 mV, where the delay is very high. Implementing logical functions requires several iterations, so the design has been constrained to the first two chaotic regions, where the delay is relatively small. The feedback circuit, along with the DAC and the comparator used in this work, is modeled in Verilog-A and integrated with the 3-transistor non-linear circuit using Cadence Spectre for simulation.

Fig. 5: Worst-case delay changes with bifurcation parameter, $V_c$.

4 Complex Micro-Instruction Set Design

The input-output data for different values of $V_c$ was obtained from Cadence Spectre. A MATLAB script was developed to explore the large design space spanned by the four control parameters and was used to set the configuration for all the instructions. In order to keep the delay within a reasonable time, the analysis has been constrained to 21 iterations. Multi-output functions like ADD, SUB, decoder and encoder require a common pair of ($V_c$, C) for all the output functions. For example, in order to design a decoder, we need 8 possible 3-input functions with the same $V_c$ and C along with eight different values of $\delta$ and $\phi_s$. As a result, it is possible to use a single chaotic map circuit to get multiple functionalities. Schematic diagrams of a 1-bit full adder and a 3 × 8 decoder are shown in Fig. 6 and Fig. 7, respectively. An example of the evolution of analog values for the different possible inputs of a 3-input AND configuration is shown in Table 1.

Fig. 6: Schematic for implementation of a chaos-based full adder circuit.

Table 1: Evolution of chaotic output with iterations ($V_c$ = 690 mV, C = (111010)_2 = 58, $\delta$ = 1.03 V, $\phi_s$ = 5).

x_n (V)        y (V)
               i=1    i=2    i=3    i=4    i=5
0.136 (000)    1.2    0.52   0.36   1.06   0.43
0.287 (001)    1.19   0.51   0.39   0.91   0.35
0.437 (010)    0.72   0.27   1.19   0.51   0.38
0.587 (011)    0.24   1.2    0.52   0.37   1.02
0.737 (100)    0.27   1.19   0.51   0.38   0.96
0.888 (101)    0.34   1.13   0.47   0.56   0.25
1.038 (110)    0.42   0.79   0.3    1.18   0.51
1.188 (111)    0.51   0.39   0.93   0.36   1.04
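As a rough illustration of the read-out described in Sections 3 and 4, the following Python sketch encodes the 3 data bits and 6 control bits into an initial voltage, iterates a map, and thresholds the result. Several parts are assumptions rather than the authors' implementation: `dac_voltage` assumes an ideal 9-bit DAC with the data bits as the most significant bits and code 511 mapped to VDD (a convention that happens to agree with the x_n column of Table 1 to within about 1 mV), and `v_map` is a hypothetical inverted-tent stand-in for the transistor-level transfer curve, which in the paper comes from circuit simulation. The thresholding step uses the i = 5 column of Table 1 directly.

```python
VDD = 1.2  # supply voltage in volts, as stated in Section 3


def dac_voltage(data, control, vdd=VDD):
    """Assumed ideal 9-bit DAC: 3 data bits (MSBs), 6 control bits (LSBs)."""
    code = (data << 6) | control
    return vdd * code / 511.0


def v_map(x, vdd=VDD):
    """Hypothetical V-shaped map; NOT the simulated 3-transistor circuit."""
    return vdd * abs(1.0 - 2.0 * x / vdd)


def iterate_map(f, x0, n):
    """Feed the map output back to its input n times (the S/H feedback loop)."""
    x = x0
    for _ in range(n):
        x = f(x)
    return x


def threshold(voltage, delta):
    """Comparator: binary output O is 1 when the sampled voltage exceeds delta."""
    return 1 if voltage > delta else 0


# Values copied from Table 1 (3-input AND configuration).
TABLE1_X = [0.136, 0.287, 0.437, 0.587, 0.737, 0.888, 1.038, 1.188]
TABLE1_Y5 = [0.43, 0.35, 0.38, 1.02, 0.96, 0.25, 0.51, 1.04]
CONTROL = 0b111010  # C = 58
DELTA = 1.03        # threshold voltage in volts

initial_states = [dac_voltage(d, CONTROL) for d in range(8)]
outputs = [threshold(y, DELTA) for y in TABLE1_Y5]
and_truth = [a & b & c for a in (0, 1) for b in (0, 1) for c in (0, 1)]

if __name__ == "__main__":
    print([round(v, 3) for v in initial_states])
    print(outputs)
```

With the Table 1 data, only input 111 ends above the threshold after five iterations, which is the 3-input AND truth table. Choosing a different $\delta$ or $\phi_s$ for the same ($V_c$, C) pair selects a different Boolean function from the same circuit.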
Fig. 7: Schematic for implementation of 8 Boolean functions of a 3 × 8 decoder using a single chaos circuit.

5 Adversarial Model

It is assumed that an attacker can gather a set of labeled power traces and that this information can be used to classify unknown instructions. The attack model is a supervised learning problem using multi-class (11 classes in this work) classification. This work tries to tackle the problem of power analysis based side channel attacks using several methods. The general procedure is described in the following subsections.

5.1 Data Collection

At first, 2000 observations were collected for each of the 11 instructions, namely AND, OR, XOR, NAND, NOR, XNOR, ADD, SUB, MUX, DEC (decoder) and ENC (encoder). ADD and SUB have 3 inputs and 2 outputs, DEC has 3 inputs and 8 outputs and ENC has 3 inputs and 2 outputs. The remaining seven instructions have 3 inputs and 1 output. The collected data has been partitioned into training and testing sets. 1600 (80%) observations have been used to train the classifier, and the classifier is then used to predict the labels of the remaining 400 (20%) test observations. This partitioning of data has been used in all the classification results shown in this paper.

5.2 Dimensionality Reduction

As the amount of data gathered for the different classes grows, the curse of dimensionality makes it very difficult to work with the entire feature space. Dimensionality reduction techniques are feature selection algorithms used to compress data while preserving the variance of the original data as much as possible. In the literature, several dimensionality reduction methods have been proposed. In this paper, we have implemented five techniques, including two of the most common, namely Principal Components Analysis (PCA) and Fisher's Linear Discriminant Analysis (FLDA). In addition, we have implemented three other methods: Sum of Difference of Means (SDM), Means-PCA and Means-Variance (MV).

5.2.1 Principal Components Analysis (PCA)

Principal Components Analysis (PCA) tries to reduce the dimensionality of the data while retaining as much of its variance as possible. This is achieved by projecting the data orthogonally onto a lower dimensional subspace. This lower dimensional subspace can be defined by a D-dimensional unit vector, $u_1$. The projection of each observation, $x_n$, onto this subspace is given by $u_1^T \cdot x_n$. If all the observations are stacked up into a matrix, the projection of each row of the matrix can be represented as $U^T X$, where $U$ is a matrix consisting of eigenvectors of the covariance matrix, $\sigma$. The projection of the observations into a D-dimensional subspace that maximizes the projected variance is given by the D eigenvectors $u_1, \ldots, u_D$ with the D largest eigenvalues $\lambda_1, \ldots, \lambda_D$ [34]. The effectiveness of PCA depends on the number of reduced dimensions and on the nature of the analyzed data. In this work, the first few principal components contain most of the variance of the features, as shown in Fig. 8. When using PCA, the dimensionality of the problem has been reduced to the first 30 principal components in all cases, since most of the information is retained in the reduced data.

Fig. 8: Percentage of total variance in the first few principal components.

5.2.2 Means-PCA

PCA maximizes the overall variance of class observations but does not take the variance between classes into account. A reasonable choice is to maximize the variance of inter-class observations, since moving the class means apart may result in a higher classification rate. Here the class means are considered as instances and the projection coefficients are computed using the techniques discussed in Section 5.2.1. These projection coefficients are then used to transform the observations. In this method, the number of reduced dimensions is $K - 1$, where $K$ is the number of classes.

5.2.3 Sum of Difference of Means (SDM)

In the past, differential power has been used to correlate the information leakage of a device with its power consumption [24]. In this work, this method has been used to reduce the dimensionality of the instruction-level traces. The method of computing the new dimension, D, from the original dimension, L, is as follows. First, the absolute difference between each pair of mean vectors is calculated. Then these differences are summed and, finally, the first D points among the highest peaks are chosen. The result for the training data set used in cross-obfuscation classification is shown in Fig. 9. The first 30 highest peaks are chosen in all cases so that the results are consistent with PCA.

Fig. 9: Sum of difference of means among the instruction sets' power profiles from the training data set used for cross-obfuscation classification.

5.2.4 Means-Variance (MV)

Means-variance is another method of dimensionality reduction motivated by the same underlying principle as SDM, described in Section 5.2.3. For a dimensionality reduction technique geared towards multi-class classification, a reasonable choice is to take the feature points accounting for the maximum variance of the original data across the different classes. In order to identify these points, the mean of each class, $\mu_k$, is needed, where $1 \le k \le K$. If the mean values are put into a matrix (with the k-th row being the mean of the k-th class), a $K \times L$ matrix is created, where L is the dimension of the original data. The variance of each column is the inter-class variance of each feature point. Finally, the dimension is reduced by taking the D columns with the highest variance.

5.2.5 Fisher's Linear Discriminant Analysis (FLDA)

Fisher's Linear Discriminant Analysis (FLDA) is an approach used in pattern recognition to find a linear combination of features which characterizes two or more classes of observations [8, 14]. The resulting combination may be used for dimensionality reduction before classification. However, instead of maximizing the variance of the intra-class data like PCA, information regarding the covariance of different classes is taken into consideration. These are the between-class and within-class covariance matrices. If $N$ L-dimensional observations are considered for each class $C$, then the within-class covariance, $\sigma_W$, and the between-class covariance, $\sigma_B$, are computed as:

$$\sigma_W = \sum_{i=1}^{k} N_i \sigma_i \qquad (1)$$

and

$$\sigma_B = \sum_{i=1}^{k} (\mu_i - \mu)(\mu_i - \mu)^T \qquad (2)$$

where $N_i$, $k$, $\mu$, $\sigma_i$ and $\mu_i$ are the number of observations, the number of classes, the global mean, the covariance and the mean of the power trace for each class $C_i$, respectively. A D-dimensional unit vector $u_1$ is considered onto which the data is projected. This time the objective is to maximize the projected between-class covariance relative to the projected within-class covariance. It has been shown in [8] that $u_1$ has to be the eigenvector of $\sigma_W^{-1}\sigma_B$ for maximization of projections. The D-dimensional subspace is created by the first D eigenvectors $u_1, \ldots, u_D$ of $\sigma_W^{-1}\sigma_B$ with the largest eigenvalues $\lambda_1, \ldots, \lambda_D$. The maximum value of D can be $K - 1$, which is used in this work.
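The two selection-based reductions of Section 5.2, SDM (Section 5.2.3) and MV (Section 5.2.4), both rank the sample points of a trace using only the class mean vectors. The sketch below is a minimal pure-Python illustration under stated assumptions: the function names and the toy traces are hypothetical, and "highest peaks" is simplified to "highest scores" (no peak detection is performed), which may differ from the authors' procedure.

```python
from itertools import combinations
from statistics import pvariance


def class_means(traces_by_class):
    """Map each class label to its mean trace (column-wise average)."""
    return {
        label: [sum(col) / len(col) for col in zip(*traces)]
        for label, traces in traces_by_class.items()
    }


def sdm_scores(means):
    """SDM: sum of absolute differences between every pair of class means."""
    vectors = list(means.values())
    length = len(vectors[0])
    return [
        sum(abs(a[j] - b[j]) for a, b in combinations(vectors, 2))
        for j in range(length)
    ]


def mv_scores(means):
    """MV: variance of each column of the K x L matrix of class means."""
    return [pvariance(col) for col in zip(*means.values())]


def top_indices(scores, d):
    """Indices of the d sample points with the highest scores, in trace order."""
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:d])


# Toy data (hypothetical): 3 classes, 2 traces each, 4 sample points per trace.
toy = {
    "AND": [[1.0, 0.0, 5.0, 2.0], [1.0, 0.0, 5.0, 2.0]],
    "OR": [[1.0, 3.0, 5.1, 2.0], [1.0, 3.0, 5.1, 2.0]],
    "XOR": [[1.0, 6.0, 5.0, 2.0], [1.0, 6.0, 5.0, 2.0]],
}
means = class_means(toy)
sdm_selected = top_indices(sdm_scores(means), 2)
mv_selected = top_indices(mv_scores(means), 2)

if __name__ == "__main__":
    print(sdm_selected, mv_selected)
```

In the toy data only sample points 1 and 2 differ between classes, so both criteria keep those two columns. In the paper the same idea is applied with D = 30 for SDM so that the reduced dimension matches the one used for PCA.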
5.3 Instruction Classification

After the templates are created using dimensionality reduction techniques, the next step is to use a classification method to classify the test data and determine its accuracy. In a supervised learning setting, the training data is an ordered pair (x, y), where x is an instance and y is its class label. The goal of the algorithm is to assign a class to a given instance x. Many different classifiers are used in machine-learning problems, and their relative superiority depends on the speed, implementation cost, accuracy and, most importantly, the nature of the problem. In this subsection, we briefly discuss the classification algorithms used in this work.

5.3.1 k-Nearest Neighbors Algorithm (kNN)

kNN is a non-parametric, lazy, supervised learning algorithm. The algorithm is called non-parametric because it does not make assumptions about the data, and data generalization is not needed. In this algorithm, training means storing the training data along with their class labels. During classification, the classifier computes the distance between the query instance x and every training instance in X. It then keeps the k closest training instances, where k ≥ 1. The class that is most common among these instances is assigned to x. In kNN there are two major design choices: (a) the value of k and (b) the distance function. The most common distance function used in kNN is the Euclidean distance function [36] [6]. The Euclidean distance, $d_{euc}$, between two instances $x^1$ and $x^2$ is computed as:

$$d_{euc}(x^1, x^2) = \sqrt{\sum_{i=1}^{L} (x^1_i - x^2_i)^2}, \qquad (3)$$

where $x^1$ and $x^2$ have L features and $x^1_i$ and $x^2_i$ are the i-th sample points.

We have used two more distance functions, namely the correlation and cosine distance functions. Correlation distance is measured as:

$$d_{cor}(x^1, x^2) = \frac{\frac{1}{L}\sum_{i=1}^{L} x^1_i x^2_i - \frac{1}{L^2}\left(\sum_{i=1}^{L} x^1_i\right)\left(\sum_{i=1}^{L} x^2_i\right)}{\sigma_{x^1}\sigma_{x^2}}. \qquad (4)$$

The cosine distance between two points $x^1$ and $x^2$ can be defined as:

$$d_{cos}(x^1, x^2) = \frac{x^1 \cdot (x^2)^T}{\sqrt{\sum_{i=1}^{L} (x^1_i)^2 \sum_{i=1}^{L} (x^2_i)^2}}. \qquad (5)$$

5.3.2 Support Vector Machine (SVM)

A support vector machine, or SVM [19], is a supervised learning algorithm primarily used for classification. Given a set of training examples, each marked as belonging to one or the other of two categories, an SVM training algorithm builds a model that assigns new examples to the appropriate category, making it a non-probabilistic binary classifier. An SVM model is a representation of examples as points in space, mapped in such a way that the examples of separate categories are divided by a wide gap. New examples are mapped into the same space and predicted to belong to a category depending on which side of the gap they fall on. SVM can be used as a non-linear classifier by using suitable kernel functions, e.g. the Gaussian radial basis function. The standard SVM supports only binary classification, but it can be extended by transforming multi-class classification into multiple binary classification problems [17]. Depending on the application, different sets of binary classifiers such as 'onevsone', 'onevsall', 'binary complete' and 'denser random' are used in practice. In this work, we have reported the results for the 'onevsall' and 'onevsone' techniques. For a K-way multiclass problem, 'onevsall' and 'onevsone' train $K$ and $\frac{K(K-1)}{2}$ binary classifiers, respectively.

5.3.3 Decision Tree (DT)

A decision/classification tree is a simple representation for classifying examples. Each internal node of the tree is labeled with an input feature. The arcs coming from a node are labeled with each of the possible values of the output feature, or the arc leads to a subordinate decision node on a different input feature. Each leaf of the tree is labeled with a class or a probability distribution over the classes. In order to predict a response, one has to follow the decisions in the tree from the root node down to a leaf node, which contains the result of classification. In this work we have used a decision tree classifier with the 'onevsall' and 'onevsone' techniques.

5.3.4 Discriminant Analysis (DA)

Discriminant analysis creates a linear combination of features that characterizes or separates two or more classes. It is often used for dimensionality reduction, as described in Section 5.2.5. For two classes, DA approaches the problem by assuming that the conditional probability density functions $p(x|y = 0)$ and $p(x|y = 1)$ are both normally distributed with mean and covariance parameters $\mu_0, \Sigma_0$ and $\mu_1, \Sigma_1$, respectively. In this work, we have used Linear Discriminant Analysis based on the assumption of homoscedasticity, i.e. $\Sigma_0 = \Sigma_1$, and the assumption that the covariances have full rank. In this paper, Linear Discriminant Analysis leverages the 'onevsall' and 'onevsone' techniques.

5.3.5 Naive Bayes (NB)

Naive Bayes is a conditional probability model based on Bayes' theorem with additional simplifying assumptions. Given a problem instance to be classified, represented by a vector $x = (x_1, \ldots, x_n)$ which has n features (independent variables), it assigns to the instance probabilities $p(C_k | x_1, \ldots, x_n)$ for each of k possible outcomes or classes and labels the data as belonging to the class with the highest probability. Combining Bayes' theorem with a very simplistic conditional independence assumption, the problem boils down to determining the value of $p(C_k) \prod_{i=1}^{n} p(x_i | C_k)$ for each class and then choosing the class with the maximum value. In this work, the prior probability is estimated from the training set and the features are assumed to follow a Gaussian distribution.

5.3.6 Multivariate Gaussian Probability Density Function

Given $\mu_k$ and $\sigma_k$ of each instruction, classification is performed as follows. Let W be the power consumption waveform captured at runtime, and assume that its samples are drawn from a multivariate Gaussian (normal) distribution model [15]. The noise introduced into the power waveform, W, is extracted by subtracting the mean value from the waveform as in

$$n_k = (W[1] - \mu_k[1]), (W[2] - \mu_k[2]), \ldots, (W[p] - \mu_k[p]) \qquad (6)$$

where $\mu_k$ is the mean of instruction $I_k$ and p is the number of selected features after the original dimensionality is reduced. The probability of observing the noise, $n_k$, in the device's power trace is then computed as:

$$N(n_k, (\mu_k, \sigma_k)) = \frac{1}{(2\pi)^{D/2}} \exp\left(-\frac{1}{2} (n_k)\, \sigma_k^{-1} (n_k)^T\right). \qquad (7)$$

The instruction whose template generates the highest probability of observing the noise, $n_k$, is classified as the correct instruction.

6 CMOS Classification

A traditional CMOS-based implementation of the instruction set is vulnerable to power-based classification attacks. Each instruction exhibits a distinguishable power signature. Therefore, instructions can be classified using a sufficient amount of power data collected by applying different random operands. As demonstrated in previous work, instructions implemented on a traditional CMOS-based processor can be classified with high accuracy [28, 29].

This work shows the classification of a traditional CMOS-based implementation of the instruction set, performed using all the classification algorithms described earlier. For each classification algorithm, several dimension reduction techniques are used. The data has also been analyzed with no reduction technique applied. Classification results for the discussed techniques are tabulated in Table 2. The best classification accuracy was achieved by 1-NN with cosine distance using all the sample points of the dataset. A confusion matrix showing the detailed result for each instruction with this classifier is shown in Table 3. The overall accuracy is 94.2%, which is very close to the ideal value of 100%.

7 Cross Obfuscation

As shown in Section 6, a traditional implementation of instructions in a processor can be accurately profiled based on its power trace. Instruction power profiles can be successfully used to reverse engineer instructions on any other machine using the classification algorithms described earlier. However, with chaos-based computing, functionality can be chosen from a large space where a single function can be implemented using different configurations, each with a unique power signature. Consequently, profiling instructions based on the power signatures from a reference machine is not sufficient to classify instructions on other machines. This idea was first proposed in [28], where seven two-input ALU instructions were implemented using three basic chaotic logic gates, AND, OR and XOR. A sequence of $V_c$ values along with a variable number of iterations, $\phi_s$, was used with no control bits and a fixed threshold.

In this work, the same technique has been extended to prevent classification attacks among eleven three-input multi-output operations, each implemented with a single chaos-based logic gate. The control operation has been simplified by using a single $V_c$ (chosen inside the chaotic region in Fig. 3) for a particular operation, with a variable threshold, $\delta$, and a 6-bit control input, C, for expanding the design space.

Five different sets of configurations are chosen in the chaos circuit for obtaining all the instructions in the instruction set. As already described, 4 different parameters of the chaos circuit comprise the configuration for each operation. The parameters are bias voltage, $V_c$,
Table 2: Classification accuracy among instructions using different classifiers and dimensionality reduction algorithms for CMOS implementation.
Category Sub-category Dimensionality Reduction Algorithm

No Reduction PCA PCAmean SDM MV FLDA
KNN 1-NN(euc) 92.27 92.09 91.11 80.5 78.72 86.36
3-NN(euc) 91.84 91.80 90.84 79.2 77.96 87.04
5-NN(euc) 91.44 91.45 90.57 79.01 78.23 87.35
1-NN(corr) 85.31 93.84 92.05 20.5 19.32 85.47
3-NN(corr) 93.35 93.39 91.9 54.33 52.01 86.4
5-NN(corr) 93.07 92.94 91.39 70.24 67.69 86.05
1-NN(cos) 94.2 93.89 92.23 79.82 76.98 86.05
3-NN(cos) 93.57 93.27 91.89 79.29 76.89 87.02
5-NN(cos) 93.17 92.9 91.44 78.98 76.79 86.77
SVM onevsall 82.7 80.66 73.5 58.3 57 82.47
allpairs 90.55 89.14 84.23 78.23 74.61 85.86
DT onevsall 88.61 86.55 85.98 75.2 75.31 83.73
allpairs 91.84 89.91 88.82 77.14 77.52 85.16
LDA onevsall 78.86 70.95 66 65.34 63.57 78.93
allpairs 93.95 92.23 83.55 81.91 81.1 81.73
NB onevsall 43.59 69.93 77.61 52.27 54 82.52
allpairs 63.95 78.02 78.39 52.52 53.98 82.07
MVG N/A 71.22 70.27 94.13 91.61 91.11 87.16
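The template matching behind the MVG row of Table 2 (Eqs. (6) and (7)) can be sketched as follows. This is a minimal illustration rather than the authors' implementation: it assumes a diagonal covariance so that no matrix inversion is needed, it works with log-probabilities, and it includes the usual variance-dependent normalisation term of a Gaussian density. The function names and the toy traces are hypothetical.

```python
import math

def fit_templates(traces_by_instr):
    """Estimate a per-feature mean and variance (the template) per instruction.

    traces_by_instr maps an instruction name to a list of equal-length
    feature vectors (reduced power traces). A diagonal covariance is an
    assumption of this sketch, not something stated in the paper.
    """
    templates = {}
    for name, traces in traces_by_instr.items():
        n, p = len(traces), len(traces[0])
        mean = [sum(t[j] for t in traces) / n for j in range(p)]
        var = [max(sum((t[j] - mean[j]) ** 2 for t in traces) / n, 1e-12)
               for j in range(p)]
        templates[name] = (mean, var)
    return templates

def log_likelihood(w, mean, var):
    """Log of the Gaussian density of the noise n_k = W - mu_k."""
    ll = 0.0
    for wj, mj, vj in zip(w, mean, var):
        nj = wj - mj  # Eq. (6): one component of the noise vector
        ll += -0.5 * (math.log(2 * math.pi * vj) + nj * nj / vj)
    return ll

def classify(w, templates):
    """Return the instruction whose template gives the highest likelihood."""
    return max(templates, key=lambda k: log_likelihood(w, *templates[k]))

# Toy, made-up traces with p = 3 features each (not data from the paper).
training = {
    "AND": [[1.0, 2.0, 3.0], [1.1, 2.1, 2.9], [0.9, 1.9, 3.1]],
    "XOR": [[3.0, 1.0, 0.5], [3.1, 0.9, 0.6], [2.9, 1.1, 0.4]],
}
templates = fit_templates(training)
print(classify([1.05, 2.0, 3.0], templates))
print(classify([3.0, 1.0, 0.5], templates))
```

Working in the log domain avoids numerical underflow when p is large; the class ranking is unchanged because the logarithm is monotonic.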
Table 3: Confusion matrix of classification accuracy for different instructions in CMOS implementation. Rows and
columns represent the test instruction and percentage of their matched class respectively.
Instruction Matched Class (%)

AND OR XOR NAND NOR XNOR ADD SUB MUX DEC ENC
AND 94.75 1.75 0.25 2.5 0 0 0 0 0 0 0.75
OR 1.25 82.75 0 7.75 1.25 0 0 0 1 1.25 4.75
XOR 0.25 0 95.5 0 0 4.25 0 0 0 0 0
NAND 1.75 5.5 0 91.5 0.5 0 0 0 0.5 0 0.25
NOR 0 0 0 0 96.2 0 0 0 0.25 0 3.5
XNOR 0 0.25 2 0 0 97.25 0 0 0 0.25 0.25
ADD 0 0 0.75 0 0 0 94 5.25 0 0 0
SUB 0 0 0 0 0 0 1.5 98.25 0.25 0 0
MUX 0 0.25 0 0 0 0 0 0 98.25 0 1.5
DEC 0 0 0 0 0 0 0 0 0.25 99.75 0
ENC 0 0.75 0 1 6.75 0 0 0 2 1.5 88
Table 4: Different configurations for instruction set used in cross-obfuscation
Operation Configurations
Config.1 Config.2 Config.3 Config.4 Config.5
Vc C δ φs Vc C δ φs Vc C δ φs Vc C δ φs Vc C δ φs
AND 0.69 58 1.02 5 0.74 63 1.17 3 0.62 0 0.69 4 0.69 58 1.02 9 0.72 41 1 9
OR 0.71 40 0.3 3 0.65 26 0.3 5 0.62 46 0.27 8 0.63 24 0.27 8 0.74 49 0.24 3
XOR 0.65 28 0.42 5 0.67 41 0.51 7 0.73 63 0.5 15 0.71 23 0.61 15 0.69 54 0.96 15
NAND 0.74 63 0.32 5 0.66 61 0.34 4 0.62 0 0.4 3 0.65 39 0.38 3 0.7 8 0.25 3
NOR 0.63 34 0.61 7 0.69 54 1.17 8 0.74 63 1.19 1 0.72 45 1.08 10 0.66 60 0.78 21
XNOR 0.65 28 0.69 6 0.66 63 0.81 15 0.69 63 0.82 13 0.71 23 0.42 14 0.72 62 0.51 18
ADD 0.65 28 0.42 5 0.65 40 0.51 5 0.65 45 0.42 5 0.65 26 0.42 5 0.72 58 0.52 19
SUB 0.65 28 0.69 6 0.65 48 0.77 6 0.68 41 0.51 8 0.65 51 0.6 6 0.72 57 0.83 15
MUX 0.69 62 0.55 7 0.62 38 0.63 8 0.65 55 0.94 8 0.65 53 0.86 0.65 0.74 19 1.018 11
DEC 0.64 36 1.09 15 0.68 23 0.92 10 0.71 63 1.18 8 0.67 49 0.72 6 0.68 7 0.49 21
ENC 0.71 50 0.95 10 0.67 4 0.7 5 0.62 36 0.76 2 0.65 39 0.53 2 0.74 0 1.03 6
control input, C, comparator threshold voltage, δ, and output sampling iteration, φs. The values of the parameters used in the five configurations are tabulated in Table 4. For multi-output functions like ADD, SUB, DEC and ENC, the (δ, φs) pair for the first output is shown.

The Cadence Spectre simulator was used to generate 2000 power traces for each instruction, where every trace corresponds to random transitions of the 4-bit operands in the instruction. Out of the 2000 power traces, 1600 traces are used for training and the remaining 400 traces are used for testing the classification accuracy. For the 400 testing power traces of each instruction, the matched instruction class is found for each of the traces in the testing set. A cross classification analysis has been performed among instructions from the 5 configuration sets shown in Table 4. In each of these classification experiments, training power samples are collected from instructions of one configuration of the chaos circuit and test samples from another. We performed our experiment with training data from configuration 1 and test data from all configurations, 1 through 5. The training data is used to train several classifiers with different dimensionality reduction techniques. The trained classifiers are then used to classify instructions from the testing data set obtained from all configurations. This experiment emulates the case where an attacker collects instruction power profiles from processor 1 and uses the profiles to classify instructions on another processor, say processor 2. Here, processor 1 and processor 2 use configurations 1 and 2, respectively, for the implementation of the instruction set. With this methodology, we expect each instruction in configuration 2 to be confused during classification with other instructions in configuration 1.

A summary of the classification attack is tabulated in Table 5 for different classification algorithms and dimensionality reduction techniques discussed in Section 5. In order to avoid clutter, we have only included the ‘onevsone’ technique for SVM, DT, LDA and NB classifiers and 1-NN results with cosine distance function for KNN, since these are the classifiers that produced the best results. Analyzing the results in the table, it is possible to find the best case classification and dimensionality reduction algorithm for each of the cross obfuscation experiment cases.

The results clearly show that when the training and testing are done for the same configuration (1,1), the classifier has very high accuracy (91.91% for SVM with FLDA). The classification accuracies of the remaining cases are much lower, showing the efficacy of the cross-obfuscation methodology. The rightmost column in Table 5 shows the average accuracy of the four cross-obfuscation cases (1,2 - 1,5). Best average classification accuracy of 14.91% among the four pairs was achieved for SVM with ‘onevsone’ classifier using the FLDA dimensionality reduction technique. Classification accuracy for each of the cross obfuscation cases among 5 configurations with best case classification for all dimensionality reduction techniques is shown in the bar plot of Fig. 10. Detailed classification results for one train-test pair (1,4) are shown using a confusion matrix in Table 6. It is to be noted that the configurations shown in Table 4 were randomly chosen by searching through a large design space. As a result, it is possible to get two configurations very close to one another, resulting in a higher classification accuracy. However, this classification accuracy can be made even lower by deliberately choosing configurations so as to maximize the difference in power traces among similar instructions from different configurations.

Fig. 10: Classification accuracy of instructions among different chaos configurations for best case classification method, SVM with different dimensionality reduction techniques.

8 Self Obfuscation

In addition to cross-obfuscation, chaos-based design provides a unique method for self obfuscation, where different instructions from the same machine have similar power traces, making classification very difficult. The power trace is mostly determined by the choice of Vc and C. If a set of instructions can be implemented using the same Vc and C, using different δ and φs, then their power traces become very similar. The configuration used in this method is shown in Table 7. The values of Vc and C have been fixed at 0.69 V and 58 ((111010)2), respectively, and the other control parameters were varied to obtain the desired functionality. For multi-output functions like DEC
Table 5: Classification accuracy of instruction set among different chaos-based machines (cross-obfuscation) using
several classification and data reduction algorithms.
Category Data Reduction Configuration Pair

1,1 1,2 1,3 1,4 1,5 Average(1,2-1,5)
KNN(=1) No Reduction 89.18 11.73 5.18 8 6.36 7.82
PCA 88.91 8.36 4.91 5.18 5.36 5.95
PCAmean 85.64 11.73 6.18 6.09 5.91 7.47
SDM 73.55 11.09 8 9.09 8.64 9.2
MV 74.55 11 8 8.91 8.64 9.14
FLDA 89.36 16.27 7.09 8.82 10.09 10.57
SVM No Reduction 90.36 13.09 9.36 15.45 15.27 13.3
PCA 91.09 15.27 6.09 14.27 15.27 12.73
PCAmean 89.82 12.91 6.27 10 12 10.3
SDM 51 13.73 7.55 8.27 8.09 9.41
MV 51.64 14.55 6.73 8.55 7.82 9.41
FLDA 91.91 14.91 10.91 15.91 17.91 14.91
DT No Reduction 88 7.09 8.36 11.27 6 8.18
PCA 86.36 10.36 9.27 11.73 10.82 10.55
PCAmean 86.73 11.09 5.73 11.09 10.45 9.59
SDM 82 10 5.91 8.09 6.45 7.61
MV 81.18 14.36 6.55 7.45 6.18 8.64
FLDA 90.82 17.09 5.45 12.91 13.27 12.18
LDA No Reduction 90.73 11.64 3.18 2.82 2.27 4.98
PCA 90.82 14.55 7.55 14.73 14.91 12.93
PCAmean 89.2 12.18 9 12.91 14.36 12.11
SDM 72.73 10.45 5.36 3.18 5.45 6.11
MV 77.18 5 5.27 4.27 4.27 4.7
FLDA 91.27 19.36 6.73 14.18 18 14.57
NB No Reduction 82.36 12.73 11.73 11.64 10.27 11.59
PCA 85.09 8.73 8.82 10.36 10.55 9.61
PCAmean 85.09 11.73 8.45 12.55 11.55 11.07
SDM 43.09 12.64 9.73 8.64 6.82 9.45
MV 42.27 11.82 8.64 8.55 6.82 8.95
FLDA 91.27 23.09 8.18 16.18 17.36 16.2
MVG No Reduction 81.09 0.55 11.27 11.64 10.82 8.57
PCA 92 17.27 5.09 14.09 14.9 12.84
PCAmean 91.18 15.45 5.18 11.09 11.91 10.91
SDM 81.18 8.27 9.45 10.36 9.36 9.36
MV 77.27 9.91 8.82 10.45 7.91 9.27
FLDA 91.45 18.82 4.82 14.18 15.82 13.41
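The train-on-one, test-on-another protocol behind Table 5 can be sketched with synthetic data. Everything below is illustrative: the traces are random numbers rather than Spectre simulations, a nearest-centroid rule stands in for the classifiers of Section 5, the trace counts are scaled down from 1600/400, and the assumption that two configurations give unrelated mean signatures is built into the toy model rather than derived from the circuit.

```python
import random

INSTRUCTIONS = ["AND", "OR", "XOR", "NAND", "NOR", "XNOR",
                "ADD", "SUB", "MUX", "DEC", "ENC"]

def make_signatures(rng, dim=8):
    """Hypothetical mean power signature per instruction for one configuration."""
    return {ins: [rng.uniform(0.0, 1.0) for _ in range(dim)]
            for ins in INSTRUCTIONS}

def sample_traces(rng, signatures, n, noise=0.02):
    """Draw n noisy traces per instruction around its mean signature."""
    return {ins: [[m + rng.gauss(0.0, noise) for m in mean] for _ in range(n)]
            for ins, mean in signatures.items()}

def train_centroids(traces):
    """Nearest-centroid 'training': average the training traces per class."""
    return {ins: [sum(col) / len(col) for col in zip(*rows)]
            for ins, rows in traces.items()}

def accuracy(centroids, test):
    """Percentage of test traces whose nearest centroid is the true class."""
    hits = total = 0
    for true_ins, rows in test.items():
        for row in rows:
            guess = min(centroids, key=lambda c: sum(
                (a - b) ** 2 for a, b in zip(row, centroids[c])))
            hits += guess == true_ins
            total += 1
    return 100.0 * hits / total

rng = random.Random(0)
config1 = make_signatures(rng)  # reference machine profiled by the attacker
config2 = make_signatures(rng)  # target machine with a different configuration

centroids = train_centroids(sample_traces(rng, config1, n=160))
same = accuracy(centroids, sample_traces(rng, config1, n=40))   # pair (1,1)
cross = accuracy(centroids, sample_traces(rng, config2, n=40))  # pair (1,2)
print(f"(1,1): {same:.2f}%   (1,2): {cross:.2f}%")
```

The sketch only mirrors the bookkeeping of the experiment (separate training and test sets, one accuracy figure per configuration pair); it makes no claim about the actual power behaviour of the chaos gate.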
Table 6: Confusion matrix of classification accuracy of different instructions for chaos-based cross-obfuscation implementation. Rows and columns represent the test instruction and percentage of their matched class, respectively.

AND OR XOR NAND NOR XNOR ADD SUB MUX DEC ENC
AND 0 0 0 0 80 0 0 0 0 20 0
OR 0 0 0 0 85 0 0 0 0 15 0
XOR 0 25 0 0 0 0 0 0 24 11 40
NAND 0 1 0 0 5 0 0 0 44 15 35
NOR 0 4 0 1 0 0 0 0 52 10 33
XNOR 0 26 0 7 0 0 0 0 23 12 32
ADD 0 0 62 0 0 37 0 0 0 0 1
SUB 0 0 0 7 2 0 12 79 0 0 0
MUX 0 3 0 0 1 0 0 0 61 21 14
DEC 0 34 0 0 0 0 0 0 42 0 24
ENC 0 0 0 0 4 0 0 0 39 22 35
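As a consistency check, the overall accuracy implied by Table 6 can be recomputed from its diagonal. The short sketch below assumes every instruction contributes the same number of test traces (400 each, as stated in the text), so the overall accuracy is the plain mean of the diagonal entries; the helper names are hypothetical.

```python
LABELS = ["AND", "OR", "XOR", "NAND", "NOR", "XNOR",
          "ADD", "SUB", "MUX", "DEC", "ENC"]

# Rows of Table 6: test instruction -> matched class, in percent.
TABLE6 = [
    [0, 0, 0, 0, 80, 0, 0, 0, 0, 20, 0],     # AND
    [0, 0, 0, 0, 85, 0, 0, 0, 0, 15, 0],     # OR
    [0, 25, 0, 0, 0, 0, 0, 0, 24, 11, 40],   # XOR
    [0, 1, 0, 0, 5, 0, 0, 0, 44, 15, 35],    # NAND
    [0, 4, 0, 1, 0, 0, 0, 0, 52, 10, 33],    # NOR
    [0, 26, 0, 7, 0, 0, 0, 0, 23, 12, 32],   # XNOR
    [0, 0, 62, 0, 0, 37, 0, 0, 0, 0, 1],     # ADD
    [0, 0, 0, 7, 2, 0, 12, 79, 0, 0, 0],     # SUB
    [0, 3, 0, 0, 1, 0, 0, 0, 61, 21, 14],    # MUX
    [0, 34, 0, 0, 0, 0, 0, 0, 42, 0, 24],    # DEC
    [0, 0, 0, 0, 4, 0, 0, 0, 39, 22, 35],    # ENC
]

def overall_accuracy(cm):
    """Mean of the diagonal; valid when all classes have equal test counts."""
    return sum(cm[i][i] for i in range(len(cm))) / len(cm)

def most_matched(cm, labels):
    """For each test instruction, the class it is most often matched to."""
    return {labels[i]: labels[max(range(len(row)), key=row.__getitem__)]
            for i, row in enumerate(cm)}

acc = overall_accuracy(TABLE6)
print(round(acc, 2))  # 15.91
print(most_matched(TABLE6, LABELS))
```

The value 15.91% agrees with the SVM/FLDA entry for pair (1,4) in Table 5, and only SUB, MUX and ENC have a non-zero diagonal entry.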
and ENC, the (δ, φs) pair for only the first output is shown. This strict design choice constrained the design space. However, it does not create design issues since multiple configurations are not required for this method. The results are shown in Table 8. The best results were obtained for NB with the ‘onevsone’ technique, with a classification accuracy of 11.67%. This result is very close to the ideal value of 11.11% (1/9) for perfect obfuscation involving 9 instructions. The confusion matrix representing detailed classification results for this classifier is shown in Table 9.

The results for ADD and SUB are not shown in these tables, since they have sequential operation in contrast to the other 9 instructions. Therefore, even with the same Vc and C, they cannot be fully obfuscated and can be classified with a relatively high accuracy. The accuracy of classifying ADD in this configuration using the NB classifier with the ‘onevsone’ technique is 77.5%. Additional design techniques have to be used for obfuscating such instructions.

9 Discussion and Future Work

The classification results of a complex micro-instruction set using power traces for traditional CMOS implementation as well as two obfuscation schemes using chaos-based systems are reported in this paper. The results from various classifiers using five different dimensionality reduction techniques are shown for comparison. The classification accuracy clearly shows that CMOS implementations are vulnerable to side channel power analysis attack, whereas chaotic circuit implementation can alleviate this problem to a large extent. The cross obfuscation scheme can produce indistinguishable power traces provided the gate design is done carefully, so that different implementations of an identical instruction have sufficiently different configurations. The self obfuscation method is an extension of cross obfuscation which removes the need to design multiple implementations, since the power traces of a single implementation can be made almost indistinguishable when two of the four configuration parameters, namely Vc and C, are chosen to be identical across the instruction set. The obfuscation is almost ideal even for the best performing classifier.

There are many opportunities for extending this work in the future. First of all, the map circuit used in this work is a scaled down topology from [9]. Detailed analysis is required to optimize the circuit design to reduce power consumption, delay and area. The width and location of the chaotic region in the bifurcation diagram is also an important factor in choosing design parameters. This particular chaotic map is chosen as an example to illustrate the premise that chaos-based design can be effective for immunity against side channel attack. However, the design methodology does not depend on this particular three transistor chaotic map circuit topology. Any combination of ingenious topology and/or emerging device can be used as long as we get the ‘V’ shape, or alternatively, inverted ‘V’ or tent shape transfer curve. New devices and/or topologies can improve the overhead related to chaos-based design to make it competitive against conventional CMOS designs. Moreover, in order to overcome the susceptibility of chaotic gates to noise, detailed noise analysis needs to be done to make the design robust.

As can be seen from Table 5, cross obfuscation accuracy varies among various designs. More quantitative measures and metrics can be developed to aid the design process for multiple implementations.

Self-obfuscation is a more advanced form of obfuscation which renders the power traces of an instruction set inherently indistinguishable. However, getting the same Vc and C for all the instructions is a very challenging task. Moreover, in its current form it cannot obfuscate sequential instructions such as ADD and SUB. New design techniques need to be explored to overcome this shortcoming.

The overhead of using a chaotic map for digital circuit implementation needs to be reduced. The main objective of this work is to illustrate that chaos-based systems can be used to implement arbitrary complex 3-input multi-output functionality and that the corresponding large design space can enable designers to secure hardware against side channel power attack for instruction classification. It is a fact that if a single basic gate is implemented, the chaotic design has large overhead. However, the chaotic design is reconfigurable, and unlike CMOS implementation, one can implement both simple basic gates and relatively complex functionality like ADD, SUB, MUX and DEC using the same configuration. Moreover, as shown in [28], a suitable combination of chaotic and CMOS gates can lead to an almost ideal obfuscation. The preliminary results show that the number of chaos gates required increases roughly logarithmically with the number of bits. More careful analysis needs to be done to come up with an optimization method combining chaotic and CMOS gates for overhead reduction.

10 Conclusions

In this work, it is shown that a chaos-based design can be used to generate simple as well as complex 3-input multiple-output functions using a simple 3-transistor
Table 7: Configuration parameters of the different instructions for chaos-based self obfuscation implementation. Vc and C are shared by all instructions; only δ and φs vary.
Config Instructions
AND OR XOR NAND NOR XNOR MUX DEC ENC
δ 1.02 0.28 0.46 0.37 1.18 0.56 0.44 1.18 0.93
φs 5 7 18 4 1 19 9 1 3
Vc 0.69
C 58
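The self obfuscation configuration of Table 7 can be captured in a small data structure, which also makes the shared-parameter constraint explicit. This is only a bookkeeping sketch: the class and field names are hypothetical, and the values are copied from Table 7.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class GateConfig:
    vc: float     # bias voltage Vc (V)
    c: int        # 6-bit control input C
    delta: float  # comparator threshold voltage, delta
    phi_s: int    # output sampling iteration, phi_s

    def control_bits(self) -> str:
        """C written as a 6-bit binary string."""
        return format(self.c, "06b")

SHARED_VC, SHARED_C = 0.69, 58

# (delta, phi_s) per instruction, taken from Table 7.
SELF_OBF = {
    "AND": (1.02, 5), "OR": (0.28, 7), "XOR": (0.46, 18),
    "NAND": (0.37, 4), "NOR": (1.18, 1), "XNOR": (0.56, 19),
    "MUX": (0.44, 9), "DEC": (1.18, 1), "ENC": (0.93, 3),
}

configs = {name: GateConfig(SHARED_VC, SHARED_C, d, p)
           for name, (d, p) in SELF_OBF.items()}

print(configs["AND"].control_bits())  # 111010
# All nine instructions share one (Vc, C) pair.
print(len({(cfg.vc, cfg.c) for cfg in configs.values()}))  # 1
```

Keeping Vc and C in one place reflects the premise of the method: since these two parameters mostly determine the power trace, only δ and φs are allowed to differ between instructions.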
Table 8: Classification accuracy among instructions using different classifiers and dimensionality reduction algorithms for chaos-based self obfuscation implementation.
Category Sub-category Dimensionality Reduction Algorithm

No Reduction PCA PCAmean SDM MV FLDA
KNN 1-NN(euc) 10.1 10.94 11.01 10.79 10.79 10.74
3-NN(euc) 10.76 10.96 10.66 11.29 11.29 10.67
5-NN(euc) 11.07 11.03 11.21 10.94 10.94 10.79
1-NN(corr) 10.9 11.56 11.29 11.11 11.11 10.46
3-NN(corr) 11.26 11.17 11.03 11.11 11.11 10.5
5-NN(corr) 11.18 10.96 11.43 11.11 11.11 10.88
1-NN(cos) 10.35 10.81 10.79 10.47 10.47 10.51
3-NN(cos) 10.53 11.24 10.86 10.47 10.47 10.58
5-NN(cos) 11.32 10.95 11.11 10.72 10.72 10.68
SVM onevsall 11.65 10.81 11.62 11.33 10.94 9.69
allpairs 10.94 10.94 11.27 10.19 10.17 11.08
DT onevsall 10.94 11.06 11.11 10.67 10.47 10.14
allpairs 10.17 11.22 10.61 10.81 10.58 10.44
LDA onevsall 10.25 9.64 10.31 10.83 11.17 11.06
allpairs 10.33 10.33 10.39 10.89 11.44 10.28
NB onevsall 11.25 11.42 11.08 11.25 11.03 11.64
allpairs 10.94 11.67 10.31 11.53 11.03 11.47
MVG N/A 11.11 11.14 10.97 11.14 11.06 11.22
Table 9: Confusion matrix of classification accuracy for different instructions in chaos-based self obfuscation implementation. Rows and columns represent the test instruction and percentage of their matched class, respectively.

AND OR XOR NAND NOR XNOR MUX DEC ENC
AND 4.75 10.75 2.75 5 55 6.75 7.75 2.75 4.5
OR 5.25 12.75 2.5 3.25 52 7.75 7 0.75 8.75
XOR 7 10.75 1.75 3.75 56.5 5.5 6.75 1.25 6.75
NAND 5.75 11.25 2 7.25 51 8.25 8.5 1.75 4.25
NOR 5 9.25 0.75 5 56.75 7.75 6.5 0.75 8.25
XNOR 4 9.75 1.5 4.75 58.5 6.25 6 1 8.25
MUX 5.5 10.25 2.5 6 51.75 7 8.5 1 7.5
DEC 3 10.5 2.25 5 57 6.5 7.75 1.75 6.25
ENC 6.25 13.5 2.75 4.25 54 7.25 5.5 1.25 5.25
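The comparison with the ideal value of 1/9 can be checked directly from Table 9. The sketch below recomputes the overall accuracy and, as an additional view not reported in the paper, the average share of test traces assigned to each class (the column means); equal test counts per instruction are assumed and the helper names are hypothetical.

```python
LABELS9 = ["AND", "OR", "XOR", "NAND", "NOR", "XNOR", "MUX", "DEC", "ENC"]

# Rows of Table 9: test instruction -> matched class, in percent.
TABLE9 = [
    [4.75, 10.75, 2.75, 5, 55, 6.75, 7.75, 2.75, 4.5],    # AND
    [5.25, 12.75, 2.5, 3.25, 52, 7.75, 7, 0.75, 8.75],    # OR
    [7, 10.75, 1.75, 3.75, 56.5, 5.5, 6.75, 1.25, 6.75],  # XOR
    [5.75, 11.25, 2, 7.25, 51, 8.25, 8.5, 1.75, 4.25],    # NAND
    [5, 9.25, 0.75, 5, 56.75, 7.75, 6.5, 0.75, 8.25],     # NOR
    [4, 9.75, 1.5, 4.75, 58.5, 6.25, 6, 1, 8.25],         # XNOR
    [5.5, 10.25, 2.5, 6, 51.75, 7, 8.5, 1, 7.5],          # MUX
    [3, 10.5, 2.25, 5, 57, 6.5, 7.75, 1.75, 6.25],        # DEC
    [6.25, 13.5, 2.75, 4.25, 54, 7.25, 5.5, 1.25, 5.25],  # ENC
]

def diagonal_accuracy(cm):
    """Overall accuracy in percent, assuming equal test counts per class."""
    return sum(cm[i][i] for i in range(len(cm))) / len(cm)

def predicted_share(cm, labels):
    """Average percentage of test traces assigned to each class (column means)."""
    n = len(cm)
    return {labels[j]: sum(row[j] for row in cm) / n for j in range(n)}

accuracy9 = diagonal_accuracy(TABLE9)
chance = 100.0 / len(LABELS9)
shares = predicted_share(TABLE9, LABELS9)

print(round(accuracy9, 2), round(chance, 2))  # 11.67 11.11
print(round(shares["NOR"], 2))                # 54.72
```

The recomputed 11.67% matches the NB entry quoted in the text. The column means show that more than half of all test traces are matched to NOR regardless of the true instruction, which is the behaviour expected when the traces carry little class information.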
chaotic map circuit. The parameters in the chaotic oscillator have been chosen carefully in order to minimize delay, area and power consumption. Two different design methodologies have been proposed for obfuscation, namely cross obfuscation and self obfuscation. Cross obfuscation ensures that an attacker cannot reverse engineer instructions on any other machine by accumulating data from a reference machine. Self obfuscation is possible by using chaos gates with a suitable common configuration for certain parameters, since it leads to similar power traces among different instructions in the same machine. It has been successfully demonstrated, using various dimensionality reduction techniques along with several classification algorithms, that the logic functions built from properly designed chaos gates can ensure security against power based side channel attack. The proposed design is found to be capable of bringing the accuracy of instruction classification close to a level of perfect ambiguity in classification.
Acknowledgements

The authors would like to thank Mesbah Uddin from the University of Tennessee for his help and support.

Funding Information

This work is based upon work supported by the Air Force Office of Scientific Research under award number FA9550-16-1-0301.

References

1. Aihara, K.: Chaotic neural networks (bifurcation phenomena in nonlinear systems and theory of dynamical systems) (1989)
2. Blümel, R., Kappler, C., Quint, W., Walther, H.: Chaos and order of laser-cooled ions in a paul trap. Physical Review A 40(2), 808 (1989)
3. Bohl, J., Yan, L.K., Rose, G.S.: A two-dimensional chaotic logic gate for improved computer security. In: Circuits and Systems (MWSCAS), 2015 IEEE 58th International Midwest Symposium on, pp. 1–4. IEEE (2015)
4. Cafagna, D., Grassi, G.: Chaos-based computation via chua’s circuit: Parallel computing with application to the sr flip-flop. In: Signals, Circuits and Systems, 2005. ISSCS 2005. International Symposium on, vol. 2, pp. 749–752. IEEE (2005)
5. Chua, L.O., Lin, G.N.: Canonical realization of chua’s circuit family. IEEE transactions on Circuits and Systems 37(7), 885–902 (1990)
6. Deza, M.M., Deza, E.: Encyclopedia of distances. In: Encyclopedia of Distances, pp. 1–583. Springer (2009)
7. Ditto, W.L., Miliotis, A., Murali, K., Sinha, S., Spano, M.L.: Chaogates: Morphing logic gates that exploit dynamical patterns. Chaos: An Interdisciplinary Journal of Nonlinear Science 20(3), 037,107 (2010)
8. Duda, R.O., Hart, P.E., Stork, D.G.: Pattern classification. John Wiley & Sons (2012)
9. Dudek, P., Juncu, V.: Compact discrete-time chaos generator circuit. Electronics Letters 39(20), 1431–1432 (2003)
10. Ercsey-Ravasz, M., Toroczkai, Z.: Optimization hardness as transient chaos in an analog approach to constraint satisfaction. Nature Physics 7(12), 966 (2011)
11. Feigenbaum, M.J.: Quantitative universality for a class of nonlinear transformations. Journal of statistical physics 19(1), 25–52 (1978)
12. Feigenbaum, M.J.: Universal behavior in nonlinear systems. Physica D: Nonlinear Phenomena 7(1-3), 16–39 (1983)
13. Feki, M.: An adaptive chaos synchronization scheme applied to secure communication. Chaos, Solitons & Fractals 18(1), 141–148 (2003)
14. Fisher, R.A.: The use of multiple measurements in taxonomic problems. Annals of eugenics 7(2), 179–188 (1936)
15. Gut, A.: Probability: a graduate course, vol. 75. Springer Science & Business Media (2013)
16. Hayes, S., Grebogi, C., Ott, E.: Communicating with chaos. Physical review letters 70(20), 3031 (1993)
17. Hsu, C.W., Lin, C.J.: A comparison of methods for multiclass support vector machines. IEEE transactions on Neural Networks 13(2), 415–425 (2002)
18. Jakimoski, G., Kocarev, L.: Chaos and cryptography: block encryption ciphers based on chaotic maps. IEEE Transactions on Circuits and Systems I: Fundamental Theory and Applications 48(2), 163–169 (2001)
19. Joachims, T.: Text categorization with support vector machines: Learning with many relevant features. In: European conference on machine learning, pp. 137–142. Springer (1998)
20. Juncu, V., Rafiei-Naeini, M., Dudek, P.: Integrated circuit implementation of a compact discrete-time chaos generator. Analog Integrated Circuits and Signal Processing 46(3), 275–280 (2006)
21. Kia, B., Lindner, J., Ditto, W.L., et al.: Nonlinear dynamics based digital logic and circuits. Frontiers in computational neuroscience 9, 49 (2015)
22. Kia, B., Lindner, J.F., Ditto, W.L.: A simple nonlinear circuit contains an infinite number of functions. IEEE Transactions on Circuits and Systems II: Express Briefs 63(10), 944–948 (2016)
23. Kia, B., Mobley, K., Ditto, W.L.: An integrated circuit design for a dynamics-based reconfigurable logic block. IEEE Transactions on Circuits and Systems II: Express Briefs (2017)
24. Kocher, P., Jaffe, J., Jun, B.: Differential power analysis. In: Annual International Cryptology Conference, pp. 388–397. Springer (1999)
25. Kolumbán, G., Vizvári, B., Schwarz, W., Abel, A.: Differential chaos shift keying: A robust coding for chaos communication. In: Proc. NDES, vol. 96, pp. 87–92 (1996)
26. Lindner, J.F., Kohar, V., Kia, B., Hippke, M., Learned, J.G., Ditto, W.L.: Strange nonchaotic stars. Physical review letters 114(5), 054,101 (2015)
27. Lorenz, E.N.: Deterministic nonperiodic flow. Journal of the atmospheric sciences 20(2), 130–141 (1963)
28. Majumder, M.B., Hasan, M.S., Uddin, M., Rose, G.S.: Chaos computing for mitigating side channel attack. In: 2018 IEEE International Symposium on Hardware Oriented Security and Trust (HOST), pp. 143–146. IEEE (2018)
29. Msgna, M., Markantonakis, K., Mayes, K.: Precise instruction-level side channel profiling of embedded processors. In: International Conference on Information Security Practice and Experience, pp. 129–143. Springer (2014)
30. Murali, K., Sinha, S., Ditto, W.L.: Implementation of nor gate by a chaotic chua’s circuit. International Journal of Bifurcation and Chaos 13(09), 2669–2672 (2003)
31. Rose, G.S.: A chaos-based arithmetic logic unit and implications for obfuscation. In: VLSI (ISVLSI), 2014 IEEE Computer Society Annual Symposium on, pp. 54–58. IEEE (2014)
32. Sinha, S., Ditto, W.L.: Dynamics based computation. Physical Review Letters 81(10), 2156 (1998)
33. Srisuchinwong, B., San-Um, W.: Implementation of a chuas chaotic oscillator using roughly-cubic-like nonlinearity
34. Strang, G.: Introduction to linear algebra, vol. 3. Wellesley-Cambridge Press Wellesley, MA (1993)
35. Strogatz, S.H.: Nonlinear dynamics and chaos: with applications to physics, biology, chemistry, and engineering. CRC Press (2018)
36. Wang, L., Zhang, Y., Feng, J.: On the euclidean distance of images. IEEE transactions on pattern analysis and machine intelligence 27(8), 1334–1339 (2005)
