
Automatic classification of the dynamical behavior of one-dimensional binary cellular automata by means of the spectrum of neighborhood configurations
Marcelo Arbori Nogueira¹, Pedro Paulo Balbi de Oliveira¹
¹ Universidade Presbiteriana Mackenzie – Faculdade de Computação e Informática
² Pós-Graduação em Engenharia Elétrica e Computação
Rua da Consolação 896, Consolação, 01302-907 São Paulo, SP - Brazil
marcelo.arbori@gmail.com; pedrob@mackenzie.br

Abstract. Cellular automata present great variability in their temporal evolutions
due to the number of possible rules that create them and to the exponential
number of initial configurations. As such, the possibility of automatically clas-
sifying their generic dynamical behavior would be of great value when studying
general properties of their dynamics. By relying on elementary cellular au-
tomata (one-dimensional binary rules, with 3 cells in the neighborhood), and
considering their temporal evolution as binary images, we created a texture de-
scriptor of the images – based upon the neighborhood configurations of the cells
in the corresponding temporal evolutions – so that it could be associated to each
dynamical behavior class, following Wolfram’s classical classification scheme.
It was then possible to predict the rule class of an arbitrary temporal evolution
of an elementary rule in a more effective way than others in the literature, both
in terms of accuracy and computational cost. When applying the classifier to the
larger space with neighborhood containing 4 cells, accuracy decreased, down
to slightly more than 70%. Nevertheless, the result shows that the classifier is
still able to provide some information on the dynamics of an unknown larger
space, with reduced computational cost.

1. Introduction
Originally proposed as a computational model describing a replicating machine
[Neumann 1966], cellular automata (CAs) are both computational and mathematical ob-
jects that have been used as simulation models in many phenomena related to biology,
physics and social sciences ([Green 1990], [Smith 1994], [Wolfram 2002]). The structure
of a cellular automaton relies on a set of cells organized as a d-dimensional regular grid,
each cell being able to take on one of the states from a finite set. As a discrete dynamical
system, the values of the cells change when applying their transition function (or rule)
to their neighborhood, defined by the cell itself and its local vicinity, and the result of
applying the function for a finite time leads to its temporal evolution [Sarkar 2000].
The set of possible rules is a discrete space defined by all rules sharing the same
characterization. A rule defines the dynamical behavior of a celular automaton, and for
each one, its typical dynamics can be obtained. Among the possible rules of each space,
to delimit those of interest to simulations of some phenomena or that present a target
dynamical behavior is a difficult task due to the amount of possible rules in each space,
which increase exponentially with the number of cells in the neighborhood. To make
things worse, in general it is not possible to predict the future state of the CA without
applying the rule for an arbitrary time. The number of possible initial conditions makes
predicting the future state of the system even more difficult.
The initial condition of the system is the configuration of its cell states. In the
case of 3 states and configurations of 83 cells, for instance, there are $3^{83}$ possible initial
conditions, and not all of them necessarily yield the same dynamics. However,
for most of the possible initial conditions the application of a specific rule results
in typical behavior, so that the classification of the dynamical behavior of a rule always
relies on the associated typical behavior of its temporal dynamics. Naturally, not all ini-
tial conditions lead to the typical behavior of the rule, which can lead to errors. It is then
necessary to generate a representative set to derive the typical dynamical behavior of a
rule [Wolfram 2002].
The most widely studied rule space, mainly by Wolfram [Wolfram 1984], is known as the
elementary space, formed by the one-dimensional binary cellular automata whose neighborhood
contains 3 cells (or radius 1, meaning a cell together with its two nearest neighbors,
one at each side). With this configuration, the elementary space has 256 possible rules,
and each has been classified under Wolfram’s scheme, based upon 4 possible classes
[Wolfram 1984]. Our target here is, first, to obtain a method for the automatic classification of
the temporal evolutions generated by the rules of the elementary space; then, once the latter
has been achieved, to apply the same technique to temporal evolutions of larger spaces,
with the aim of establishing a first classification of a space whose dynamical features are
unknown.
In order to go about that, we use image texture analysis, over the images associated
to the rules’ temporal evolutions, to obtain a sorting strategy based on the spectrum of
neighborhood configurations, that is, the frequency of neighborhood configurations ex-
hibited by a temporal evolution, for each cell concerned. By constructing a data set based
on these spectra, it was possible to associate a particular evolution, presented to the clas-
sifier, to a rule of the elementary space, thus predicting the rule’s dynamical behavior in a
simple way. The accuracy of the classifier was comparable to that of other classification strategies
described in the literature, in the case of the elementary space ([da Silva et al. 2016],
[Machicao et al. 2018]), and superior to that of Wuensche’s work [Wuensche 1998].
By applying the same classification technique to the larger space composed of
neighborhoods with 4 cells (radius 1.5), a great sensitivity to the local conditions of the
image was observed. As a consequence, the accuracy obtained for the radius-1.5 space was
lower than that obtained for the elementary space, while still providing information
about that unknown, larger space.
In the next section, cellular automata and their classification according to Wolfram
are described, delimiting both the target classification used in this paper and the type
of cellular automaton spaces we are accounting for. Section 3 addresses image texture
analysis and how to use it in the classification of cellular automata. After that, in Section 4,
the accuracy of the results obtained with the proposed classification method is analyzed.
Finally, concluding remarks are made in Section 5.
2. Cellular Automata
A CA consists of a lattice, a linear sequence of cells $(a_0, a_1, a_2, \ldots, a_l)$, each of which can be in one
of the states of a finite set (Σ). The change in the state of the central cell depends on the
cell itself and the states of the neighboring cells, which are contained within a radius (r)
around the central cell. They are the input of the function that returns the new state of
the central cell, which can be described as

$$a_i^{t+1} = \omega(a_{i-r}, a_{i-r+1}, \ldots, a_i, \ldots, a_{i+r-1}, a_{i+r}),$$

where i is the position of the central cell, r is the radius, and t is the discrete time at which the
update occurs [Martinez 2013].
The ω transition function is the CA rule, a set of state transitions from the state of
the neighborhood. For example (110101001) ⇒ 1 is a transition from the 9-cell neighbor-
hood configuration, where the central cell changes to 1 in the next iteration. Table 1 is an
example of a rule whose neighborhood has 3 cells, and the update is synchronous over the
lattice of cells. For example, a 7-cell lattice with configuration 0100110 at time t, would
have the third cell changed to 1 due to the fifth transition from Table 1 [Wolfram 2002].

Table 1. Example of the rule table of an elementary rule.


Neighborhood Transition
(0, 0, 0) 0
(0, 0, 1) 1
(0, 1, 0) 1
(0, 1, 1) 1
(1, 0, 0) 1
(1, 0, 1) 0
(1, 1, 0) 0
(1, 1, 1) 0

A rule has a finite number of state transitions, one for each possible neighborhood,
with m = 2r + 1 cells. If k = |Σ| is the size of the set of possible states, then $k^m$ is the
number of possible configurations. And since each neighborhood configuration can transition
to any of the symbols in Σ, $k^{k^m}$ defines the size of the rule space [Martinez 2013].
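
As a quick check of these counts, the following minimal sketch evaluates both expressions for binary neighborhoods of 3, 4 and 5 cells. The function name `rule_space_size` is ours, introduced only for illustration.

```python
def rule_space_size(k, m):
    """Return (number of neighborhood configurations, number of rules)."""
    configurations = k ** m       # k^m neighborhoods with m cells
    rules = k ** configurations   # each neighborhood maps to one of k states
    return configurations, rules


for m in (3, 4, 5):
    configurations, rules = rule_space_size(2, m)
    print(f"m = {m}: {configurations} configurations, {rules} rules")
```

For k = 2, the rule count grows from 256 (m = 3) to 65536 (m = 4) and 4294967296 (m = 5).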
By organizing the neighborhoods of state transitions as numbers in base k, with
the smallest number at the top of the table and the larger at the bottom, the values of the
transitions form a k-ary pattern, that is used to identify the CA rule. This can be seen
in Table 1, which shows 3-cell neighborhood configurations in the first column and the
corresponding output bit in the second; by reading the output bits from bottom to top, they
form the number 30 in decimal, which is that rule’s (Wolfram) number [Wolfram 2002].
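
The numbering convention just described can be sketched as follows. This is an illustrative helper pair, not code from the paper: `rule_table` builds the transition table from a rule number, and `rule_number` reads the number back from a table; both names are assumptions.

```python
def rule_table(number, k=2, m=3):
    """Map each neighborhood (tuple of m states) to its output state."""
    table = {}
    for index in range(k ** m):
        # Write the neighborhood index in base k, most significant digit first.
        digits, value = [], index
        for _ in range(m):
            digits.append(value % k)
            value //= k
        neighborhood = tuple(reversed(digits))
        # The output is the digit of the rule number at position `index`.
        table[neighborhood] = (number // k ** index) % k
    return table


def rule_number(table, k=2):
    """Recover the rule number from a transition table."""
    total = 0
    for neighborhood, output in table.items():
        index = 0
        for digit in neighborhood:
            index = index * k + digit
        total += output * k ** index
    return total


table_30 = rule_table(30)
for neighborhood in sorted(table_30):
    print(neighborhood, table_30[neighborhood])
```

Printed in ascending order of neighborhood, the rows reproduce Table 1.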
By applying the rule on the lattice for a given number of time steps, a binary matrix
is obtained with the number of columns equal to the size of the lattice, and the rows being
equal to the discrete time at which the rule has been applied. By making a substitution
between the states and colors, assuming that 0 represents white and 1 the color black, a
visual pattern emerges that is characteristic of each rule; see Figure 1 [Wolfram 2002].
Figure 1. Temporal evolution of elementary rule 30 on a lattice with 50 cells,
executed for 100 iterations.

In addition to the elements already mentioned, the visual pattern that arises when
applying the CA rule over a period of time also depends on the configuration space being
finite or infinite; in the case of finite configurations, the boundaries of the lattice can be
fixed or periodic. For each case the visual pattern that arises can be completely different.
For present purposes, all configurations are finite and the boundary condition is periodic
(in other words, the configurations are cyclic) [Wolfram 2002].
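
A minimal sketch of how such a temporal evolution can be generated for an elementary rule with periodic boundaries is given below. The function names `step` and `evolve`, the random seed and the character rendering are our own choices for illustration.

```python
import random


def step(config, rule_number):
    """Apply an elementary rule once to a cyclic configuration."""
    size = len(config)
    new_config = []
    for i in range(size):
        left = config[(i - 1) % size]    # periodic boundary on the left
        center = config[i]
        right = config[(i + 1) % size]   # periodic boundary on the right
        index = 4 * left + 2 * center + right
        new_config.append((rule_number >> index) & 1)
    return new_config


def evolve(initial, rule_number, steps):
    """Return the temporal evolution as a list of rows (row 0 is the initial one)."""
    rows = [list(initial)]
    for _ in range(steps):
        rows.append(step(rows[-1], rule_number))
    return rows


random.seed(0)
initial = [random.randint(0, 1) for _ in range(50)]
evolution = evolve(initial, 30, 100)
for row in evolution[:10]:
    print("".join("#" if cell else "." for cell in row))
```

Each row of `evolution` is one line of the binary image, with `#` standing for black (1) and `.` for white (0).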
The most widely studied CA family is the elementary cellular automata (ECA),
where Σ = {0, 1}, k = 2, r = 1 and m = 3. As such, there are 8 possible neighborhood
configurations and a total of 256 rules, with well-known dynamical behaviors. Another
space considered in this work is the one with Σ = {0, 1}, k = 2, r = 1.5 (radius-1.5
space) and m = 4. Here, we focus on Wolfram’s classification scheme [Wolfram 2002]
for both spaces. Section 2.1 discusses this classification in more detail.

2.1. Wolfram Classification for the elementary cellular automata

The four classes defined by Wolfram [Wolfram 1984], for any CA space and the elementary
space in particular, are based on the visual pattern that each rule most typically generates
along its temporal evolution. Such typical behaviors are separated into 4 classes,
namely:

1. Evolution leads to a homogeneous state (with only 1s or 0s, for binary CAs);
2. Evolution leads to a set of simple and stable periodic structures;
3. Evolution leads to a chaotic pattern;
4. Evolution leads to complex, sometimes long-lasting, localized structures.

Figure 2 gives examples for each of the classes. From left to right, the first two
temporal evolutions are of class 1, the third is class 2, and the two lower ones are, respec-
tively, classes 3 and 4.
Figure 2. Examples of temporal evolution for each Wolfram class.

Something to be considered in the classification of binary CAs is the fact
that it is not possible to classify them completely, according to Culik and Yu [Culik and Yu 1988].
The great variability of possible results for the temporal evolution, due to the initial condition,
results in cases of misclassification of the underlying rule; this is exemplified in
Figure 3, which shows that the class 3 (chaotic) rules 60, 90, 102, 153, 165 and 195 lead to homogeneous
dynamics after a transient period, which would lead them to be classified as class
1. This particular situation occurs even for random configurations, as long as the lattice
size is a power of 2; see Figure 4, which shows the temporal evolution of elementary rule
60 on a lattice with 256 cells, displaying a homogeneous configuration from time step
t = 255 (the same being true for rules 102, 153 and 195, whereas for 90 and 165, the
homogeneous configuration occurs from time step 127).
Figure 3. Chaotic rules of the elementary space with homogeneous temporal evo-
lution after transient. For particular configurations, these chaotic rules have dif-
ferent dynamics from their typical behavior.
Figure 4. Temporal evolution of elementary rule 60 (a chaotic rule), from a random
initial configuration, with homogeneous dynamics from time step t = 255.
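
The behavior of rule 60 described above can be checked numerically with the sketch below, which assumes a cyclic lattice of 256 cells and a random initial configuration; the helper `step` and the seed are ours. Rule 60 replaces each cell by the exclusive-or of itself and its left neighbor, which is why power-of-two lattice sizes lead to a homogeneous configuration.

```python
import random


def step(config, rule_number):
    """Apply an elementary rule once to a cyclic configuration."""
    size = len(config)
    return [
        (rule_number >> (4 * config[(i - 1) % size]
                         + 2 * config[i]
                         + config[(i + 1) % size])) & 1
        for i in range(size)
    ]


random.seed(1)
size = 256
history = [[random.randint(0, 1) for _ in range(size)]]
for _ in range(size):
    history.append(step(history[-1], 60))

# First time step at which all cells share the same state.
first_homogeneous = next(
    t for t, row in enumerate(history) if len(set(row)) == 1
)
print("first homogeneous time step:", first_homogeneous)
```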

All this shows that, over the set of possible initial conditions, atypical temporal
evolutions may be observed, which is a complication, although not as dramatic as the exponential
increase in the number of rules of a space; for instance, from the 256 ECA rules,
one goes to more than 4 billion rules in the radius-2 space.
In spite of the impossibility of obtaining a classification free of indecision
about the class of the rule [Culik and Yu 1988], Wuensche [Wuensche 1998] describes
the classification of ECAs using various statistical measures, including the entropy of the
temporal evolution, defining the class according to the level of entropy obtained by each
ECA rule. However, the lack of an accuracy analysis for this method does not allow us to
attest its effectiveness; also, its completely ad hoc nature precludes any generalization to
larger spaces.
The strategy to be described here for automatic classification involves mapping
any binary temporal evolution of the elementary space to an image, and training a classi-
fier on them, since the classification of the elementary space is known. To this end, the
problem was tackled by texture analysis, defined from the frequencies of neighborhood
configurations of the cells in the temporal evolution, which generates a sort of spectrum
signature of the temporal evolutions. Section 3 discusses how the actual approach was
applied. But before, we need to discuss how to extract information about textures and
how to analyze them.

3. Texture analysis
In an image, repeating variations of colors and intensity form the textures, whose local
attributes may vary slowly along the image or remain partially periodic. The texture itself
is formed by sub-patterns, with specific positioning rules in the image, and these
sub-patterns are formed by fundamental units called primitives, characterized by geometric
shapes or local patterns of pixels. The textures provide important information about the
arrangement of the elements of the image and also characterize the surface of objects, en-
abling their classification, which becomes simpler if the textures are well differentiated.
As a visual element, a texture can be described in qualitative terms, but the analysis of tex-
tures seeks to quantify the qualitative description, thus allowing to use it in various areas,
from evaluation of medical images up to automatic detection [Acharya and Ray 2005].
As defined in Section 2, the temporal evolution is turned into an image by replacing
the states by colors, in the present case, black and white. This way of seeing the temporal
evolution allows established techniques to be used to extract image characteristics. In He and
Wang [chen He and Wang 1990], texture analysis was used to classify satellite images
with 256 levels of gray, using an algorithm to extract characteristics through textures. A
neighborhood $V = \{V_0, V_1, \ldots, V_8\}$ was taken into consideration, where $V_0$ is the central cell
and each $V_i$, $i = 1, \ldots, 8$, is one of the 8 cells around it, with $V_1$ being the upper-left cell and
the others obtained in clockwise direction over the neighborhood. The texture unit
$TU = \{E_1, E_2, \ldots, E_8\}$ is obtained through Equation 1, where each $E_i$ is the transformed value
of the pixel $V_i$.

$$E_i = \begin{cases} 0, & V_i < V_0 \\ 1, & V_i = V_0 \\ 2, & V_i > V_0 \end{cases} \qquad (1)$$

Given the 3 possible values for the transformation and the 8 cells of the neighborhood,
there are $3^8 = 6561$ possible configurations. The transformation of
the image is given by assigning to the position of each pixel the number of the texture
unit TU referring to its neighborhood, obtained according to Equation 2
[chen He and Wang 1990].

$$N_{TU} = \sum_{i=1}^{8} E_i \cdot 3^{i-1} \qquad (2)$$

To exemplify the algorithm, assume that the neighborhood of a pixel has the
following setting: V = {65, 253, 173, 65, 69, 56, 228, 55, 215}, which is converted to the
texture unit TU = {2, 2, 1, 2, 0, 2, 0, 2}, whose decimal value for the central pixel is
4931. By doing the same for each pixel a transformation of the image is obtained, where
the intensity of each pixel varies from 0 to $3^8 - 1$, thus making it possible to construct a
relative frequency vector for each possible value $N_{TU}$ and associate this vector with the
known image under classification [chen He and Wang 1990].
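
The worked example above can be reproduced with the short sketch below, which follows Equations 1 and 2. It assumes, as the example does, that the first value of V is the pixel every other value is compared with; the function names are ours.

```python
def texture_unit(neighborhood):
    """Equation 1: compare each of the 8 neighbors with the first value of V."""
    reference, neighbors = neighborhood[0], neighborhood[1:]
    unit = []
    for value in neighbors:
        if value < reference:
            unit.append(0)
        elif value == reference:
            unit.append(1)
        else:
            unit.append(2)
    return unit


def texture_unit_number(unit):
    """Equation 2: read the texture unit as a base-3 number (E_1 has weight 3^0)."""
    return sum(e * 3 ** i for i, e in enumerate(unit))


V = [65, 253, 173, 65, 69, 56, 228, 55, 215]
TU = texture_unit(V)
print(TU, texture_unit_number(TU))
```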
However, there are other ways to extract textural characteristics from an image.
In statistical approaches to texture analysis, the starting point is the extraction of image
characteristics from the neighborhood around the pixel. An algorithm analogous to that
used by He and Wang [chen He and Wang 1990] is the Local Binary Pattern
(LBP) algorithm, which extracts information about changes in brightness intensity around
a pixel; this is done by applying Equation 3 to the pixels around the central pixel, clockwise.
In the equation, $c_{tj}$ is the central pixel and $c_{yx}$ are the pixels in the neighborhood
[Machicao et al. 2018].

$$s(c_{tj} - c_{yx}) = \begin{cases} 0, & c_{tj} - c_{yx} \geq 1 \\ 1, & c_{tj} - c_{yx} < 1 \end{cases} \qquad (3)$$

The neighborhood binary pattern is converted to a decimal value that is assigned to
the central pixel in the converted image. Table 2 exemplifies the LBP algorithm applied
over a central pixel with intensity 39 (division A). The neighborhood is converted to a
binary pattern (division B), which is converted to a decimal value starting from the upper-left
value of division B and rotating clockwise, resulting in the value 242, which is placed in the
same position as the central pixel (division C).

Table 2. Application of the LBP algorithm around a pixel with intensity 39 in grayscale.

       103  109  188
A       34   39  184
       205   38   28

         1    1    1
B        0         1
         1    0    0

C            242
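
The computation in Table 2 can be sketched as below. The reading order (clockwise from the upper-left neighbor, first bit as the most significant one) is our assumption, chosen because it reproduces the value 242 of the table; the function name `lbp_value` is ours.

```python
def lbp_value(window):
    """Apply Equation 3 to a 3x3 window and return (bits, decimal value)."""
    center = window[1][1]
    # Neighbor positions (row, column), clockwise from the upper-left corner.
    order = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]
    bits = [0 if center - window[row][col] >= 1 else 1 for row, col in order]
    value = 0
    for bit in bits:
        value = (value << 1) | bit   # the first bit read is the most significant
    return bits, value


window = [
    [103, 109, 188],
    [34, 39, 184],
    [205, 38, 28],
]
bits, value = lbp_value(window)
print(bits, value)
```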

From the resulting image it becomes possible to generate a histogram to characterize
the original image by means of Equation 4, where k ranges over the
possible gray levels; with radius 1, k may vary between 0 and 255 ($2^8$ gray levels). Here,
q(x, y) = 1 if x = y, and 0 otherwise.

$$H(k) = \sum_{i=0}^{I-1} \sum_{j=0}^{J-1} q(LBP_{p,r}(i, j), k) \qquad (4)$$

It is necessary to note what happens when Equation 3 is applied to a binary image. If the
central pixel is 1, the LBP algorithm does not change the neighborhood binary pattern, since
a neighbor in state 0 yields 0 and a neighbor in state 1 yields 1. If the central pixel is 0,
the difference is always smaller than 1, so that the binary pattern is 1 for all pixels. As a result,
all 1s of the temporal evolution are replaced by the binary pattern of the neighborhood,
while the 0s are all replaced by the same value, formed only by 1s.
The Gray-level Co-occurrence Matrix is another algorithm used to extract image
textures. In this case, two pixels are related by adopting an offset from position (x, y), with
an angle α, and constructing a square matrix of dimension G, the number of gray levels present
in the neighborhood, thus forming P(i, j|∆x, ∆y), as in Equation 5, with W and Q being
defined, respectively, by Equations 6 and 7.

$$P(i, j \mid \Delta x, \Delta y) = W \times Q(i, j \mid \Delta x, \Delta y) \qquad (5)$$

$$W = \frac{1}{(M - \Delta x)(N - \Delta y)} \qquad (6)$$

$$Q(i, j \mid \Delta x, \Delta y) = \sum_{n=1}^{N-\Delta y} \sum_{m=1}^{M-\Delta x} A \qquad (7)$$

Here, A = 1 if f(m, n) = i and f(m + ∆x, n + ∆y) = j, and A = 0 otherwise. One pixel is chosen,
the other is defined at a given distance and a certain angle from it, and the
process is repeated for each pixel of the image. This algorithm measures how much two
pixels are related, by measuring how often a given pair of gray levels occurs at the two positions
of the image [Albregtsen 2008].
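
A minimal sketch of Equations 5 to 7 follows, assuming a non-negative offset (∆x, ∆y), an image stored as a list of rows, and gray levels given as integers from 0 to G − 1. The function name `glcm` and the tiny test image are ours.

```python
def glcm(image, dx, dy, levels):
    """Normalized co-occurrence matrix P(i, j | dx, dy) for dx, dy >= 0."""
    n_rows, n_cols = len(image), len(image[0])   # N rows and M columns
    counts = [[0] * levels for _ in range(levels)]
    for n in range(n_rows - dy):
        for m in range(n_cols - dx):
            i = image[n][m]              # gray level at the reference pixel
            j = image[n + dy][m + dx]    # gray level at the offset pixel
            counts[i][j] += 1            # accumulates Q(i, j | dx, dy)
    w = 1.0 / ((n_cols - dx) * (n_rows - dy))    # Equation 6
    return [[w * count for count in row] for row in counts]


image = [
    [0, 0, 1],
    [0, 1, 1],
]
P = glcm(image, dx=1, dy=0, levels=2)
print(P)
```

With this horizontal offset the four pixel pairs are (0, 0), (0, 1), (0, 1) and (1, 1), so the entries of `P` sum to 1.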
From this information about the texture one can analyze the image with a focus
on several objectives, where classification is one of them. Section 3.1 presents our pro-
posal for extracting texture information in the binary images associated with the temporal
evolutions of the CAs.

3.1. Texture analysis and classification of cellular automata


In He and Wang [chen He and Wang 1990], satellite images were classified by associating
them with the relative frequency vector of the neighborhood configurations of a pixel.
With the algorithm presented there, it was possible to obtain an average accuracy of 98.4%
and 99.6% in the classification, depending on the group of images used.
From a database containing temporal evolutions and their dynamical classes,
da Silva et al. [da Silva et al. 2016] proposed to extract characteristics of the temporal
evolution using a variation of the Local Binary Pattern Variance (LBPV) algorithm
[Guo et al. 2010], thus creating a characteristic vector with 10 components. They also
proposed the use of the Fourier spectrum [da Silva et al. 2016] in the extraction of characteristics,
from which it was possible to create vectors with 64 components. The classification
was then made by creating a series of characteristic vectors and comparing
the results with the outputs of the k-nearest neighbor classification algorithm (k-NN)
[Cover and Hart 2006]. Considering only the single closest neighbor (i.e., 1-NN), the elementary
space was classified using LBPV and the Fourier spectrum, which led to an
accuracy of 96.35% when the LBPV was used and 99.42% for the Fourier spectrum.
However, the temporal evolution of the CAs discussed here already presents a
binary configuration in the neighborhood of a pixel. Notice that taking into account the
color levels of the binary temporal evolution is much simpler than the approach used in
the satellite images in [chen He and Wang 1990], as it is not required to extract binary
patterns, since they are already present in the vicinity of the pixel. This perception is the
basis of the proposed classification described in Section 3.2.

3.2. An approach to classification of one-dimensional binary cellular automata using texture analysis

Section 3 discussed texture analysis in images and the role that the primitives have in the
formation of textures. Here, we propose to extract information from the neighborhood
of the pixel (cell), similarly to the LBP algorithm, but without the need to extract binary
patterns, since the temporal evolution is binary; furthermore, the neighborhood configurations
around the central cell can be readily converted to decimal values, resulting in an
image with 512 color levels, each color representing a neighborhood configuration. Calculating
the proportion of each color in the image and arranging the proportions in ascending order
of the colors’ decimal values allows us to construct a neighborhood configuration spectrum,
associating each color with its frequency in the image. By organizing the proportions
in the form of a vector, for an arbitrary number of temporal evolutions with their respective
spectra, the sum and normalization of the vectors result in a single spectrum
representative of the rule that generated the temporal evolutions. The vectors generated
for each of the 256 rules constitute the information required to predict the class of
the generating rule of a temporal evolution submitted to the classifier.
Detailing the algorithm, it can be said that the temporal evolution also has textures
and these are formed by primitives, namely, the neighborhood configuration of a cell over
time. In the case of the elementary CAs, we can set the vicinity of any cell within 3 time
steps t, t + 1 and t + 2, thus obtaining a region also defined by the radius around the cell.
Figure 5 exemplifies the idea; the primitive defined for the analysis of temporal evolution
is a square region of the temporal evolution, containing 9 cells.

Figure 5. Primitive highlighted in red in a section of a temporal evolution with
neighborhood containing 3 cells, observed for 3 time steps.

The configuration of the primitive can be converted to an integer according
to Equation 8, where $c_{tj}$ represents the cells of the primitive. In the case of ECAs, 512
different values are obtained, between 0 and $2^9 - 1 = 511$. Thus, like other characteristic
extraction algorithms, an image is generated where the value of each original pixel is replaced
by the value extracted from the neighborhood. In the proposed approach, instead
of the cell state, the decimal value for the neighborhood binary configuration is used.
From this object we define $\vec{s}_z = (s_0, s_1, s_2, \ldots, s_{511})$, which represents the spectrum
referring to the temporal evolution z. For instance, running elementary rule 0 for 3I
time steps, with 2I transients, over a lattice of size J, yields a temporal evolution of
dimensions I × J containing only the primitive 0 and, as a result, $\vec{s}_z = (IJ, 0, 0, \ldots, 0)$. For
other rules, the sum for each component in $\vec{s}_z$ will be distinct.

$$s = \sum_{t=t_0}^{t_0+2r} \sum_{j=j_0}^{j_0+2r} c_{tj} \cdot 2^{(j-j_0)+(t-t_0)\cdot(2r+1)} \qquad (8)$$
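
Equation 8 and the resulting spectrum can be sketched as below. Two choices here are our assumptions rather than details stated in the text: the primitive wraps around the lattice horizontally (periodic boundary), and only time windows that fit entirely inside the temporal evolution are counted. The function names are also ours.

```python
def primitive_value(evolution, t0, j0, r=1):
    """Equation 8: integer value of the primitive whose first cell is (t0, j0)."""
    side = 2 * r + 1
    width = len(evolution[0])
    value = 0
    for t in range(t0, t0 + side):
        for j in range(j0, j0 + side):
            cell = evolution[t][j % width]   # assumed periodic in space
            value += cell << ((j - j0) + (t - t0) * side)
    return value


def spectrum(evolution, r=1):
    """Count how many times each primitive occurs in a temporal evolution."""
    side = 2 * r + 1
    counts = [0] * (2 ** (side * side))
    for t0 in range(len(evolution) - side + 1):   # only complete time windows
        for j0 in range(len(evolution[0])):
            counts[primitive_value(evolution, t0, j0, r)] += 1
    return counts


blank = [[0] * 4 for _ in range(5)]   # e.g. what rule 0 produces
s = spectrum(blank)
print(s[0], sum(s[1:]))
```

For the all-zero evolution only component 0 of the spectrum is nonzero, in line with the rule 0 example in the text.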

The vector $\vec{t}_n$ (Equation 9), being the sum of the histograms $\vec{s}_z$, can be normalized
by T (Equation 10), resulting in the vector of frequencies associated with the generating
rule n of the Z temporal evolutions (Equation 11).

$$\vec{t}_n = \sum_{z=1}^{Z} \vec{s}_z \qquad (9)$$

$$T = \sum_{i=0}^{511} t_{n,i} \qquad (10)$$

$$\vec{f}_n = \frac{1}{T} \cdot \vec{t}_n \qquad (11)$$

The set $F = \{\vec{f}_0, \vec{f}_1, \vec{f}_2, \ldots, \vec{f}_{255}\}$ is formed by the spectra associated with each
rule of the elementary space, which are then used in the classification. However, determining
the number of initial conditions Z required for $\vec{f}_n$ to be a representative spectrum
of rule n is a problem in itself. Moreover, the total number of possible
initial conditions poses a computationally hard problem due to the size of a
relevant sample. A lattice with 100 cells allows $2^{100}$ distinct initial conditions, so that
even 1% of this value means a number of initial conditions of the order of $10^{28}$; naturally,
generating spectra at this order of magnitude would demand an impractical amount of time on present
computer systems.
Since fixing the quantity for a relevant sample implies such an impracticable solution,
it is possible instead to establish a stop condition for generating the spectra. As each vector $\vec{s}_z$ is
added to $\vec{t}_n$, it contributes less and less, and this decreasing contribution can be used to
decide that additional contributions of $\vec{s}_z$ are irrelevant, thus leading to the suspension of
the process. Thus, a vector $\vec{p}$, Equation 12, formed by the ratio of the components of the
vectors $\vec{s}_z$ and $\vec{t}_n$, has an increasingly smaller norm as each $\vec{s}_z$ is added to $\vec{t}_n$. Since the initial
conditions are random, it is not possible to predict how each component varies over time,
but it is possible to assert that the norm of $\vec{p}$ should decrease over time. By establishing
a minimum desired error (e), the process of summing the vectors is interrupted when
$|\vec{p}| \leq e$.

$$\vec{p} = p(\vec{s}_z, \vec{t}_n) = \left( \frac{s_{0,z}}{t_{0,n}}, \frac{s_{1,z}}{t_{1,n}}, \ldots, \frac{s_{511,z}}{t_{511,n}} \right) \qquad (12)$$

However, it is necessary to note that when using random initial conditions to generate
the temporal evolutions of a rule, the norm obtained can be less than or equal to the
error with a very small sample; so it is necessary to ensure not only that a sample of an
arbitrary minimum size is used for all rules but also that the convergence of the norm
is smooth. To resolve both points, the norm of a single instance of $\vec{p}$ was not adopted, but
rather a moving average of the norms over that arbitrary minimum number of initial conditions, so that
the process is interrupted when the moving average becomes less
than or equal to the error e.
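
The stop condition can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the ratio vector is computed after each spectrum has been added to the running sum, components whose running sum is still 0 contribute 0 to the ratio, and the input is any non-empty iterable of spectra. The names `accumulate_spectra`, `error` and `window` are ours.

```python
import math
from collections import deque


def accumulate_spectra(spectra, error, window):
    """Sum spectra until the moving average of |p| drops to `error` or below.

    Returns the normalized spectrum (Equation 11) and the number of spectra used.
    """
    total = None
    recent_norms = deque(maxlen=window)   # norms of the last `window` spectra
    used = 0
    for s in spectra:
        if total is None:
            total = [0] * len(s)
        total = [t + x for t, x in zip(total, s)]            # Equation 9
        used += 1
        ratios = [x / t if t else 0.0 for x, t in zip(s, total)]  # Equation 12
        recent_norms.append(math.sqrt(sum(p * p for p in ratios)))
        # Stop only after `window` spectra, using the moving average of the norms.
        if len(recent_norms) == window and sum(recent_norms) / window <= error:
            break
    norm = sum(total)                                        # Equation 10
    return [t / norm for t in total], used


# Toy input: the same two-component spectrum repeated, only to exercise the loop.
frequencies, used = accumulate_spectra([[1, 1]] * 20, error=0.2, window=3)
print(frequencies, used)
```

Using the window length as the minimum sample size ties the two requirements of the text together: no rule can stop before `window` spectra have been accumulated, and a single unusually small norm cannot end the process.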
Given a spectrum from set F, the rule used to generate it is known and, since the
classification of the ECAs is well known, the association with its class is direct. So, to classify the
dynamical behavior of a temporal evolution, one just takes its spectrum and chooses the spectrum from
set F with the smallest Euclidean distance to it. Since the spectra in F carry
information about the rule and its class, the classification is direct.
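
The nearest-spectrum decision can be sketched as below. The reference vectors are toy three-component values made up for the example (real spectra have 512 components), and the function names and the two dictionaries are ours; the classes of rules 0 and 30 follow Wolfram's classification.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def classify(spectrum, reference, rule_class):
    """Return (closest rule, its class) for the spectrum of one temporal evolution."""
    total = sum(spectrum)
    frequencies = [x / total for x in spectrum]   # same normalization as Equation 11
    rule = min(reference, key=lambda n: euclidean(frequencies, reference[n]))
    return rule, rule_class[rule]


# Hypothetical reference set: rule number -> normalized spectrum (toy values).
reference = {0: [1.0, 0.0, 0.0], 30: [0.3, 0.4, 0.3]}
rule_class = {0: 1, 30: 3}

print(classify([10, 0, 0], reference, rule_class))
print(classify([3, 5, 2], reference, rule_class))
```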

4. The classification process and the results

The method proposed in Section 3.2 was applied to each elementary rule, on lattices containing
100 cells, applying the rule for 300 time steps with a transient of 200 time steps.
Temporal evolutions were generated from random initial conditions, until the error threshold
of $10^{-5}$ was reached for each rule, and from this set of temporal evolutions a neighborhood
configuration spectrum was defined for each rule, stored in the set F.
Figure 6 shows a distance diagram of set F’s elements, with gray intensity indicating
the distance between the spectra $(n_i, n_j)$, referring to the rules (n) of the elementary
space. Figure 7 shows how even for equivalent rules the spectra are different. These characteristics
of F help to obtain low ambiguity when comparing spectra of the temporal
evolutions using the Euclidean distance, the metric used by the classifier.

Figure 6. Diagram of the distances between the spectra of the rules of the ele-
mentary space. Each pixel (x, y) is the distance information between rules x and
y; the largest distances are represented by darker tones and the lower ones by
light tones.

[Plot: Comparison between the spectra of ECA rules 54 and 147. Series: 54 and 147; horizontal axis from 0 to 500; vertical axis from 0 to 0.12.]
Figure 7. Comparison of the generated spectra for rules dynamically equivalent
to rule 54, using random initial conditions for each one.

The quality of the described classifier was evaluated using the confusion matrix
(C) containing the 4 classes of Wolfram [Wolfram 1984]. From this matrix it is possible to
count the predictions that are true positives (TP), those where there is a correspondence
between the predicted class and the real one. The true negatives (TN) are those correctly
predicted not to be of a class, false positives (FP) are predictions made for a class
distinct from the real one, while false negatives (FN) are those predicted erroneously as not
being of the class that would be correct. Table 3 summarizes how to calculate such variables,
where x represents the class to be measured.

Table 3. Confusion matrix - Counting predictions.

True Positive     $TP_x = c_{xx}$

True Negative     $TN_x = \sum_{i=1}^{4} \sum_{j=1}^{4} c_{ij}$, if $i \neq x$ and $j \neq x$

False Positive    $FP_x = \sum_{i=1}^{4} \sum_{j=1}^{4} c_{ij}$, if $i \neq x$ and $j = x$

False Negative    $FN_x = \sum_{i=1}^{4} \sum_{j=1}^{4} c_{ij}$, if $i = x$ and $j \neq x$

From these counts, the global accuracy (GA) of the classifier is calculated. In
relation to the classes, the true positive rate (TPR), true negative rate (TNR), positive
predictive value (PPV), negative predictive value (NPV) and the accuracy (AC) of
each class can also be worked out. These rates are computed per class, and the closer to 1 the better
[Fawcett 2006] [Metz 1978]. Table 4 indicates how to calculate such metrics, where x
represents the class to be measured.

Table 4. Confusion matrix - Maximizable indices.

Global accuracy              $GA = \sum_{i=1}^{4} c_{ii} \times \left(\sum C\right)^{-1}$

True positive rate           $TPR_x = \frac{TP_x}{P_x} = \frac{TP_x}{TP_x + FN_x} = 1 - FNR_x$

True negative rate           $TNR_x = \frac{TN_x}{N_x} = \frac{TN_x}{TN_x + FP_x} = 1 - FPR_x$

Positive predictive value    $PPV_x = \frac{TP_x}{TP_x + FP_x}$

Negative predictive value    $NPV_x = \frac{TN_x}{TN_x + FN_x}$

Accuracy                     $AC_x = \frac{TP_x + TN_x}{P_x + N_x} = \frac{TP_x + TN_x}{TP_x + TN_x + FP_x + FN_x}$

The false positive rate (FPR), false negative rate (FNR), false discovery rate
(FDR) and false omission rate (FOR) are metrics that indicate the precision of the
classifier: the nearer to 0 these rates are, the better [Banerjee and Bhadury 2009] [Storey 2002]
[Patil et al. 2018]. Table 5 summarizes how to calculate such metrics, where x represents
the class to be measured.

Table 5. Confusion matrix - Minimizable indices.

False positive rate          $FPR_x = \frac{FP_x}{N_x} = \frac{FP_x}{FP_x + TN_x} = 1 - TNR_x$

False negative rate          $FNR_x = \frac{FN_x}{P_x} = \frac{FN_x}{FN_x + TP_x} = 1 - TPR_x$

False discovery rate         $FDR_x = \frac{FP_x}{FP_x + TP_x} = 1 - PPV_x$

False omission rate          $FOR_x = \frac{FN_x}{FN_x + TN_x} = 1 - NPV_x$

Negative predictive value    $NPV_x = \frac{TN_x}{TN_x + FN_x} = 1 - FOR_x$

The best way to use the strategy described in Section 3.1 was to generate random initial
conditions for each ECA rule. The confusion matrix of Table 6, with the actual
and predicted classes in rows and columns, respectively, shows the counts of correct and
incorrect predictions when classifying the elementary space. Table 7 contains the metrics derived
from the confusion matrix.

Table 6. Confusion matrix - Elementary space classification.


Confusion matrix (rows: actual class; columns: predicted class)
Classes        1        2        3        4
1          12456        0        0        0
2            276   110881       28        3
3              0        3    22633        0
4              0      147        0     8848
Global accuracy: 0.9971
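
As a check on the value above, the global accuracy can be recomputed from the matrix by dividing the sum of the diagonal by the sum of all entries:

```latex
\[
GA = \frac{12456 + 110881 + 22633 + 8848}{155275}
   = \frac{154818}{155275}
   \approx 0.9971
\]
```
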
Table 7. Metrics from confusion matrix - Elementary space classification.
Results
Class                            1        2        3        4
True positive                12456   110881    22633     8848
True negative               142543    43937   132611   146277
False positive                 276      150       28        3
False negative                   0      307        3      147
True positive rate          1.0000   0.9972   0.9997   0.9837
True negative rate          0.9981   0.9966   0.9998   1.0000
Positive predictive value   0.9783   0.9986   0.9988   0.9997
Negative predictive value   1.0000   0.9931   1.0000   0.9990
Accuracy                    0.9982   0.9971   0.9998   0.9990
False positive rate         0.0019   0.0034   0.0002   0.0000
False negative rate         0.0000   0.0028   0.0001   0.0163
False discovery rate        0.0217   0.0014   0.0012   0.0003
False omission rate         0.0000   0.0069   0.0000   0.0010

At this point, we compare the entropy-based proposal of [Wuensche 1998] with ours,
which is based on the spectra of neighborhood configurations. To do so, we adapted the
texture classification strategy so that the entropy variation over time, rather than the
neighborhood configurations, was used to classify the ECA. Table 8 summarizes the
results. They clearly show the lower accuracy of the entropy-based approach compared
with our method. Notice also the inability of that approach to correctly classify class 3
and class 4 rules.
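
To make the notion of entropy variation over time concrete, the sketch below computes one entropy value per time step of an ECA temporal evolution. It assumes the input-entropy measure usually associated with [Wuensche 1998], that is, the Shannon entropy of the frequencies with which the 8 neighborhood configurations are looked up at each step; the exact feature used in the comparison may differ. The function names, lattice width, number of steps and seed are illustrative choices, and periodic boundary conditions are assumed.

```python
import numpy as np

def eca_step(row, rule):
    """One synchronous ECA update (periodic boundaries); also returns the
    neighborhood index (0..7) looked up by each cell."""
    left, right = np.roll(row, 1), np.roll(row, -1)
    idx = 4 * left + 2 * row + right
    table = (rule >> np.arange(8)) & 1   # Wolfram rule number -> lookup table
    return table[idx], idx

def input_entropy(idx):
    """Shannon entropy (bits) of the neighborhood lookup frequencies."""
    freq = np.bincount(idx, minlength=8) / idx.size
    nz = freq[freq > 0]
    return float(-(nz * np.log2(nz)).sum())

def entropy_series(rule, width=200, steps=200, seed=0):
    """Entropy per time step, starting from a random initial condition."""
    rng = np.random.default_rng(seed)
    row = rng.integers(0, 2, size=width)
    series = []
    for _ in range(steps):
        row, idx = eca_step(row, rule)
        series.append(input_entropy(idx))
    return np.array(series)

for rule in (0, 30):
    s = entropy_series(rule)
    print(rule, round(float(s.mean()), 3), round(float(s.std()), 3))
```

For rule 0 the series drops to zero after the first step, since every cell then looks up the same neighborhood, whereas for rule 30 it stays close to the 3-bit maximum. A classifier built on such a series only sees how these values change over time, not which spatial patterns produce them.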

Table 8. Confusion matrix - classification by entropy - elementary space.


Confusion matrix (rows: actual class; columns: predicted class)
Classes        1        2        3        4
1          12000        0        0        0
2            446    97531       12        0
3              0    12120      871        0
4              0     4791      204        0
Global accuracy: 0.8627

Table 9 shows details of the entropy-based classification and makes evident its
lack of efficacy for classes 3 and 4, whose true positive values are far from ideal.
The apparent contradiction of high individual accuracy values for these classes is due
to the true negative count that enters the accuracy equation.
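
A worked example with the class 4 counts of Table 9 (TP = 0, TN = 122980, FP = 0, FN = 4995) shows how the accuracy can remain high even though no class 4 rule is recognized:

```latex
\[
AC_4 = \frac{TP_4 + TN_4}{TP_4 + TN_4 + FP_4 + FN_4}
     = \frac{0 + 122980}{0 + 122980 + 0 + 4995}
     = \frac{122980}{127975} \approx 0.9610,
\qquad
TPR_4 = \frac{TP_4}{TP_4 + FN_4} = \frac{0}{4995} = 0
\]
```
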
Table 9. Metrics from confusion matrix - classification by entropy - elementary
space.
Results
Class                            1        2        3        4
True positive                12000    97531      871        0
True negative               115529    13075   114768   122980
False positive                 446    16911      216        0
False negative                   0      458    12120     4995
True positive rate          1.0000   0.9953   0.0670   0.0000
True negative rate          0.9962   0.4360   0.9981   1.0000
Positive predictive value   0.9642   0.8522   0.8013   0.0000
Negative predictive value   1.0000   0.9662   0.9045   0.9610
Accuracy                    0.9965   0.8643   0.9036   0.9610
False positive rate         0.0038   0.5640   0.0019   0.0000
False negative rate         0.0000   0.0047   0.9330   1.0000
False discovery rate        0.0358   0.1478   0.1987   0.0000
False omission rate         0.0000   0.0338   0.0955   0.0390

The values presented in Tables 8 and 9 indicate that the entropy variation of
temporal evolutions is inefficient for classification, since it showed a high imprecision
rate for the ECA. The strategy using spectra, in contrast, showed great capacity to
classify the elementary space, so the same strategy was applied to the radius-1.5 space.
It was first necessary to perform a visual classification of the entire space, so that it
could serve as a reference when applying the ECA-based classification methodology to
this larger space; the data is available at https://bit.ly/2VwVnhV. For the predictions we
used the same set F as before. However, the accuracy and performance observed for the
elementary space were not reproduced for the radius-1.5 space. We noticed that the
distance between spectra is strongly influenced by the frequency with which each
neighborhood configuration occurs in the temporal evolution, which ends up causing some
rules to be misclassified.
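
The effect of configuration frequencies on the distance can be illustrated with a toy example. The sketch below is not the implementation used in this work: it assumes that the spectrum is the relative frequency of each 3x3 binary window of the image (512 configurations, which is consistent with the horizontal range of Figure 9) and that an image receives the label of the reference spectrum nearest in Euclidean distance. The function names, the normalization and the two toy references are assumptions made for illustration; they are not the set F.

```python
import numpy as np

def spectrum(image):
    """Relative frequency of each 3x3 binary configuration (512 bins)."""
    img = np.asarray(image, dtype=np.int64)
    h, w = img.shape
    codes = np.zeros((h - 2, w - 2), dtype=np.int64)
    for dy in range(3):
        for dx in range(3):
            # Append one window cell as the next binary digit of the code.
            codes = 2 * codes + img[dy:dy + h - 2, dx:dx + w - 2]
    counts = np.bincount(codes.ravel(), minlength=512)
    return counts / counts.sum()

def classify(image, references):
    """Label of the reference spectrum closest in Euclidean distance."""
    s = spectrum(image)
    return min(references, key=lambda item: np.linalg.norm(s - item[1]))[0]

# Toy references: a blank image (label 1) and vertical stripes (label 2).
blank = np.zeros((20, 20), dtype=int)
stripes = np.zeros((20, 20), dtype=int)
stripes[:, ::2] = 1
references = [(1, spectrum(blank)), (2, spectrum(stripes))]

# A mostly blank image containing a single vertical line.
query = np.zeros((20, 20), dtype=int)
query[:, 10] = 1
print(classify(query, references))
```

In this toy setting the single vertical line changes only a small fraction of the windows, so the all-zero configuration dominates the spectrum and the image receives the label of the blank reference. This is the same kind of frequency effect that the text describes for rules with scarce vertical lines.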
Rule 1572 of the radius-1.5 space, for example, generates temporal evolutions
with few vertical lines, which leads the rule to be classified as class 1 when in fact
it is class 2. The temporal evolution of Figure 8 yields a spectrum that is closest to
that of ECA rule 224, as shown in Figure 9, causing a classification error.
Figure 8. Time evolution of rule 1572, classified as class 1 when the correct class is 2.

[Figure 9 plot, titled "Spectrum comparison": spectra of rule 224 (r: 1.0) and rule 1572 (r: 1.5); vertical axis from 0 to 1, horizontal axis from 0 to 500.]

Figure 9. Comparison of the spectra of radius-1.5 rule 1572 and ECA rule 224,
showing that they cannot be told apart using the Euclidean distance.

Such imprecision was repeated for other rules, causing considerable impact on the
accuracy of classifying the radius-1.5 space, as can be seen in the confusion matrix and
other metrics resulting from the process, respectively in Tables 10 and 11.
Table 10. Confusion matrix - Space with radius 1.5
Confusion matrix (rows: actual class; columns: predicted class)
Classes        1        2        3        4
1           2510      206        0        0
2              0    36670      665      879
3              2     9379     9900     3535
4              0     1132      206      452
Global accuracy: 0.7558

Table 11. Metrics from confusion matrix - Space with radius 1.5
Results
Class                            1        2        3        4
True positive                 2510    36670     9900      452
True negative                62818    16605    41849    59332
False positive                   2    10717      871     4414
False negative                 206     1544    12916     1338
True positive rate          0.9242   0.9596   0.4339   0.2525
True negative rate          0.9999   0.6078   0.9796   0.9308
Positive predictive value   0.9992   0.7738   0.9191   0.0929
Negative predictive value   0.9967   0.9149   0.7642   0.9779
Accuracy                    0.9968   0.8129   0.7896   0.9122
False positive rate         0.0000   0.3922   0.0204   0.0692
False negative rate         0.0820   0.0326   1.1991   0.2750
False discovery rate        0.0010   0.2262   0.0809   0.9071
False omission rate         0.0033   0.0851   0.2358   0.0221

Comparing the results for the elementary space with those for the radius-1.5 space,
both obtained with spectra generated from the ECA rules, leads to the conclusion that
applying the classifier to larger spaces does not give results as good as hoped for, due
to the local character of the texture analysis. However, notwithstanding the likely
inaccuracies of using a visual classification as reference, the automatic process
identified the correct class in more than 70% of the cases, which is an interesting
percentage for a larger space. For even larger spaces, a greater reduction in accuracy is
to be expected, although the classifier may still be useful up to some limit in radius.

5. Concluding remarks
The classification approach proposed herein reached much higher accuracy than
[Wuensche 1998], as we could see, and accuracy similar to other works in the literature.
In particular, [da Silva et al. 2016] report an accuracy of 99.42% in the best case, while
this work reached 99.71%, a difference of less than one percentage point. However, the
smaller set of spectra used in this work demands much less computational effort when
predicting the class associated with the temporal evolution submitted to the classifier,
since the set used for comparison is much smaller than the one used by
[da Silva et al. 2016].
Applying the classifier to larger spaces revealed difficulties related to the local
character of the texture analysis. In addition to this technical aspect, the visual
classification of the radius-1.5 space offers a reference basis that did not previously
exist, even with the inaccuracies implicit in manual classification; notice, however,
that these same inaccuracies affect the measured accuracy for the space, which partly
explains the misclassifications. Nevertheless, the technique proved useful in obtaining a
first approximation of the classification of an unknown space. Also, our proposal is not
limited to any rule space, since any binary matrix can be submitted to the classifier,
and its accuracy can be improved if the set F is fed back with new spectra of temporal
evolutions not present when it was generated.
There is an important connection between this work and neural networks: in order
to classify an image using neural networks, it is necessary to extract features, and
texture analysis can be one of the extraction techniques. In convolutional neural
networks, feature extraction is done automatically and locally by the network itself,
as was done here by texture analysis.
An evaluation of how accuracy varies as the classifier is applied to larger spaces,
and an optimal way to feed back the process and improve accuracy, are important points
to address in the future. Correlating and comparing the performance observed here with
that obtained by a convolutional neural network is also interesting due to the similarities
between the two techniques, since both extract information locally to the pixels of the
images. Comparing the two approaches is certainly a tempting follow-up of the present
work.

6. Acknowledgements
We are grateful for financial support from: Instituto Presbiteriano Mackenzie; CAPES
- Brazil: STIC-AmSud project CoDANet 88881.197456/2018-01, and PrInt project no.
88887.310281/2018-00; and CNPq - Brazil.

References
[Acharya and Ray 2005] Acharya, T. and Ray, A. K. (2005). Image Processing - Principles
and Applications. Wiley-Interscience, New York, NY, USA.
[Albregtsen 2008] Albregtsen, F. (2008). Statistical texture measures computed from gray
level coocurrence matrices.
[Banerjee and Bhadury 2009] Banerjee, I. and Bhadury, T. (2009). Self-medication practice
among undergraduate medical students in a tertiary care medical college, West Bengal.
Industrial Psychiatry Journal, 18(2):127–131.
[chen He and Wang 1990] chen He, D. and Wang, L. (1990). Texture unit, texture spec-
trum, and texture analysis. IEEE Transactions on Geoscience and Remote Sensing,
28(4):509–512.
[Cover and Hart 2006] Cover, T. and Hart, P. (2006). Nearest neighbor pattern classification.
IEEE Trans. Inf. Theor., 13(1):21–27.
[Culik and Yu 1988] Culik, II, K. and Yu, S. (1988). Undecidability of CA classification
schemes. Complex Syst., 2(2):177–190.
[da Silva et al. 2016] da Silva, N. R., Baetens, J. M., da Silva Oliveira, M. W., Baets, B. D.,
and Bruno, O. M. (2016). Classification of cellular automata through texture analysis.
Information Sciences, 370-371:33 – 49.
[Fawcett 2006] Fawcett, T. (2006). An introduction to ROC analysis. Pattern Recogn. Lett.,
27(8):861–874.
[Green 1990] Green, D. G. (1990). Cellular automata models in biology. Math. Comput.
Model., 13(6):69–74.
[Guo et al. 2010] Guo, Z., Zhang, L., and Zhang, D. (2010). Rotation invariant texture
classification using LBP variance (LBPV) with global matching. Pattern Recogn.,
43(3):706–719.
[Machicao et al. 2018] Machicao, J., Ribas, L. C., Scabini, L. F., and Bruno, O. M. (2018).
Cellular automata rule characterization and classification using texture descriptors.
Physica A: Statistical Mechanics and its Applications, 497(C):109–117.
[Martinez 2013] Martinez, G. J. (2013). A note on elementary cellular automata classifica-
tion. J. Cellular Automata, pages 233–259.
[Metz 1978] Metz, C. E. (1978). Basic principles of ROC analysis. Seminars in
Nuclear Medicine, 8(4):283–298.
[Neumann 1966] Neumann, J. V. (1966). Theory of Self-Reproducing Automata. University
of Illinois Press, Champaign, IL, USA.
[Patil et al. 2018] Patil, S., Nemade, V., and Soni, P. K. (2018). Predictive modelling for
credit card fraud detection using data analytics. Procedia Computer Science, 132:385
– 395. International Conference on Computational Intelligence and Data Science.
[Sarkar 2000] Sarkar, P. (2000). A brief history of cellular automata. ACM Comput. Surv.,
32(1):80–107.
[Smith 1994] Smith, M. A. (1994). Cellular automata methods in mathematical physics.
Technical report, Cambridge, MA, USA.
[Storey 2002] Storey, J. D. (2002). A direct approach to false discovery rates.
[Wolfram 1984] Wolfram, S. (1984). Universality and complexity in cellular automata.
Physica D: Nonlinear Phenomena, 10(1-2):1–37.
[Wolfram 2002] Wolfram, S. (2002). A New Kind of Science. Wolfram Media Inc.,
Champaign, Illinois, United States.
[Wuensche 1998] Wuensche, A. (1998). Classifying cellular automata automatically. Work-
ing papers, Santa Fe Institute.
