
IJDIWC
International Journal of DIGITAL INFORMATION AND WIRELESS COMMUNICATIONS
ISSN 2225-658X (Online)
Volume 4 Issue 2
2014

TABLE OF CONTENTS
Original Articles

PAPER TITLE | AUTHORS | PAGES
A NEW APPROACH TO WIRELESS CHANNEL MODELING USING FINITE MIXTURE MODELS | Divya Choudhary, Aaron L. Robinson | 169
TOWARDS CARBON EMISSION REDUCTION USING ICT | Tiroyamodimo Mogotlhwane | 184
INTERFERENCE ANALYSIS AND SPECTRUM SENSING OF MULTIPLE COGNITIVE RADIO SYSTEMS | Sowndarya Sundar, M. Meenakshi | 191
A LICENSE MANAGEMENT SYSTEM FOR CONTENT SEPARATE DELIVERY OVER P2P NETWORK | Masaki Inamura, Keiichi Iwamura | 202
ROBUST NONLINEAR COMPOSITE ADAPTIVE CONTROL OF QUADROTOR | Bara Emran, Aydin Yesildirek | 213
A NEW ORTHOGONAL CRYPTOGRAPHIC SYSTEM FOR DATABASE SECURITY BASED ON CELLULAR AUTOMATA AND HASH ALGORITHM | Mohammad V. Malakooti, Ebrahim Akhavan Bazofti | 226
DESIGNING AND IMPLEMENTING BI-LINGUAL MOBILE DICTIONARY TO BE USED IN MACHINE TRANSLATION | Hassanin M. Al-Barhamtoshy, Fatimah M. Mujallid | 236
GROUP DECISION SUPPORT SYSTEM BASED ON ENHANCED AHP FOR TENDER EVALUATION | Fadhilah Ahmad, M Yazid M Saman, Fatma Susilawati Mohamad, Zarina Mohamad, Wan Suryani Wan Awang | 248
BUILDING AN ADVANCE DOMAIN ONTOLOGY MODEL OF INFORMATION SCIENCE (OIS) | Ahlam Sawsaa, Joan Lu | 258

International Journal of Digital Information and Wireless Communications (IJDIWC) 4(2): 169-183
The Society of Digital Information and Wireless Communications, 2014 (ISSN: 2225-658X)

A New Approach to Wireless Channel Modeling using Finite Mixture Models

Divya Choudhary¹ and Aaron L. Robinson²

¹ Department of Electrical and Computer Engineering, Christian Brothers University, Memphis, TN, USA
² Department of Electrical and Computer Engineering, University of Memphis, Memphis, TN, USA
¹ dchodhry@cbu.edu
² alrobins@memphis.edu

ABSTRACT
This paper presents a new approach to modeling a wireless channel using finite mixture models (FMMs). Instead of the conventional approach of using non-mixture (single) probability distribution functions, FMMs are used here to model the channel impulse response amplitude statistics. To demonstrate this, an FMM-based model of ultrawideband (UWB) channel amplitude statistics is developed. In this research, finite mixture models composed of combinations of constituent PDFs such as Rayleigh, Lognormal, Weibull, Rice and Nakagami are used for modeling the channel amplitude statistics. The use of FMMs is relevant because of their ability to characterize multimodality in the data. The stochastic expectation maximization (SEM) technique is used to estimate the parameters of the FMMs. The resultant FMMs are then compared to one another and to non-mixture models using model selection techniques such as Akaike's Information Criterion (AIC). Results indicate that models composed of a mixture of Rayleigh and Lognormal distributions consistently provide good fits for most of the impulses of the UWB channel. Other model selection techniques such as Minimum Description Length (MDL) and Accumulative Predictive Error (APE) also confirmed this finding. This selection of an FMM based on Rayleigh and Lognormal distributions holds for both the industrial and the university environment channel data.

KEYWORDS
Ultrawideband Communication, Small Scale Fading,
Wireless Channel Modeling, Finite Mixture Models,
Stochastic Expectation Maximization.

1 INTRODUCTION
At various stages of design, the performance of a communication system needs to be evaluated in the relevant communication channel. This evaluation can be completed most accurately through extensive field tests in environments similar to the ones in which the final product will be fielded.
However, field data collections can be expensive
and tedious. A cost-effective alternative is to use
software simulations of the channel [1]. Accurate
channel models are required to develop effective
software simulations. Signals traveling through the
channel experience different physical phenomena
such as diffraction, scattering and reflection. These
phenomena cause changes in the strength of the
signal as it travels from the transmitter to the
receiver. Channel propagation models predict the
average signal strength and its variability at a given
distance from the transmitter [1]. These models can
be divided into large scale and small scale models.
Large scale models characterize changes in signal
power over large distances, while small scale
models predict the rapid fluctuations in signal
strength over short distances.
In small scale fading, the received signal is given
by the vectorial addition of delayed copies of the
actual signal (multipaths). Based on this effect,
small scale fading can be characterized by the
channel impulse response. The impulse response
includes a complex gain term, an amplitude or
magnitude component which affects the signal
strength, and a phase component accounting for the
phase shift in the signal. It is well established that
the phase variations are uniformly distributed
between 0 and 2π [2]. The contribution of this
research is limited to small scale models and
specifically to the amplitude statistics of the
impulse response.
The impulse response of a wireless channel can be
represented as a discrete time series of time shifted
169

International Journal of Digital Information and Wireless Communications (IJDIWC) 4(2): 169-183
The Society of Digital Information and Wireless Communications, 2014 (ISSN: 2225-658X)
and weighted delta functions (impulse train). The
impulse response amplitudes are commonly
modeled using non-mixture models such as
Rayleigh, Lognormal, Weibull, Rice and Nakagami
probability distribution functions for specific
channel conditions. The Rayleigh distribution is widely used for amplitude statistics modeling due to its elegant theoretical explanation and occasional empirical justification [2]. It has been observed that Rayleigh
provides good fit when there is no single dominant
multipath present among the arriving multipaths
[3],[4],[5]. The Rice distribution is used when there is a dominant multipath present among the various arriving multipaths, i.e., when there is a line-of-sight between the transmitter and the receiver [6],[7]. The Nakagami
distribution has been observed to provide good data
fit when the multipaths arrive with large time delay
spreads [8],[9].
Even though Weibull and
Lognormal distributions lack theoretical justification
for their use in channel amplitude modeling, they
provide excellent data fit in many cases
[10],[11],[12],[13],[14],[15],[16]. In this research,
we propose a new approach to model the impulse
response amplitude statistics of a wireless
communication channel using finite mixtures.
Finite mixtures describe the probability distribution
of a random variable as a weighted sum of
component probability distribution functions. The
development of a FMM based model is
demonstrated using the ultrawideband impulse
response amplitude data. The objective behind
using FMMs is to harness the individual
contributions from multiple component statistical
channel characterizations to describe the existence
of a multi-modal distribution of the data.
Section 2 of the paper discusses basics of small
scale channel fading. Section 3 provides basics
about UWB communication and existing channel
models for UWB channels. Section 4 describes the
UWB data collection methodologies and UWB data
used in this paper for channel modeling. Section 5
presents background on the topic of FMMs and
parameter estimation for FMMs and non-mixture
models. Model selection techniques for picking the
most appropriate channel model for UWB
communication are described in section 6. Results
of UWB channel modeling are presented in section 7, while section 8 discusses conclusions of the research conducted in this paper.
2 SMALL SCALE CHANNEL FADING
Small scale fading is caused by multipath
propagation and Doppler spread. Multipath
components are copies of the original signal that
reach the receiver after reflection from the ground
and various surrounding objects. They reach the
receiver at different time delays and phase shifts
and get added vectorially. The random phase shifts
experienced by the individual multipath
components determine the degree of attenuation or
amplification in the received signal.
The impulse response of a channel can describe the
small scale variations of the received signal based
on the multipath characteristics of the channel. The
multiple duplicates of the transmitted signal reach
the receiver with different time delays after
traversing different paths. Due to limited temporal
resolution, practical receivers cannot isolate
individual multipath components. Thus, several multipaths combine constructively and destructively within a single resolution bin of the receiver, causing the received signal to fluctuate.
The received signal or channel output y(t) can be expressed as the convolution of the transmitted signal or channel input x(t) with the impulse response of the channel h(t,τ) as represented in equations 1 and 2.

$$y(t) = x(t) * h(t,\tau) \qquad (1)$$

$$y(t) = \int_{-\infty}^{\infty} x(\tau)\, h(t, t-\tau)\, d\tau \qquad (2)$$

The multipath signals arriving at the receiver are a series of attenuated, time delayed, and phase shifted replicas of the transmitted signal. Thus a functional form of the impulse response explaining such a phenomenon is a weighted sum of time shifted delta functions. This function is expressed in equation 3.

$$h(t,\tau) = \sum_{i=0}^{N-1} \alpha_i(t,\tau)\, \exp\!\left[j\left(2\pi f_c \tau_i(t) + \phi_i(t,\tau)\right)\right]\, \delta\!\left(\tau - \tau_i(t)\right) \qquad (3)$$

In equation 3, the independent variable t indicates that the impulse response can vary with time and i represents the multipath number. α_i(t,τ) and τ_i(t) are the amplitude and excess delay of the ith multipath component. The excess delay of the ith multipath is the arrival time of that multipath with respect to the first arriving multipath. The phase changes due to free space propagation and other channel characteristics are encapsulated in the complex exponential argument 2π f_c τ_i(t) + φ_i(t,τ). It is common to represent the entire phase term as a single variable θ_i(t,τ). The equation then can be written as in equation 4.
$$h(t,\tau) = \sum_{i=0}^{N-1} \alpha_i(t,\tau)\, \exp\!\left[j\,\theta_i(t,\tau)\right]\, \delta\!\left(\tau - \tau_i(t)\right) \qquad (4)$$

If it is assumed that the channel impulse response is static in time, the equation further reduces to the form shown in equation 5:

$$h(\tau) = \sum_{i=0}^{N-1} \alpha_i\, \exp\!\left(j\,\theta_i(\tau)\right)\, \delta\!\left(\tau - \tau_i\right) \qquad (5)$$

A probing pulse p(t) is transmitted to measure the impulse response of the channel. The response of the channel to the probing pulse is measured as the power delay profile. Power delay profiles describe the power contained in the impulse response at various time delays and can be represented as in equation 6.

$$P(t,\tau) = k\, \left| h_b(t;\tau) \right|^2 \qquad (6)$$
Power delay profiles are generally calculated by averaging the channel impulse response at different spatial locations. The resultant profiles can be rewritten as a time independent variable P(τ) as shown in equation 7.

$$P(\tau) = k\, \left| h_b(\tau) \right|^2 \qquad (7)$$

where k is the gain relating the transmitted power in the probing signal to the received signal.
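To make the averaging in equations 6 and 7 concrete, the following minimal Python sketch computes a power delay profile from a set of measured impulse responses; the array shapes, function name, and synthetic data are illustrative assumptions, not the paper's actual measurement pipeline.

```python
import numpy as np

def power_delay_profile(h, k=1.0):
    """Spatially averaged power delay profile P(tau) of equation 7.

    h : complex array, shape (n_locations, n_delays); each row is one
        measured impulse response h_b(tau) at a different grid position.
    k : gain relating the transmitted probe power to the received power.
    """
    # Average |h_b(tau)|^2 over the spatial measurement locations.
    return k * np.mean(np.abs(h) ** 2, axis=0)

# Synthetic stand-in: 90 impulse responses with 3201 delay bins, matching
# the dimensions of the university data set described in section 4.
rng = np.random.default_rng(0)
h = rng.standard_normal((90, 3201)) + 1j * rng.standard_normal((90, 3201))
print(power_delay_profile(h).shape)  # -> (3201,)
```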
3 ULTRAWIDEBAND CHANNEL MODELING
In this research, to demonstrate the procedure and effectiveness of the FMM modeling techniques, specifically for the amplitude of the impulse response, |α_n|, the UWB channel is used as a case study. A UWB system has characteristics very different from a conventional narrowband system. Therefore, more scrutiny of existing channel model accuracy is required. This section gives a brief introduction to ultrawideband systems and discusses the reasons why investigations into UWB channel models are warranted.
3.1 Ultrawideband systems
Ultrawideband communication systems use extremely narrow (short duration) pulses as building blocks for communicating between the transmitter and receiver [17]. The pulses are typically in the hundreds of picoseconds with duty cycles less than 0.5%. This results in UWB systems requiring very low average transmission power.
A communication system can be classified as UWB in one of two ways:
- if the fractional bandwidth Bf of the system is greater than 20%, or
- if the bandwidth of the system, BW, is greater than 500 MHz irrespective of its Bf [18].
The fractional bandwidth, Bf, of a system is the ratio of the bandwidth BW to the center frequency fc, given by equation 8 [18].

$$B_f = \frac{BW}{f_c} \times 100\% = \frac{f_h - f_l}{(f_h + f_l)/2} \times 100\% \qquad (8)$$

where fh and fl are the highest and lowest -10 dB cutoff frequencies, respectively.
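As a quick illustration of equation 8 and the two classification rules, here is a small sketch; the function names are ours, not from the paper.

```python
def fractional_bandwidth(f_h, f_l):
    """Fractional bandwidth B_f of equation 8, in percent.
    f_h, f_l: highest and lowest -10 dB cutoff frequencies in Hz."""
    bw = f_h - f_l
    f_c = (f_h + f_l) / 2.0  # center frequency
    return bw / f_c * 100.0

def is_uwb(f_h, f_l):
    """UWB if B_f > 20% or the absolute bandwidth exceeds 500 MHz [18]."""
    return fractional_bandwidth(f_h, f_l) > 20.0 or (f_h - f_l) > 500e6

# The 2-8 GHz band sounded in section 4 gives B_f = 120%, well above 20%.
print(fractional_bandwidth(8e9, 2e9), is_uwb(8e9, 2e9))  # 120.0 True
```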
3.2 Modeling of small scale statistics of
ultrawideband signals
The time resolution at the receiver is inversely
proportional to the bandwidth of the signal. Thus,
narrowband signals have wide time resolutions.
This poor time resolution results in a large number of multipath signals being combined within a
particular time resolution bin. Given the random
nature of the channel characteristics, the receiver
can see either a constructive or destructive
multipath superposition. Since the received signal
is a combination of a large number of multipaths,
irrespective of the distribution of the individual
multipaths, the resultant signal can be assumed to
have a complex Gaussian distribution due to the central limit theorem [19]. Thus, the amplitude of narrowband signals can be assumed to be Rayleigh distributed [20].
On the other hand, the extremely wide bandwidth
of UWB makes the time resolution at the receiver
much narrower than conventional narrowband
systems. This means that only a few multipaths
combine together in a given time bin [21].
Therefore, the use of the central limit theorem cannot
be justified and the assumption that the received
signal will have a complex Gaussian distribution
does not hold true. Thus, the amplitude distribution
of the ultrawideband signal cannot be assumed to
be Rayleigh distributed. Correspondingly, an
impulse sent through the ultrawideband channel
may not be received as a Rayleigh distributed series
of discrete time samples. In other words, the
impulse response of the channel may not be
Rayleigh distributed. This has resulted in increased interest and research effort to determine the impulse response amplitude, |α_n|, distribution for UWB communication channels. Chong et al. used the Kolmogorov-Smirnov (K-S) test and the Chi-square test to suggest that the Weibull distribution can be used to model ultrawideband tap amplitudes in the 3 to 10 GHz range [22]. Cassioli et al. indicate that
Nakagami distribution provides the best fit for
ultrawideband amplitude statistics [23], while
Foerster et al. suggest the use of the Lognormal distribution [24]. For wireless channel modeling,
Taneda et al. introduced the use of information
theoretic model selection techniques by applying
Minimum Description Length (MDL) to find
channel tap amplitudes statistics [25]. Schuster et
al. introduced the use of another model selection
technique, Akaike's Information Criterion (AIC),
to identify appropriate models for ultrawideband
channel amplitude statistics [26]. Choudhary et al.
extended the use of a time series model selection
technique, the Accumulative Prediction Error, to
model UWB channel amplitude statistics [27].
Finite mixture models (FMMs) take advantage of
the vast number of applicable distributions for the
UWB channel impulse response amplitude by forming a weighted combination of the individual distributions. The effectiveness of this approach
will be detailed in this treatment. Specifically, in
this research we estimate FMMs for the UWB
channels using the data collected by Schuster et al. [26] and NIST [28]. The FMMs under consideration use combinations of Rayleigh, Nakagami, Lognormal, Weibull and Rice distributions. Stochastic Expectation Maximization is used to estimate the parameters of the FMMs. Model selection techniques such as Akaike's Information Criterion (AIC) are used to identify the
best FMM combination for the UWB channel
model and to quantify the model complexity versus
accuracy tradeoff.
4. CHANNEL MEASUREMENT
Measuring the communication channel characteristics can be achieved in the frequency domain via a transfer function or in the time domain through direct impulse response measurement. In
time domain channel measurement, an impulse
signal is transmitted and the impulse response of
the channel is measured using a digital sampling
oscilloscope (DSO) [29]. Conversely, frequency
domain (FD) measurement is accomplished by
sweeping the channel with a series of
narrowband signals and measuring the frequency
response of the channel in the band of interest using
a vector network analyzer. The data used in this
paper, was derived from frequency domain channel
measurements [26]. In frequency domain channel
measurement, the channel sounding is done using a
series of sinusoids with frequencies in the band of
interest. The radio frequency (RF) sounding signals
are transmitted from the transmitter of a vector
network analyzer (VNA) and captured by the
receiver of the VNA placed at a distance d from its
transmitter. The requirement for FD channel
measurement is that the channel be static during the
period of measurement. In essence, this means that
the coherence time of the channel should be greater
than the time required to make the measurement.

4.1 Data measurement in typical university environment
One set of data used in this research was measured and reported by Schuster et al. in [26]. Figure 1 illustrates the setup for the data collection. An HP 8722D VNA is used to measure the channel transfer function. To measure the transfer function, the VNA is operated in a stepped frequency mode with an IF bandwidth of 300 Hz. The sweep time was 9.8s. Each transfer function is recorded at 3201 equally spaced frequencies in the band from 2 GHz to 8 GHz. As a result, the transfer functions are measured with a frequency resolution of 1.875 MHz. Skycross SMT-3TO10M UWB antennas are used as transmitting and receiving antennas. The transmitted sounding signal has a 25 dBm power level. The transmitter (Tx) and receiver (Rx) of the VNA are placed such that there exists a LOS between Tx and Rx. The Tx-Rx distances of d = 15.4m, 18.4m, 21.2m, 24.3m, and 27.2m are used to collect the channel data. At each distance d, a 9x5 grid is created with a spacing of 7 cm in both dimensions. At each location of the grid, the receiver is placed and the channel transfer function is measured. This process results in 45 transfer functions being measured. Next the entire grid is shifted by 50 cm and another set of 45 transfer functions is measured. Thus, for each d, a total of 90 transfer functions are measured. The individual transfer functions are inverse discrete Fourier transformed to obtain the impulse responses of the channels. Thus for each d, 90 impulse responses are obtained and each impulse response has 3201 points. Sample impulse responses from this data set are shown in figure 2. In summary, there are 90 impulse response measurements at each of the five values of d.
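The conversion from measured transfer functions to impulse responses can be sketched as below; the file name and array layout are assumptions for illustration, and np.fft.ifft stands in for whatever inverse DFT routine was actually used.

```python
import numpy as np

# Hypothetical storage: 90 complex transfer functions, each sampled at the
# 3201 equally spaced frequencies between 2 GHz and 8 GHz described above.
H = np.load("transfer_functions.npy")  # shape (90, 3201); file name is illustrative

# Inverse DFT of each row yields the corresponding impulse response. The
# delay-bin spacing is 1 / (swept bandwidth) = 1 / 6 GHz, about 0.167 ns.
h = np.fft.ifft(H, axis=1)

# The statistics modeled in this paper are those of the amplitudes |h|,
# taken per delay bin across the 90 spatial measurements.
amplitudes = np.abs(h)
```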
4.2 Data measurement in typical industrial environment
The National Institute of Standards and Technology (NIST) data set used in this research was measured in an indoor industrial environment [28]. The receiver of a vector network analyzer was placed on a circular turn table with a radius of 24 cm as shown in figure 3. The distance from the center of the turn table to the transmitter is d. The transmitter sends out sine waves in the 2-8 GHz frequency range in frequency increments of 1.2 MHz.

Figure 1. Data collection layout in university environment with transmitter (Tx) at a distance of d from a grid describing receiver positions.

Figure 2. Normalized impulse response measured in a university environment at transmitter-receiver distance d = 15.4m at grid position (1,1).

Thus, at the receiver, the transfer function of the channel is measured at 4801 frequencies. An inverse Fourier transform is applied to the transfer function to obtain the impulse response containing 4801 impulses. For a particular value of d, the turn table is rotated 96 times in increments of 360/96 degrees and at each location the impulse response is measured. Thus, a total of 96 impulse responses are measured. Impulse response data used in this
research corresponds to values of d equal to 13.3m,
18.2m, 21.37m, 24.72m, and 30.06m. Sample
impulse responses from NIST data set are shown in
figure 4. In summary, there are 96 impulse response
measurements at each of the five values of d.

Figure 3. Data collection layout in an industrial environment with transmitter (Tx) positioned at a distance d from a turn table holding the receiver.

5. PARAMETER ESTIMATION FOR NON-MIXTURE DISTRIBUTIONS AND FINITE MIXTURES
This section first describes the functional form of the non-mixture models and the equations used to estimate the parameters of the associated distributions. Next, finite mixtures are introduced and the process of estimating the parameters of the mixtures using Stochastic Expectation Maximization (SEM) is described.

5.1 Non-mixture models and parameter estimation for non-mixture models
The choice of the five probability density functions (PDFs) is guided by their popularity for non-mixture modeling of communication channels. It should be noted that the process of statistically modeling the channel is essentially the identification of the parameters associated with the statistical function that best describes the data. The parameters of the non-mixture models can be estimated using the maximum likelihood estimation (MLE) technique.
5.1.1 Rayleigh distribution
The Rayleigh probability density function (pdf) is given by equation 9.

$$p(x) = \frac{x}{\sigma^2}\, e^{-\frac{x^2}{2\sigma^2}} \qquad (9)$$

where x is the data and σ is the parameter of the distribution. The maximum likelihood estimate of the Rayleigh parameter is given by equation 10:

$$\hat{\sigma}^2 = \frac{1}{2N} \sum_{i=1}^{N} x_i^2 \qquad (10)$$

Figure 4. Normalized impulse response measured in an industrial environment at transmitter-receiver distance d = 26.29m at position 1.

5.1.2 Nakagami distribution
The Nakagami pdf is expressed as shown in equation 11.


$$p(x) = \frac{2\, m^m}{\Gamma(m)\, \Omega^m}\, x^{2m-1}\, e^{-\frac{m}{\Omega} x^2} \qquad (11)$$

Γ(·) represents the gamma function, m represents the shape parameter and Ω represents the scale parameter. The estimate of m is given by equation 12.

$$\hat{m} = \frac{\left[E(x^2)\right]^2}{E(x^4) - \left[E(x^2)\right]^2} \qquad (12)$$

where E(x⁴) is the fourth order moment. The parameter Ω is the second moment and is estimated as in equation 13 [31].

$$\Omega = E(x^2) = \frac{1}{N} \sum_{i=1}^{N} x_i^2 \qquad (13)$$

5.1.3 Rice distribution
The Rice pdf is represented as shown in equation 14.

$$p(x) = \frac{x}{\sigma^2}\, \exp\!\left(-\frac{x^2 + \nu^2}{2\sigma^2}\right) I_0\!\left(\frac{x\nu}{\sigma^2}\right) \qquad (14)$$

When describing channels, the Rice distribution is commonly expressed in terms of the Rice K factor, as given by equation 15:

$$p(x) = \frac{2x(K+1)}{\Omega}\, \exp\!\left(-K - \frac{(K+1)x^2}{\Omega}\right) I_0\!\left(2x\sqrt{\frac{K(K+1)}{\Omega}}\right) \qquad (15)$$

where the relationship between ν, σ and K, Ω is given by equations 16 and 17 [32]:

$$\nu^2 = \frac{K\Omega}{K+1} \qquad (16)$$

$$\sigma^2 = \frac{\Omega}{2(K+1)} \qquad (17)$$

The parameter Ω is the second moment and is estimated as shown in equation 18.

$$\Omega = E(x^2) = \frac{1}{N} \sum_{i=1}^{N} x_i^2 \qquad (18)$$

The value of K can be estimated by solving the following non-linear equation 19 [33]:

$$\frac{E(x)}{\sqrt{E(x^2)}} = \frac{\Gamma(3/2)}{\sqrt{1+K}}\, \exp\!\left(-\frac{K}{2}\right)\left[(1+K)\, I_0\!\left(\frac{K}{2}\right) + K\, I_1\!\left(\frac{K}{2}\right)\right] \qquad (19)$$

5.1.4 Weibull distribution
The Weibull pdf is given by equation 20:

$$p(x) = \frac{a}{b}\left(\frac{x}{b}\right)^{a-1} e^{-(x/b)^a} \qquad (20)$$

The parameter a is called the shape parameter while the parameter b is called the scale parameter. Linear estimators are one of a number of possible methods used to estimate the parameters of the Weibull distribution. In this technique, the estimators are linear combinations of the data [34]. The task of estimating the Weibull parameters is thus converted to the problem of estimating the parameters μ and σ of the extreme value distribution, which are related to the shape and scale parameters of the Weibull distribution as μ = ln(b) and σ = 1/a.

5.1.5 Lognormal distribution
The lognormal distribution is given by equation 21:

$$p(x) = \frac{1}{x\sigma\sqrt{2\pi}}\, \exp\!\left(-\frac{(\ln x - \mu)^2}{2\sigma^2}\right) \qquad (21)$$

μ and σ are the parameters of the lognormal distribution. It should be noted that if the distribution of x is lognormal, then ln(x) is normally distributed with mean μ and standard deviation σ. The maximum likelihood estimates of the lognormal parameters are given by equations 22 and 23.

$$\hat{\mu} = \frac{1}{N} \sum_{i=1}^{N} \ln x_i \qquad (22)$$

$$\hat{\sigma}^2 = \frac{1}{N} \sum_{i=1}^{N} \left(\ln x_i - \hat{\mu}\right)^2 \qquad (23)$$
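In practice, the MLE fits of section 5.1 can also be obtained numerically; the sketch below uses scipy.stats as a stand-in for the closed-form and moment-based estimators given above, with the location parameter pinned to zero since amplitudes are non-negative. The synthetic sample is purely illustrative.

```python
import numpy as np
from scipy import stats

# The five candidate non-mixture PDFs, in scipy.stats' parameterization.
candidates = {
    "Rayleigh": stats.rayleigh,
    "Nakagami": stats.nakagami,
    "Rice": stats.rice,
    "Weibull": stats.weibull_min,
    "Lognormal": stats.lognorm,
}

# Stand-in for the ~90 amplitude samples of a single impulse.
x = stats.rayleigh.rvs(scale=0.5, size=90, random_state=1)

for name, dist in candidates.items():
    params = dist.fit(x, floc=0)              # numerical MLE, loc fixed at 0
    loglik = np.sum(dist.logpdf(x, *params))  # log-likelihood at the MLE
    print(f"{name:10s} loglik = {loglik:8.2f}  params = {np.round(params, 3)}")
```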

5.2 FMMs and parameter estimation of FMMs


Finite mixtures provide an effective way of
modeling data that is multimodal in nature and
cannot be accurately described by a non-mixture
model [35]. Examples of such variables would be
the second impulse from the Schuster et al. impulse
response at d = 15.4m or the tenth impulse of the
NIST data at d = 21.53m shown in figure 5.


Figure 5. Histograms showing multimodality in UWB channel data for (a) the second impulse of Schuster et al. at d = 15.4m and (b) the tenth impulse of NIST data at d = 21.53m. [Two histograms of counts vs amplitude bins]

The FMMs represent the PDF of a variable X as a weighted sum of constituent PDFs, as represented in equation 24 [36]:

$$p(X \mid \Theta) = \sum_{k=1}^{M} w_k\, p_k(X \mid \theta_k) \qquad (24)$$

p(X|Θ) represents the finite mixture model of variable X. Θ = {w_1, ..., w_M, θ_1, ..., θ_M} are the parameters of the mixture model. Further, p_k(X|θ_k) represents the PDF of the kth constituent distribution and w_k represents the mixing coefficient of that PDF.

In this paper, Stochastic Expectation Maximization (SEM) is used to estimate the FMM parameters. SEM is an iterative technique similar to the conventional Expectation Maximization (EM) technique. SEM estimates both the mixing coefficients and the parameters associated with the constituent PDFs of the FMM. However, unlike the EM algorithm, the SEM technique does not require that the exact number of FMM components be specified; only the maximum number of components in the mixture needs to be known. The SEM algorithm is a three step process as described below [37].

Step 1: As in the basic EM algorithm, in the nth iteration, the first step is the Expectation step (E-step). This involves finding the posterior probability, i.e., the probability that X is generated by the kth constituent PDF with parameter θ_k. The probability can be calculated as shown in equation 25.

$$P(\theta_k^n \mid x_i) = \frac{P(\theta_k^n)\, P(x_i \mid \theta_k^n)}{P(x_i)} = \frac{w_k^n\, P(x_i \mid \theta_k^n)}{\sum_{k=1}^{K} w_k^n\, P(x_i \mid \theta_k^n)} \qquad (25)$$

In the first iteration, an initial guess is made for the mixing coefficients w_k and the component parameters θ_k.

Step 2: The second step is the Stochastic step (S-step). In this step, based on the posterior probabilities, X is regrouped into K groups X_1, X_2, ..., X_K. For example, if max{P(θ_1|x_1), P(θ_2|x_1), ..., P(θ_K|x_1)} is P(θ_2|x_1), then the sample x_1 is placed in group X_2 (where max{·} is a function that finds the highest value among its input arguments).

Step 3: The third step is the Maximization step (M-step), which involves updating the mixing coefficients and parameters of the constituent PDFs
for the (n+1)th iteration. The mixing coefficients, w_k^(n+1), are given by the ratio of the number of samples in X_k in the nth iteration to the total number of samples N. The parameters of the kth constituent PDF are estimated using the maximum likelihood estimation technique based on the data in X_k, as given by equation 26.

$$\theta_k^{n+1} = \arg\max_{\theta_k}\, L(\theta_k) = \arg\max_{\theta_k}\, \log P(X_k \mid \theta_k) = \arg\max_{\theta_k} \sum_{i=1}^{M} \log P\!\left(x_{i,k}^n \mid \theta_k\right) \qquad (26)$$

The new value of θ_k for the (n+1)th iteration is the value of the parameter that maximizes the log-likelihood of the kth component given the data X_k. In the above equation, x_{i,k}^n is the ith element in X_k and M represents the number of elements in X_k for the nth iteration.
In the E-step of each subsequent iteration, the updated values of w and θ computed in the M-step of the previous iteration are used. The process is repeated until the estimates of w and θ in the M-steps of successive iterations do not change significantly.
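A minimal sketch of the three SEM steps for the 2-component Rayleigh + Lognormal FMM that the results below single out; the initialization, stopping rule, and use of the text's max{} assignment in the S-step are our simplifying assumptions, not the authors' exact implementation.

```python
import numpy as np
from scipy import stats

def sem_rayleigh_lognormal(x, n_iter=100):
    """SEM sketch for a 2-component Rayleigh + Lognormal mixture."""
    w = np.array([0.5, 0.5])                             # initial mixing coefficients
    sigma_r = np.sqrt(np.mean(x ** 2) / 2)               # Rayleigh init (eq. 10 on all data)
    mu, sigma_l = np.mean(np.log(x)), np.std(np.log(x))  # Lognormal init (eqs. 22-23)

    for _ in range(n_iter):
        # E-step: posterior probability of each component per sample (eq. 25).
        p = np.column_stack([
            w[0] * stats.rayleigh.pdf(x, scale=sigma_r),
            w[1] * stats.lognorm.pdf(x, s=sigma_l, scale=np.exp(mu)),
        ])
        p /= p.sum(axis=1, keepdims=True)

        # S-step: regroup the samples, here via the max{} rule of Step 2
        # (drawing group labels from the posterior is a common alternative).
        z = p[:, 0] >= p[:, 1]
        if z.sum() < 2 or (~z).sum() < 2:
            break  # degenerate grouping; stop early

        # M-step: update mixing coefficients and per-group MLEs (eq. 26).
        w = np.array([z.mean(), 1.0 - z.mean()])
        sigma_r = np.sqrt(np.mean(x[z] ** 2) / 2)
        mu, sigma_l = np.mean(np.log(x[~z])), np.std(np.log(x[~z]))

    return w, sigma_r, (mu, sigma_l)

# Illustrative bimodal sample: a Rayleigh mode plus a lognormal mode.
rng = np.random.default_rng(2)
x = np.concatenate([stats.rayleigh.rvs(scale=0.6, size=60, random_state=rng),
                    stats.lognorm.rvs(s=0.3, scale=0.15, size=40, random_state=rng)])
print(sem_rayleigh_lognormal(x))
```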

6. Model Selection Techniques
Since this research introduces the use of FMMs, a seemingly more complex model, the question of whether FMMs are worth the complexity they bring to the modeling process needs to be addressed. To answer this question, appropriate statistical metrics or model selection techniques are used. Given the data and a group of potential PDFs, model selection techniques can be used to select the PDF that provides the best representation of the data. A good model selection technique should be capable of evaluating the trade-off between the goodness of fit provided by a candidate model and the simplicity of the model. Model selection techniques are generally designed to penalize complex models [38]. A complex model will only emerge as the winning model if its data-fitting ability outweighs its structural or parametric complexity. The penalty for complexity arises from the idea that complex models tend to overfit the currently available data, since they may be strongly influenced by noise in the data.
Model selection techniques (including AIC) that have roots in information theory are typically variations of, or are derived from, the Kullback-Leibler (KL) distance. The KL distance I(f,p) is an information theoretic measure of how closely a distribution p(x) approximates the true but unknown distribution f(x). The KL distance is given by the formula in equation 27 [39].

$$I(f,p) = \int f(x) \log\frac{f(x)}{p(x)}\, dx = \int f(x)\left[\log f(x) - \log p(x)\right] dx \qquad (27)$$

I(f,p) represents the loss of information when p(x) is used instead of f(x). Model selection algorithms based on this information theoretic measure attempt to choose a distribution that has a very small value of I(f,p); that is, they choose models that minimize the KL distance. Since the parameters of the distribution under test, p(x), are estimated from the data using MLE, I(f,p) is also an estimated distance.

This section provides an overview of three model selection techniques: Akaike's Information Criterion (AIC) [41], Minimum Description Length (MDL) [42] and Accumulative Prediction Error (APE) [43]. These three model selection techniques are used in this research to find the best model (PDF) for the channel impulses.
6.1 Akaike's Information Criterion (AIC)
Akaike's Information Criterion further simplifies the KL distance to make it easier for practical implementation and is given by equation 28 [41].

$$AIC = -2 \log p(x \mid \hat{\theta}) + 2K \qquad (28)$$

where K is the number of parameters in the distribution p(x). The smaller the value of AIC, the more closely p(x) represents the unknown true distribution f(x). To compare the relative fit of the distributions under consideration, a difference, Δ_j, is computed. Δ_j is the difference between the AIC value of each distribution and the minimum AIC value in the set. Thus the difference value for the distribution with the minimum AIC value is zero. The AIC weights for a particular distribution are computed as shown in equation 29.

$$w_j = \frac{e^{-\Delta_j / 2}}{\sum_{m=1}^{M} e^{-\Delta_m / 2}} \qquad (29)$$

where $\sum_{j=1}^{M} w_j = 1$ and M is the number of distributions under test. Higher values of w_j indicate that the distribution provides a better fit to the data. Thus, among the group of distributions under consideration, the distribution with the highest value of w_j can be chosen as the model best representing the data.

6.2 Minimum description length
The MDL algorithm loosely states that the most appropriate model to describe a data set is the one which uses the least number of code words, i.e., the shortest description of the data [42]. MDL can be seen as finding the model that compresses the data the most, effectively eliminating models that overfit. MDL involves choosing the model which has the smallest description length (DL) for the data set. DL is calculated as shown in equation 30.

$$DL = -\log L(\hat{\theta} \mid x) + \frac{1}{2} K \log N = -\log p(x \mid \hat{\theta}) + \frac{1}{2} K \log N \qquad (30)$$

where K is the number of parameters in the distribution under test and N is the number of data samples. Thus, it can be seen that DL is the sum of two terms: the first measures how well the model under evaluation fits the data, and the second measures the complexity of the model through its number of parameters. A better data fit and reduced model complexity result in a smaller description length.
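Equations 28 through 30 reduce to a few lines of code; the sketch below is ours, with illustrative log-likelihoods and parameter counts (e.g., a 2-parameter non-mixture versus a 2-component FMM carrying its components' parameters plus one free mixing coefficient).

```python
import numpy as np

def aic(loglik, k):
    """Equation 28: AIC = -2 log p(x | theta_hat) + 2K."""
    return -2.0 * loglik + 2 * k

def aic_weights(aics):
    """Equation 29: Akaike weights from the AIC differences Delta_j."""
    delta = np.asarray(aics) - np.min(aics)
    w = np.exp(-delta / 2.0)
    return w / w.sum()

def mdl(loglik, k, n):
    """Equation 30: DL = -log-likelihood + (K/2) log N."""
    return -loglik + 0.5 * k * np.log(n)

# Illustrative comparison over N = 90 samples: the weights sum to one and
# the model with the smallest AIC gets the largest weight; smaller DL is better.
logliks, ks, n = np.array([120.3, 125.9]), np.array([2, 4]), 90
print(aic_weights([aic(ll, k) for ll, k in zip(logliks, ks)]))
print([mdl(ll, k, n) for ll, k in zip(logliks, ks)])
```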

6.3 Accumulative prediction error (APE)
Wagenmakers et al. state that the sole requirement of APE is that the models under consideration are
capable of generating a prediction for the next unseen data point [43]. In practical implementations, the observed data is split into two parts: the seen or training data set and the unseen or testing data set. Consider a data set of N observations given by x_i, i = 1, 2, ..., N. Initially S observations are placed in the training data and the remaining N - S in the testing data. Based on the training data, the parameters of the distribution under test are estimated using MLE. One data point is picked from the testing set and the value of the log conditional probability ln p(x_i | θ̂) for that point is calculated. The larger the probability value, the better the distribution predicts the data point and hence the smaller the prediction error. This data point is then transferred to the training data set, increasing its size by one. Based on this training data set, the parameters of the distribution under test are recalculated. Another point is picked from the testing set and the value of the log conditional probability ln p(x_i | θ̂) is calculated for that data point. The data point is then transferred into the training data set. This process is repeated till all N - S points in the testing data set are covered. The APE value is then given by the sum of the log probabilities calculated for all N - S data points of the testing data set.
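The APE loop described above can be sketched as follows; we accumulate the negative log predictive probabilities so that a smaller value means a smaller prediction error, and scipy.stats fitting again stands in for the MLE step. The split point s and the test data are illustrative.

```python
import numpy as np
from scipy import stats

def ape(x, dist, s):
    """Accumulative prediction error of one candidate distribution.

    x: N observations; the first s seed the training set.
    dist: a scipy.stats distribution standing in for the model under test.
    Accumulates -ln p(x_i | theta_hat) over the N - s test points, so a
    smaller return value means better one-step-ahead prediction.
    """
    err = 0.0
    for i in range(s, len(x)):
        params = dist.fit(x[:i], floc=0)   # re-estimate by MLE on the seen data
        err -= dist.logpdf(x[i], *params)  # prediction error for the next point
        # x[i] joins the training set on the next pass via the slice x[:i].
    return err

x = stats.rayleigh.rvs(scale=0.5, size=90, random_state=3)
print(ape(x, stats.rayleigh, s=30))   # should beat a mismatched model...
print(ape(x, stats.lognorm, s=30))    # ...such as the lognormal, on average
```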


Figure 6. Plot of AIC weights vs excess delay for the model ranked first, the Rayleigh-Lognormal FMM, and the model ranked third, for university environment UWB data for d = 1

Figure 7. Plot of AIC weights vs excess delay for the model ranked first, the Rayleigh-Lognormal FMM, and the model ranked third, for industrial environment UWB data for d = 21.3m.


7. UWB channel model selection results
For each Tx-Rx distance d, non-mixture and FMM-based models were estimated for each significant impulse of the impulse response based on its 90 samples (Schuster et al.) or 96 samples (NIST data). FMMs consisted of at most 3 component PDFs drawn from the five PDFs (Rayleigh, Rice, Nakagami, Weibull and Lognormal). SEM was used to estimate the FMMs for each significant impulse. Thus for each impulse, a total of 25 models were estimated: 20 FMMs (the 3-component and 2-component FMMs drawn from the 5 PDFs) and 5 homogeneous models. The models were ranked for each impulse using the AIC model selection technique. Analysis of the model selections indicates that none of the models (FMM or non-mixture) was consistently ranked first. Hence the identification of the most appropriate model for the impulse response is done based on the number of impulses for which a model ranked among the top 3 out of 25 competitors. Tables 1 through 3 list the percentage of impulses for which a model belonging to the non-mixtures, 2-component FMMs, and 3-component FMMs is ranked among the top 3 for university environment data. It can be seen that the Lognormal distribution is the best among the non-mixtures since it is ranked among the top 3 for over 36% of the impulses. Among the 2-component FMMs, the Rayleigh-Lognormal FMM stands out since it is ranked among the top 3 for almost 70% of the impulses. Among the 3-component FMMs, the FMM composed of Nakagami, Weibull, and Rice distributions performs the best, being ranked among the top 3 for almost 20% of the impulses. The results of the tables are summarized in figure 7, where the performance of the best three models is illustrated. It can be seen that for the university data the Rayleigh-Lognormal FMM performs best in terms of the number of impulses for which it is ranked among the top 3. This indicates that the Rayleigh-Lognormal FMM can be an effective alternative to the conventional non-mixtures for modeling the channel impulse response.
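The top-3 tally used to build tables 1 through 6 can be expressed compactly; the AIC matrix below is a random placeholder, and the ranking convention (0 = best) is our choice.

```python
import numpy as np

def top3_percentage(aics):
    """aics: array of shape (n_impulses, n_models), one AIC value per
    candidate model per significant impulse. Returns, per model, the
    percentage of impulses for which that model ranked in the top 3."""
    # Double argsort converts AIC values into within-impulse ranks (0 = best).
    ranks = np.argsort(np.argsort(aics, axis=1), axis=1)
    return 100.0 * (ranks < 3).mean(axis=0)

# Placeholder: 500 impulses x 25 candidate models (20 FMMs + 5 non-mixtures).
aics = np.random.default_rng(7).random((500, 25))
print(top3_percentage(aics))
```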
Similarly, tables 4 through 6 list the percentage of impulses for which a model belonging to the non-mixtures, 2-component FMMs, and 3-component FMMs is ranked among the top 3 for industrial environment data. As in the case of the university data, for the non-mixtures and 2-component FMMs, the Lognormal and Rayleigh-Lognormal FMM perform the best. For the 3-component FMMs, the Nakagami-Lognormal-Weibull FMM performs the best. The bar chart in figure 8 summarizes the tables and indicates that the Rayleigh-Lognormal FMM is the best model since it is ranked among the top 3 models for over 62% of impulses. As in the case of the university data, the Rayleigh-Lognormal FMM is a more effective alternative to conventional non-mixtures for modeling wireless channel impulse responses in industrial environments as well.
Figures 6 and 7 are shown for further analysis into the performance of the Rayleigh-Lognormal FMM in terms of normalized AIC weights. In figure 6, if a vertical line is drawn at any time instant, three symbols will be encountered: the circle representing the AIC weight of the best model for the impulse arriving at that instant of time, the triangle representing the AIC weight of the third best model for the impulse arriving at that instant of time, and the asterisk representing the AIC weight of the Rayleigh-Lognormal FMM for the impulse arriving at that instant of time. Note that AIC weights are normalized AIC values which assign higher values to better models. When the asterisks in the plot intersect the circles, it indicates that the Rayleigh-Lognormal FMM is the best model for the impulse arriving at that instant of time. Many such intersections of circles and asterisks can be seen. Also it can be seen that the asterisks lie on or above the triangle symbols for most time instants. This means that the AIC weights of the Rayleigh-Lognormal FMM are either first, second or third highest in value for most of the impulses. This basically indicates that the Rayleigh-Lognormal FMM is among the top 3 models for a majority of impulses. A similar conclusion can be drawn for the industrial data based on figure 7.

Table 1. Performance of non-mixture models (university environment)
Non-mixture model | % of impulses for which model ranked in top 3
Lognormal | 36.8
Rice | 6.7
Weibull | 3.9
Rayleigh | 3
Nakagami | 1.1

Table 2. Performance of 2-component FMMs (university environment)
2-component FMM | % of impulses for which model ranked in top 3
Rayleigh-Lognormal | 69
Nakagami-Lognormal | 54
Weibull-Lognormal | 8.3
Rice-Lognormal | 3.9
Nakagami-Weibull | 3.5
Nakagami-Rice | 3.4
Rayleigh-Weibull | 2.9
Nakagami-Rayleigh | 1.9
Rice-Rayleigh | 1.9
Rice-Weibull | 0.5

Table 3. Performance of 3-component FMMs (university environment)
3-component FMM | % of impulses for which model ranked in top 3
Nakagami-Rice-Weibull | 19.3
Nakagami-Lognormal-Weibull | 10.6
Rice-Lognormal-Weibull | 8.3
Rayleigh-Lognormal-Weibull | 3.9
Rice-Rayleigh-Lognormal | 3.5
Nakagami-Rayleigh-Rice | 3.4
Nakagami-Rayleigh-Weibull | 2.9
Nakagami-Rayleigh-Lognormal | 1.9
Nakagami-Rice-Lognormal | 1.9
Rice-Rayleigh-Weibull | 0.5

Table 4. Performance of non-mixture models (industrial environment)
Non-mixture model | % of impulses for which model ranked in top 3
Lognormal | 46.1
Rice | 4.2
Weibull | 3.7
Rayleigh | 3.7
Nakagami | 0.6

Table 5. Performance of 2-component FMMs (industrial environment)
2-component FMM | % of impulses for which model ranked in top 3
Rayleigh-Lognormal | 62.1
Nakagami-Lognormal | 51.2
Weibull-Lognormal | 33.1
Rice-Lognormal | 21.0
Nakagami-Weibull | 2.1
Rayleigh-Weibull | 1.4
Nakagami-Rice | 0.9
Nakagami-Rayleigh | 0
Rice-Rayleigh | 0
Rice-Weibull | 0

Table 6. Performance of 3-component FMMs (industrial environment)
3-component FMM | % of impulses for which model ranked in top 3
Nakagami-Lognormal-Weibull | 20
Rice-Rayleigh-Lognormal | 11
Rayleigh-Lognormal-Weibull | 8.8
Nakagami-Rice-Weibull | 7.9
Rice-Lognormal-Weibull | 5
Nakagami-Rice-Lognormal | 4
Nakagami-Rayleigh-Rice | 3.3
Nakagami-Rayleigh-Weibull | 3
Nakagami-Rayleigh-Weibull | 1
Rice-Rayleigh-Weibull | 0

8. Conclusions
In this research, a new approach to modeling wireless channels using finite mixtures is presented. The process and effectiveness of FMMs is demonstrated by modeling the amplitude statistics of the ultrawideband channel impulse response. The ultrawideband data represented two types of channel: an industrial and a typical college environment. The FMMs used here were composed of distributions drawn from five commonly used non-mixture channel models, namely the Rayleigh, Nakagami, Rice, Weibull and Lognormal distributions. The component distributions of the FMMs were chosen based on their wide usage in the channel modeling literature. The various 2-component and 3-component FMMs were estimated and compared among themselves and with non-mixture models using statistical metrics such as AIC. The statistical metrics indicate that the FMM composed of Rayleigh and Lognormal distributions consistently ranked high in its ability to provide an accurate statistical description for most of the impulses of the channel impulse response. This research has thus established the Rayleigh-Lognormal FMM as an alternative to conventional non-mixture models. Scrutiny of the weighting coefficients of the Rayleigh-Lognormal FMM indicates that the FMM has higher contributions from the Rayleigh component for the higher amplitude impulses of the impulse response, while it takes a higher contribution from the Lognormal component for the low amplitude impulses. Following the Rayleigh-Lognormal FMM, the Nakagami-Lognormal FMM performed second best and the Lognormal non-mixture performed third best in terms of the percentage of impulses for which they ranked among the top three models. From these observations, it can be inferred that the Lognormal distribution is an important contributor to the true data model of the UWB channel data. The Rayleigh and Lognormal non-mixtures showed trends similar to the Rayleigh-Lognormal FMM coefficients in that Rayleigh performed well for high amplitude impulses and Lognormal performed well for low amplitude impulses. It is also observed that the non-mixture Rayleigh and Lognormal parameters are smoother in terms of variations than those of the Rayleigh-Lognormal FMM. This could be because the complexity of the FMMs gives them a tendency to overfit the data. Visual inspection of the histograms of the impulses showed that many of the impulses had two modes. This can be linked to the success of the 2-component FMMs in modeling the data. The poorer performance of the 3-component FMMs can be attributed to their increased complexity, which gets penalized by the model selection techniques. Also, the number of data samples was limited; hence the estimation of the parameters of the 3-component FMMs may not have been accurate. The statistical samples available for this research were very limited. The parameter estimation process and the model selection process will be more accurate with the inclusion of other UWB data sets with larger numbers of statistical samples.
References
[1] Tripathi, N., Reedy, J., and VanLandingham, H.: Radio Resource Management in Cellular Systems. Springer (2001).
[2] Hashemi, H.: Indoor propagation channel model. In: Proc. IEEE, 18, 943--968 (1993).
[3] Hoffman, H., and Cox, D.: Attenuation of 900 MHz radio waves propagating into a metal building. IEEE Trans. Antennas and Propagat., 30, 808--811 (1982).
[4] Yegani, P., and McGillem, C. D.: A statistical model for line-of-sight (LOS) factory radio channels. In: Proc. Vehicular Techn., 35, 14--152 (1986).
[5] Gutzeit, C., and Baran, A.: 900 MHz indoor/outdoor propagation investigations via bit error structure measurements. In: IEEE Proc. Vehicular Techn. Conf. VTC 89, San Francisco, 321--328 (1998).
[6] Bultitude, R.: Propagation characteristics of 800/900 MHz radio channels operating within buildings. In: Proc. 13th Biennial Symp. Commun., Ontario, Canada, 2--4 (1986).
[7] Kalivas, G., Tanany, M., and Mahmoud, S.: Millimeter-wave channel measurements for indoor wireless communications. In: Proc. IEEE Vehicular Technology Conf., VTC 92, Denver, Colorado, 609--612 (1992).
[8] Laurntez, J. P.: Wireless Communication [Online]. www.wirelesscommunication.nl.
[9] Kim, D., Ingram, M., and Smith, W.: Measurements of small-scale fadings and path loss for long range RF tags. IEEE Trans. Antennas and Propagat., 51, 1740--1749 (2003).
[10] Howard, S. and Pahlavan, K.: Fading results from narrowband measurements of the indoor radio channel. In: IEE Proc. Part I: Communication, Speech and Vision, 138, 153--161 (1991).
[11] Shepherd, N. H.: Radio wave loss deviation and shadow loss at 900 MHz. IEEE Trans. Vehicular Techn., 26, 309--313 (1977).
[12] Rappaport, T. and McGillem, G.: UHF fading in factories. IEEE J. Selected Areas in Communications, 7, 40--48 (1989).
[13] Seidel, S. and Takamizawa, K.: Application of second-order statistics for an indoor radio channel model. In: IEEE Proc. Vehicular Technology Conf., VTC 89, San Francisco, 888--892 (1989).
[14] Rappaport, T., Seidel, S., and Takamizawa, K.: Statistical channel impulse response models for factory and open plan building radio communication system design. IEEE Trans. Commun., 39, 794--807 (1991).
[15] Hashemi, H., Tholl, D., and Morrison, G.: Statistical modeling of the indoor radio propagation channel part I. In: Proc. IEEE Vehicular Technology Conf., VTC 92, Denver, Colorado, 338--342 (1992).

[16] Hashemi, H., Tholl, D., and Morrison, G.: Statistical
modeling of the indoor radio propagation channel
part II. In: Proc. IEEE Vehicular Technology Conf.,
VTC 92, Denver, Colorado, 839--843 (1992).
[17] Nekoogar, F.: Ultra-Wideband communications:
fundamentals and applications. Prentice Hall (2005).
[18] Siwiak, K. and McKeown, D.: Ultra-wideband Radio
Technology. Wiley (2004).
[19] Papoulis, A.: Probability, random variables and
stochastic processes. McGraw Hill (2002).
[20] Proakis, J.: Digital Communications. McGraw Hill
(2000).
[21] Luediger, H., Kull, B., Zeisberg, S., and Finger, A.: An
Ultrawideband indoor NLOS radio channel amplitude
probability density distribution. In: IEEE Proc. Spread
Spectrum Techniques and Applications Symposium, 1,
68--72 (2002).
[22] Chong, C. and Yong, S.: A generic statistical-based UWB channel model for high-rise apartments. IEEE Trans. Antennas and Propagat., 53, 2389--2399 (2005).
[23] Cassioli, D., Win, M., and Molisch, A. F.: The ultra-wide bandwidth indoor channel: from statistical models to simulations. IEEE J. Selected Areas in Communications, 20, 40--48 (2002).
[24] Foerster, J. R., and Li, Q.: UWB channel modeling
contribution from intel. IEEE, Tech. Rep. P802.15
02/279SG3a, 2002, IEEE P802.15 SG3a Contribution.
[25] Taneda, M. A., and Araki, K.: The problem of the
fading model selection. IEICE Trans. Commun., E84B, 660--667 (2001).
[26] Schuster, U. and Bolcskei, H.: Ultrawideband channel
modeling on the basis of information theoretic criteria.
IEEE Trans. on Wireless Communications, 6 (2007).
[27] Choudhary, D. and Robinson, A.: Model selection and
Kolmogorov-Smirnov test for Ultrawideband channel
modeling. In: SPIE Proc. Wireless Sensing and
Processing II, 6577, 65770G-1 -- 65770G-7 (2007).
[28] Gentile, C., Braga, A. J., and Kik, A.: A comprehensive
evaluation of joint range and angle estimation in indoor
Ultrawideband location systems. EURASIP Journal on
Wireless Communications and Networking, 2008,
(2008).
[29] Oppermann, I., Hamalainen, M., and Iinatti, J., UWB:
theory and applications. Wiley (2004).
[30] Ciccognani, W., Durantini, A., and Cassioli, D.: Time
domain propagation measurements of the UWB indoor
channel using PN-sequence in the FCC compliant band
3.6-6GHz. IEEE Trans. Antennas and Propagat.,53,
1542--1549 (2005).
[31] Cheng, J. and Beaulieu, N. C.: Maximum-likelihood based estimation of the Nakagami m parameter. IEEE Communication Letters, 5, 101--103 (2001).
[32] Stuber, G.: Principles of mobile communications.
Springer (2000).
[33] Talukdar, K. K., and Lawing, W. D.: Estimation of the
parameters of the Rice distribution. J. Acoust. Soc.
Am., vol. 89, no. 3, pp. 1193-1197, Mar. 1991.

[34] Murthy, D. N. P., Xie, M., and Jiang, R., Weibull


Models. Wiley (2003).
[35] Schnatter, S. F., Finite mixtures and Markov models.
Springer (2002).
[36] McLachlan, G., and Peel, D., Finite mixture models.
Wiley (2000).
[37] Celeux, G., and Diebolt, J.: The SEM and EM
algorithms for mixtures: numerical and statistical
aspects. In: Proc. 7th Franco-Belgium Meeting of
Statistics, Brussels (1994).
[38] Zellner, A., Keuzenkamp, H., and McAleer, M.:
Simplicity, inference and modeling: keeping it
sophisticatedly simple. Cambridge Press (2001).
[39] Kullback, S.: Information theory and statistics. Dover
Publications (1997).
[40] Myung, I. J.: The importance of complexity in model
selection. Journal of Mathematical Psychology, 44,
190--204 (2000).
[41] Akaike, H.: Information theory as an extension of
maximum likelihood principle. In: Selected papers of
Hirotugu Akaike. Akaike, H., Kitagawa, G. and Parzen,
E. (eds.) Springer, 1997, pp. 199-214.
[42] Hansen, M., and Yu, B.: Model selection and the
principle of Minimum Description Length. Journal of
American Statistical Association, 96, 746 --774
(2001).
[43] Wagenmakers, E. J., Grunwald, P., and Steyvers, M.:
Accumulative prediction error and the selection of time
series models. Journal of Mathematical Psychology,
50, 149--166 (2006).


International Journal of Digital Information and Wireless Communications (IJDIWC) 4(2): 184-190
The Society of Digital Information and Wireless Communications, 2014 (ISSN: 2225-658X)

Towards Carbon Emission Reduction Using ICT.


Tiroyamodimo M. Mogotlhwane
Department of Computer Science, University of Botswana
Private Bag 0022, Gaborone, Botswana
mogotlhw@mopipi.ub.bw

ABSTRACT
The impact of global warming is now showing its ability to disturb human and other forms of life on the earth. Environmental pollution and how it can be minimised is a global issue for discussion. There is an increase in the promotion of environmentally friendly human activities through green initiatives. Green computing is how the computing profession is responding to the concern for minimising environmental pollution. Internet based operations enable work to be done remotely, minimising the need for human movement. Automation of processes, one of the benefits of computing, minimises waste and indirectly reduces the need for human involvement. Fossil fuels used to power vehicles are among the leading causes of carbon emission. The ability for people to work from home (telecommuting), access banking services online (online banking) and other online based operations have the potential to reduce carbon emissions. These online operations are computer driven and enable professional expertise and services to be accessed without the need for people to travel. Computing applications thus have the potential to reduce or minimise carbon emission. More concerted effort is required from policy makers to embrace computing technologies as tools to reduce carbon emissions.

KEYWORDS
Carbon tax, renewable energy, ICT, greenhouse gas,
carbon emission, green IT, green computing.

1 INTRODUCTION
The focus of this research is to review literature that investigates computing's contribution to the reduction of carbon emissions. A desktop research methodology was used to review related work on the subject matter. There are different approaches that can be used to reduce carbon emission. Some of these approaches are reducing the energy consumption of a machine, using renewable energy instead of fossil fuels, etc. [1]. The power sector can reduce carbon emission through the use of renewable energy and by capturing and storing the carbon dioxide generated. This paper addresses the issue of carbon emission reduction by looking at how the use of information and communications technology (ICT) can contribute to the reduction of carbon emission. Computer based information systems that are web based have the potential to be used anytime, anywhere. Some of the challenges limiting the use of information and communication technologies are also discussed.
1.1 Definitions
Some of the key terminologies that are used widely in environmental pollution are stated here, mainly for the benefit of the computer science community, who now need to embrace the environmental agenda in their professional activities. The key phrase in this paper is carbon emission. The phrase carbon emission is used widely in academic literature with very little attempt to define it. The online Macmillan dictionary defines it as a combination of carbon dioxide and carbon monoxide that is added to the atmosphere from the use of fossil fuel by cars and other machinery as part of industrial processes [2]. Clean energy is a source of energy that meets current energy demands without compromising the needs of future generations to meet their energy requirements [3]. Some of the sources of clean energy are solar power, wind power, biomass, sea currents, etc. Carbon dioxide and carbon monoxide are harmful to animal life, though they contribute to the food production life cycle as they support plant life when in moderation. Greenhouse gas (GHG) is defined as all the gases that act to retain heat in the atmosphere [4]. This heat retention by GHG leads to an increase in atmospheric temperature, which is often referred to as global warming [5]. Greenhouse gases that are naturally produced are absorbed by plants. However, the increase resulting from human activities that produce GHG cannot be absorbed by the existing plants and vegetation. This is worsened by the increasing destruction of natural forests, which reduces the plants and other vegetation available.
2 AN OVERVIEW OF GREENHOUSE
GASES.
Fossil fuels are the main source of electricity
globally, yet they are the main contributor to the
increase in GHG. As Stickley stated, fossil fuels
are what make the world go round [5]. Globally,
not enough has been done to invest in renewable
energy, which produces less carbon emission. Use
of solar power, even in countries with high solar
radiation, is still limited. For example, Botswana
has about 3200 hours of sunshine annually, but
this energy contributes little to national electricity
demand [6]. There is over-dependence on fossil
fuel; in the US, petroleum, coal and natural gas
have been the top three sources of energy
since the mid-1920s [7]. Global data show that if
fossil fuels continue to be the main source of
energy, there will be an increase in carbon dioxide
emissions [8]. Developing countries are likely to
overtake industrialised countries in terms of
carbon emission as shown in Figure 1.
Some of the industrialised countries are already
implementing policies that aim to reduce carbon
emissions. In the US there is a tax incentive
offered to home owners towards the cost of
installing solar panels [9]. Australia is also using
the tax incentive model. Germany, despite having
limited solar radiation, accounts for more than 50%
of solar energy production [10]. Developing
countries, especially those in Africa, have higher
solar radiation; however, this renewable energy
has not been fully exploited in such countries.

Figure 1 World Carbon Dioxide Emissions by Region, 2001-2025 (Million Metric Tons of Carbon Equivalent) [8]

2.1 ICTs' Carbon Emission Contribution
ICTs rely on electricity to run; for example, every
computer requires electricity to operate.
Depending on the source of that electricity, ICT
can also be seen as a contributor to
greenhouse gases and to the increase in carbon
source of electricity must be from green sources as
well.
Recent research has shown that ICT-based systems,
if fully implemented, have the potential to reduce
carbon emissions by 16.5% [11]. One ICT
application with the potential to reduce carbon
emission is the concept of working from home,
which minimises the need for employees to travel
to work. In the majority of cities traffic jams are
common; a car stuck in traffic with a running
engine is burning fuel while not in motion, which
is not an efficient way to use energy. However,
many businesses have not implemented the
concept of working from home, not because the
technology is lacking, but mainly for other social
reasons. Financial institutions have embraced
online banking mainly to reduce the need for
branches, but indirectly this also contributes to
carbon emission reduction, as customers no longer
have to travel to a bank branch.
The City of Las Vegas, in its efforts to reduce
electricity wastage, has implemented ICT-based
systems that can remotely switch off computers
that are not in use. Through this process the city
managed to save about $50,000 per annum [12].
2.2 Green Computing
Green computing, or green IT, is the use of
computers and related resources in an
environmentally friendly manner, from the design
and manufacturing to the use and disposal of
hardware and other related resources [13]. The main
concepts of green computing are as follows:
Green use: using computer devices in a way that
minimises electricity consumption, for example
switching off a device when not in use.
Green disposal: when computer devices are no
longer needed, they can either be recycled or
disposed of in an environmentally friendly manner,
not simply thrown in the rubbish bin to end up in a
local landfill. Almost every year a new computer
device emerges in the market that is better than the
previous ones; what happens to the hardware that
is considered obsolete?
Green design: designing computer devices that are
energy efficient. Energy-efficient computers,
servers, printers, projectors and other digital
devices reduce energy consumption.
Green manufacturing: minimising waste when
producing computer hardware devices. Minimising
waste during the manufacturing of computers and
other subsystems reduces the negative
environmental impact of such activities.
An average computer user can employ the
following to make their computing green [13]
(a rough estimate of the resulting savings is
sketched after this list):
Use the sleep or hibernate mode when away from
a computer for extended periods.
Use flat-screen or LCD monitors instead of
conventional cathode ray tube (CRT) monitors.
Buy energy-efficient notebook computers instead
of desktop computers.
Activate the power management features for
controlling energy consumption.
Make proper arrangements for safe electronic
waste disposal.
Turn off computers at the end of each working
day.
Refill printer cartridges rather than buying new
ones.
Instead of purchasing a new computer, try
refurbishing an existing one.
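To give a sense of scale for the "green use" items
above, the following minimal Python sketch estimates
the annual energy, cost and carbon savings from
switching off a fleet of desktop computers outside
working hours. All numeric inputs are illustrative
assumptions, not figures from the literature cited in
this paper.

FLEET_SIZE = 500          # number of desktop PCs (assumed)
IDLE_POWER_W = 60         # average idle draw per PC, watts (assumed)
OFF_HOURS_PER_DAY = 14    # hours per day the PC would otherwise idle (assumed)
WORKING_DAYS = 250        # working days per year (assumed)
TARIFF_PER_KWH = 0.12     # electricity price, USD per kWh (assumed)
KG_CO2_PER_KWH = 0.5      # grid emission factor, kg CO2 per kWh (assumed)

# Energy avoided per year: watts x hours -> kWh
kwh_saved = FLEET_SIZE * IDLE_POWER_W * OFF_HOURS_PER_DAY * WORKING_DAYS / 1000.0
cost_saved = kwh_saved * TARIFF_PER_KWH
co2_avoided_tonnes = kwh_saved * KG_CO2_PER_KWH / 1000.0

print(f"Energy saved : {kwh_saved:,.0f} kWh/year")
print(f"Cost saved   : ${cost_saved:,.0f}/year")
print(f"CO2 avoided  : {co2_avoided_tonnes:,.1f} tonnes/year")

With these assumed inputs the sketch yields roughly
105,000 kWh, $12,600 and 52.5 tonnes of CO2 avoided
per year, which is consistent in order of magnitude
with the Las Vegas saving reported above.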
There is increasing interest in green computing in
the academic community; for example, searching for
"green computing" produces about 7,650 results in
the Google Scholar online search engine.
2.3 Cloud Computing
The concept of cloud computing is transforming
the delivery and management of corporate IT
services. Through the cloud, large-scale, shared IT
infrastructure is provided to users over the internet,
which makes software provision almost like a
utility commodity. Recent research by Microsoft
shows that cloud-based operations, as compared to
on-premises ones, can enable companies to reduce
their carbon emissions by between 30% and 90%
[14].
Using cloud computing enables organisations to
access their information from anywhere at any
time. This is in contrast to traditional computing,
where the user has to be in the same location as
their data storage. Cloud computing removes the
need to be in the same physical location as the
hardware and software processing the data [15].
The key ways in which cloud-based operations
reduce carbon emissions per user are the
following [14]:
Reduction of over-allocation of infrastructure.
Sharing of applications between organisations.
Higher utilisation of server infrastructure.

Improvement in data centre efficiency.


Cloud computing has the potential to provide high
computing power to even small businesses. At the
same time it can contribute significantly to
reduction of ICT's carbon emissions. Under cloud
computing, the cost of hardware and software is
borne by cloud service providers, just as with other
utility providers such as water and electricity.
3 ICT ROLES IN CARBON EMISSION
Today's human activities are dominated by ICT
applications as the underlying technology that
drives them. Businesses rely on ICT to run; it also
supports social life, as evidenced in the use of
social media like Facebook and Twitter. People
are able to organise associations, share information
and take action without the need for face-to-face
meetings. This interaction in cyberspace does not
require the physical movement of data and people,
and less transportation indirectly means less use of
energy.
GHG emissions are strongly correlated with
economic activity, as shown by their decline in
the 2008 economic slowdown [4].
Figure 2 shows the main sources of GHG globally.
Using fossil fuels to produce energy is the leading
source of GHG. The other sources are all activities
that are carried out to support consumption needs
of people. Making a significant reduction in GHG
requires finding and using energy sources that do
not rely on fossil fuels. Such energy sources are
referred to as clean or green energy sources.
Examples of clean energy sources are wind, solar
and water.
Clean energy comes from sources that cannot be
depleted like fossil fuels. Fossil fuels not only
contribute to the increase in GHG; they are also
not likely to last, as a time will come when they
have been completely depleted. Clean sources of
energy like the sun will always be there as long
as the sun's rays reach the earth's surface.

Figure 2 Global greenhouse gas emission by source (adapted
from [16])

3.1 ICT and Energy Supply


ICT can reduce carbon emission from energy
generation by using computerised systems that
manage the demand of energy. For example,
Botswana Power Corporation has installed smart
meters in residential areas. This allows electricity
users to manage their energy consumption and
minimise the need to travel to the designated
paying points, as was the case before. The ability to
buy electricity and pay bills online or through
mobile phones further reduces the need to travel.
The corporation was also able to remotely switch
geysers to balance electricity demand during peak
hours through the use of smart meters [17]. Osaka
Gas, a company in Japan, has used server
virtualisation to reduce the company's electricity
costs, hence reducing its carbon emission [18].
Demand management, time-of-day pricing, power
load balancing and similar measures are all ICT-based
processes that can minimise energy production
wastage. Minimising wastage indirectly reduces the
need to generate more energy.
ICT applications can manage electricity
production, use and distribution. The main reason
for using ICT applications is to minimise

electricity wastage. Reducing electricity wastage is
also a form of carbon emission reduction.
3.2 ICT and Industry GHG
Industries that produce goods are among the key
economic activities that many countries aim to
have. Industrial production generates employment
besides producing goods for customers.
Industrial manufacturing is
one of the highest carbon emission sources. This is
due to electricity demand for its processes. Hence
when the source of such electricity is from fossil
fuels, it indirectly contributes to carbon emission.
Industrial processes that use electricity sourced
from renewable energy sources contribute less to
carbon emissions.
Growing manufacturing in emerging economies like
China and India is increasing carbon emissions.
The lack of sufficient economic incentives to reduce
carbon emission in manufacturing is a great
challenge [19]. Automation and monitoring of
industrial processes can also reduce energy
consumption by the manufacturing industry. ICT
is required in this automation and monitoring, as it
acts as the underlying driving technology.
3.3 ICT and Agriculture GHG
There is growing demand to produce more food to
feed the increasing global population. Methane
from cattle and irrigation using fresh water
contribute to increase in carbon emission.
Agricultural GHG emission can be reduced by
making agricultural processes efficient and
minimising wastage in the production process.
Farmers in developing countries are slowly
adapting to ICT-based solutions, for example using
mobile phones to find prices and markets for their
produce.
According to the World Bank, ICT can be used to
monitor pest thresholds in integrated pest
management, provide relevant and timely
information and agricultural services, map agro
biodiversity in multiple-cropping systems, forecast
disasters, and predict yields. Crop losses diminish
as farmers receive relevant and timely information
on pests and climate warnings through SMS
technology [20].

ICT applications in agriculture can enable
information sharing between farmers and extension
officers, allowing farmers to get advice on their
farming practices. Providing this information
online means that farmers can access it without
the need to travel.
3.4 ICT and Buildings GHG
Most buildings have not been constructed to be
energy efficient. About 60% of energy is wasted
by commercial buildings through lighting,
appliance use, heating and cooking [11]. Literature
shows that in the US, buildings account for about
40% of energy consumption and carbon emission
[21]. Making buildings more energy efficient
requires investing in ICT that can sense and
monitor the energy demands of a building so that
any wastage can be detected early. The concept of
the smart city of the future is driven by the
application of ICT-based systems that collect and
transmit information to wherever it is needed.
ICT-based applications can allow energy to be
provided to a building only when the building
needs it.
3.5 ICT and Transport GHG
Transportation of people and goods are some of
the main activities of modern life. People travel
for leisure, work, business etc. Goods need to be
transported from where they are produced to their
market place. There is significant amount of
energy that is required to power transportation of
goods and people. Currently the main source of
power comes from fossil fuels. The car industry is
responding to carbon emission by producing car
engine that consume less power. The industry is
also pushing for research to make electric cars a
reality for the masses. However, an electric car
runs on a battery that must be charged, hence this
is only environmental friendly if it is charged from
a renewable energy source.
4 ICT CHALLENGES IN GHG REDUCTION
Many countries, especially developing countries, do
not have GHG emission data and efficiency
standards for machinery. Therefore, manufacturing
companies do not have a yardstick against which
to monitor their carbon footprint.
Costs of ICT and lack of knowledge in ICT
supported agriculture are some of the challenges
facing ICT adoption. ICT infrastructure is mainly
limited to areas with high population density like
cities and villages. So in developing countries it
will take time for such infrastructure to be made
available at farms as investors consider it to be
unproductive investments to provide such
infrastructure in thinly populated areas.
Green computing is generating interest among
professionals; however, it has not yet been fully
incorporated into the body of knowledge in the
computing curriculum. For example, Computer
Science Curricula 2013 is silent on this subject [22].
The work-from-home concept has not fully taken
off. When employees work from home, they save
energy by not travelling to work, especially if their
mode of transport is not environmentally friendly.
The impacts of global warming are already showing
globally. Drought, floods, increases in atmospheric
temperature and changes in the patterns of seasons
are now becoming common. Global warming does
not recognise geographical boundaries; hence low
producers of carbon emission are affected just like
the high producers.

5 CONCLUSIONS
Carbon emission needs to be treated as an
international tragedy that needs to be monitored
and evaluated at central and local government
level. Carbon emission needs to be given the
highest priority by policy makers, and measures
put in place to monitor it. Carbon emission data
need to be part of the data that are collected and
shared globally, just like poverty level indicators.
Carbon emission rates need to be made part of the
information of any goods that are produced, just
like labelling in the food industry. Carbon
emission data of any human activity need to be
made publicly available to increase the level of
awareness among people. For example, when
carbon emission data for running a conference by
video conference and by the traditional method are
available, individuals may choose the method with
the smaller carbon footprint. When the right
information is made available, people will be able
to make environmentally friendly decisions.
Recent and continuing advances in computing
applications have the potential to minimise human
activities or at least make them more efficient. The
banking industry, despite the high risks of online
banking, has taken a lead in providing its services
online, making them accessible at any place at any
time. This indirectly minimises the need for
physical bank branch buildings. In some parts of
the world, the ability of retail stores to provide
cash-back facilities turns them into mini bank
branches. Once customers no longer have to travel
to the bank, this reduces the banking industry's
carbon emissions. Other sectors, especially central
governments through e-government initiatives, can
embrace green computing to minimise carbon
emissions.

6 REFERENCES
[1] Renewable Energy World, "Types of Renewable Energy," retrieved November 15, 2013, from http://www.renewableenergyworld.com, 2013.
[2] Macmillan Dictionary, "Carbon emissions - definition," retrieved November 24, 2013, from Macmillan Publishers Limited, http://www.macmillandictionary.com/dictionary/british/carbon-emissions, 2013.
[3] FREA, "Clean Energy Defined," retrieved January 7, 2014, from http://www.cleanenergyflorida.org, 2013.
[4] EPA, "Sources of Greenhouse Gas Emissions," retrieved November 25, 2013, from United States Environmental Protection Agency, http://epa.gov/climatechange/ghgemissions/sources.html, 9 September 2013.
[5] A. Stickley, "Going Green: Installing Solar Panels around the Campus of Widener University," Chester, USA, Widener University, 27 November 2012.
[6] Desert Knowledge Platform, "Botswana," retrieved November 21, 2013, from http://knowledge.desertec.org, 22 February 2012.
[7] EIA, "Greenhouse gases, climate change, and energy," retrieved November 25, 2013, from United States Energy Information Administration, http://www.eia.gov/oiaf/1605/ggccebro/chapter1.html, 2012.
[8] EIA, "Greenhouse Gases Programs: Greenhouse Gases, Climate Change, and Energy," retrieved November 25, 2013, from Energy Information Administration, http://www.eia.gov/oiaf/1605/ggccebro/chapter1.html, 2013.
[9] A. Murad, "Bright outlook for renewable energy," retrieved January 9, 2014, from BBC, http://www.bbc.com/capital/story/20140106-renewable-energys-bright-future, 6 January 2014.
[10] E. Kirschbaum, "Germany sets new solar power record, institute says," retrieved January 9, 2014, from http://www.reuters.com/article/2012/05/26/us-climate-germany-solar-idUSBRE84P0FI20120526, 26 May 2012.
[11] BCG, "SMARTer 2020: The Role of ICT in Driving a Sustainable Future," GeSI, retrieved November 10, 2013, from www.gesi.org, December 2012.
[12] R. K. Rainer, C. G. Cegielski, "Introduction to Information Systems: Enabling and Transforming Business," Danvers, John Wiley & Sons, 2011.
[13] Techopedia, "Green Computing," retrieved November 19, 2013, from http://www.techopedia.com/definition/14753/green-computing, 2013.
[14] Accenture, "Cloud Computing and Sustainability: The Environmental Benefits of Moving to the Cloud," Chicago, Accenture, 2010.
[15] A. Huth, J. Cebula, "The Basics of Cloud Computing," Carnegie, US-CERT, 2011.
[16] IPCC, "Climate Change 2007: Mitigation of Climate Change," Contribution of Working Group III to the Fourth Assessment Report of the Intergovernmental Panel on Climate Change, B. Metz, O. R. Davidson, P. R. Bosch, R. Dave, L. A. Meyer (eds.), Cambridge University Press, Cambridge, United Kingdom and New York, NY, USA, 2007.
[17] BPC, "Botswana Power Corporation Annual Report 2011," Gaborone, Botswana Power Corporation, 2011.
[18] Osaka Gas, retrieved October 24, 2013, from http://www.osakagas.co.jp, 2012.
[19] World Economic Forum, "The Future of Manufacturing: Opportunities to drive economic growth," retrieved November 26, 2013, from http://www3.weforum.org, April 2012.
[20] World Bank, "ICT in Agriculture: Connecting Smallholders to Knowledge, Networks and Institutions," retrieved November 26, 2013, from World Bank Group, http://www.ictinagriculture.org/sourcebook, 2012.
[21] H. Chen, P. Chou, S. Duri, H. Lei, J. Reason, "The Design and Implementation of a Smart Building Control System," 2009 IEEE International Conference on e-Business Engineering, pp. 255-262, Macau, IEEE Computer Society, 2009.
[22] IEEE Computer Society, "Computer Science Curricula 2013," retrieved November 5, 2013, from Association for Computing Machinery, http://ai.stanford.edu/users/sahami/CS2013/strawman-draft/cs2013-strawman.pdf, February 2012.

International Journal of Digital Information and Wireless Communications (IJDIWC) 4(2): 191-201
The Society of Digital Information and Wireless Communications, 2014 (ISSN: 2225-658X)

Interference Analysis and Spectrum Sensing of Multiple Cognitive Radio Systems


Sowndarya Sundar and M. Meenakshi
Department of Electronics and Communication Engineering
College of Engineering, Guindy Campus, Anna University
Sardar Patel Road, Chennai, INDIA
E-mail: sowndaryasundar@gmail.com
ABSTRACT
Conventional fixed spectrum allocation results in a
large part of the frequency band remaining underutilized.
Channels that are dedicated to licensed (primary) users
are out of reach of unlicensed users, while the licensed
users do not occupy the channel completely at all
times. Cognitive radio is an attractive concept and
solution capable of addressing this issue. This paper
investigates a statistical simulation model for spectrum
sensing of cognitive radio and the associated
interference probability calculation methodologies. The
capability to simulate multiple cognitive radio systems,
covering a complex range of spectrum engineering and
radio compatibility issues, is explored. Simulations
were carried out and output parameters such as sensing
received signal strength, cell capacity and achieved
bitrate were obtained and analyzed. The results were
obtained for different conditions with particular
emphasis on CDMA systems and OFDMA systems.
The article also highlights the results obtained for
studying detection threshold and the associated
interference probability.

KEYWORDS
Cognitive radio; spectrum management; software
radios; self-organizing networks; interference
calculation

1 INTRODUCTION
With the exponential increase in the use of high
powered wireless devices, spectrum availability,
or rather the lack of it, is a major
challenge. Conventional fixed spectrum allocation
results in a large part of frequency band remaining
underutilized. Channels that are dedicated to
licensed (primary) users are out of reach of
unlicensed users, while the licensed users do not
occupy the channel completely, at all times.
Cognitive radio is a concept that is capable of
addressing this issue; it is a form of dynamic
spectrum management that attempts to utilize the
channel to its full capacity.
The chief functions of cognitive radio are
spectrum sensing and management. Through
spectrum sensing, the cognitive radio becomes
aware of its environment and sensitive to its
changes. Spectrum sensing essentially involves
checking within a candidate channel if any
protected service is present and transmitting.
When a channel is found to be vacant, sensing is
typically applied to adjacent channels to identify if
any constraints on transmission power are present.
Spectrum sensing techniques can be classified as
transmitter detection, cooperative detection, and
interference-based detection [1].
A cognitive radio automatically detects the
available channels in wireless spectrum, then
accordingly changes its transmission or reception
parameters to allow more concurrent wireless
communications in a given spectrum band at a
given geographic location. Radio-system
parameters such as frequency, waveform and
protocol are intelligently configured and
monitored to determine the channel conditions and
frequency environment, so that settings are
adjusted to deliver the required quality of service
subject to an appropriate combination of
regulatory constraints, operational limitations
and user requirements.
Detecting a primary user as well as avoiding any
false alarm is of paramount importance for such
a system. In reality, it is very difficult for a
cognitive radio to have the information of direct
measurement of a channel between a primary
transmitter and receiver. One way to conduct a
measurement within the candidate channel to find
the presence or absence of any protected service
relies on knowledge of a parameter called the
detection threshold. For energy-detector based
spectrum sensing, the threshold is set to
distinguish signal from noise.
In the sensing based spectrum sharing model [2],
in which only the radio-frequency spectrum is
considered, a simple energy detector cannot
guarantee the accurate detection of signal
presence, necessitating complex spectrum sensing
devices and periodical exchange of information
about spectrum sensing. The changes in traffic or
user movement demand updates to be done in the
radio environment. The spectrum holes that are
detected through spectrum sensing is to be
followed up with analysis and decision making to
determine parameters such as mode of
transmission, data rate and bandwidth so that the
spectrum band is chosen in accordance with
requirements of user.
In addition to these technical challenges, the
practical implementation of dynamic spectrum
management must also conform to the rules and
regulations set out for radio spectrum access in
international law as well the legislation specified
by respective countries.
2 RELATED WORK
2.1 Artificial Intelligence for Cognitive Radios
Application of Artificial Intelligent (AI)
techniques to cognitive radio networks is a
promising field of research since such networks
should have the capacity to learn and adapt within
any layer of the radio communication system. The
application of various classes of AI techniques to
cognitive radio networks can be found in the
literature. Ref. [3] presented the adaptive
component, which uses Genetic Algorithms (GAs)
to evolve a radio defined by a chromosome whose
genes represent the adjustable parameters in a
given radio. The GA could find a set of parameters
that optimize the radio for the user's current needs
[3]. Ref. [4] described the physical components of
the spectrum sensing network as well as the
experimental system's information storage and
learning mechanisms. It was pointed out that the
spectrum sensing network of the project might be
extended to a universal research platform for
general cognitive radio devices, networks and
applications. The hidden Markov model is applied
for spectrum sensing [5]. An Artificial Neural
Network (ANN) based and a Case Based System
(CBS) based cognitive engines were presented as
case studies in [6]. The paper also noted that a
cognitive engine must carefully be designed,
observing the tradeoff between performance and
complexity determined by the application
requirement.
2.2 Simulators
Simulation greatly helps in seeing how the
systems and networks behave before actual
implementation. Simulation tools have become an
integral part of design and analysis in the domain
of wireless systems. Studies on simulation of
cognitive radio engine and networks have been
reported in the literature. An infrastructure-based
cognitive radio network consisting of one base
station and multiple users was simulated in [7].
Simulation results showed that the proposed
spectrum decision framework provided efficient
bandwidth utilization while guaranteeing the
service quality [7]. The simulator in the project [8]
was built in an object-oriented language, and its
purpose was to simulate what could happen if
cognitive users were allowed to transmit freely in
reality. The thesis also evaluated the simulator
with some realistic values, simulating radio traffic
for certain frequencies in environments the size of
cities or large countrysides.
Ref. [9] proposed and demonstrated the
advantages of a new random motion generator for
use in ns2 simulator for simulation of wireless
mobile networks. An existing ns-2 network
simulator was extended to support cognitive radio
network simulation in [10]. The article suggested
that simulators such as OPNET, QUALNET were
created for the ordinary wireless network and
hence researchers could not easily implement their
cognitive radio algorithms over those simulators.
A procedure that can be employed to generate
artificial time-frequency spectrum data in
simulation tools was presented in [11]. The
authors assumed a generic simulation scenario and
illustrated a possible simulation method to
generate artificial spectrum data based on the
presented models.

A new received signal strength (RSS) estimation
method based on neural networks was presented in
[12] for the antenna placement problem in
vehicle-to-vehicle communications. The article also
suggested that the computation time of the
proposed approach was much less than that of
the RSS simulator. Ref. [13] demonstrated how
existing software radio projects could be
leveraged to use the baseband modem in a wide
range of radio frequency bands. The paper also
described their approach and implementation of
power management and instrumentation, a key
building block in their modularized low-power
software radio platform. Some current
developments of wireless communication
technology such as short range communication,
cloud computing, bring your own device policy
(BYOD), devices tethering and convergences of
WiFi and cellular network technology were
presented in [14]. Ref. [15] presented a strategy by
means of infrastructure sharing between operators
considering sharing the backhaul network
infrastructure to improve resiliency among the
operators. Despite the resiliency mechanisms,
there are occasions when the network resources
are not available for the end users which
necessitates sharing another operator's backhaul,
thus decreasing the overall unavailability time [15].


3 SIMULATION
For radio compatibility, the interference at the
victim receiver input consists mostly of the
unwanted emissions from the transmitters as well
as blocking and intermodulation, considering only
the adjacent bands. The analytical and classical
approaches for the estimation of these interference
mechanisms treat the operation of radio
communications systems as static, without taking
into account user movement in mobile systems,
and hence they are rigid and difficult to implement.
Even with reasonable assumptions and
simplifications, the accuracy of such interference
assessment is impacted by the order and settings
of priorities taken in making those assumptions
and simplifications.
The simulations in this article have been performed
with the Spectrum Engineering Advanced
Monte-Carlo Analysis Tool (SEAMCAT), a
statistical simulation model originally developed in
the C++ programming language [16]. The software
tool provides adequate universality, including the
capability to directly simulate proliferating Code
Division Multiple Access (CDMA) systems, which
have complex power control mechanisms.
The basic principle of the algorithm is centered on
the detection threshold parameter that is used by a
cognitive device to detect the presence of a
protected services transmission, if any. If it
detects no emission above this threshold in a
channel, the white space device (WSD) is allowed
to transmit; otherwise the WSD keeps silent or
looks into other channels. The algorithm assumes
that the frequency of the interfering WSD depends
on the frequency range defined for the victim.
There are many ways in which the operating
frequency of the victim device can be defined:
constant, discrete as specified by the user, or
distributed between frequency limits dictated by
the number of possible channels that the WSD will
operate in. Similarly, the detection threshold can
be defined either as a constant or as a
user-defined function.
Once the frequencies to be tested are identified,
and if the signal transmitted by the wanted
transmitter to the victim receiver is found to be
greater than the sensitivity, the signal transmitted
by the wanted transmitter and sensed by the
interfering transmitter is calculated for each WSD
over all its possible channels as per (1),
considering the unwanted mask of the Digital
Terrestrial Television (DTT). Here, sRSS is the
sensing received signal strength; PWt is the
transmit power of the wanted transmitter (Wt); fm
is the frequency of the WSD; GWt->It is the
antenna gain of the Wt in the Wt-to-It direction;
GIt->Wt is the antenna gain of the interfering
transmitter (It) in the It-to-Wt direction; and L is
the path loss in dB between the It and the Wt.

sRSS(fm) = PWt(fm) + GWt->It + GIt->Wt + L    (1)
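As a minimal illustration of equation (1), the
following Python sketch evaluates the sensing link
budget for one WSD channel. The numeric values are
assumptions chosen for illustration; note that the
formulation above adds the path-loss term L, so
propagation loss is supplied here as a negative
quantity.

def sensing_rss_dbm(p_wt_dbm, g_wt_to_it_dbi, g_it_to_wt_dbi, path_loss_db):
    # Sensing received signal strength per equation (1); all terms in dB/dBm.
    # Following the paper's sign convention the path-loss term is added,
    # so a 130 dB propagation loss is passed in as -130.0.
    return p_wt_dbm + g_wt_to_it_dbi + g_it_to_wt_dbi + path_loss_db

# Assumed example: 43 dBm DTT transmit power, 11 dBi and 0 dBi antenna
# gains, and 130 dB of propagation loss.
srss = sensing_rss_dbm(43.0, 11.0, 0.0, -130.0)
print(f"sRSS = {srss:.1f} dBm")  # prints: sRSS = -76.0 dBm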
The sRSS output vector so calculated for each
WSD is shown in Fig.1. The figure shows the
results when five WSDs are considered. This is
followed by the detection of the channels in which
a WSD is allowed to transmit, by comparison with
the threshold as stated earlier; a minimal sketch of
this decision is given below. Constraints on
transmission power are then determined and
interference is calculated.
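The channel-availability decision itself reduces to
comparing each sensed level against the detection
threshold. The Python sketch below renders that rule
in minimal form; the channel names and sensed levels
are hypothetical.

def allowed_channels(srss_per_channel_dbm, detection_threshold_dbm):
    # A WSD may transmit only in channels where no protected emission
    # is sensed at or above the detection threshold.
    return [ch for ch, level in srss_per_channel_dbm.items()
            if level < detection_threshold_dbm]

# Hypothetical sensed levels (dBm) for one WSD over four DTT channels.
sensed = {"ch21": -55.0, "ch22": -92.0, "ch23": -110.0, "ch24": -60.0}
print(allowed_channels(sensed, detection_threshold_dbm=-91.0))
# prints: ['ch22', 'ch23'] -- the WSD keeps silent on ch21 and ch24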

Fig. 2. Frequency density of white space devices; WSD
frequency (MHz) (vs) probability density

Fig. 3 depicts the CDF of the frequency at which the
victim device transmits per event.

Fig. 1. Sensing received signal strength output for white
space devices (vs) dBm

In the simulation, multiple interferers are set as
cognitive radio devices when the interferer is
neither a CDMA nor an Orthogonal
Frequency-Division Multiple Access (OFDMA)
system. When an interferer is
set as cognitive radio, the power and emission
characteristics of the interfering WSD are input
and if a victim system is detected in the vicinity,
an appropriate operating frequency is selected and
its emission mask is lowered as per the Effective
Isotropic Radiated Power maximum (EIRP max)
limit defined in spectrum sensing characteristics.
Accordingly, detection threshold, probability of
spectrum sensing failure and bandwidth of the
sensing device are defined. The actual frequencies
at which WSDs are allowed to transmit, victim
frequency, sRSS etc. can be displayed in the form
of a vector, cumulative distribution function
(CDF) or probability density function. The
densities of frequencies of various WSDs are as
shown in Fig.2.

Fig. 3. Frequency density of white space devices; victim
frequency (MHz) (vs) CDF

4 RESULTS
Once the output vectors are read, the user can
perform interference calculation in the simulator
engine. The victim link and interfering links were
set up and the results were obtained for different
conditions.
4.1 Co-channel Interference between Fixed
Links
The interferer was set as a cognitive radio, which
seamlessly acquired the signal strength, since the
receiver and transmitters were fixed links, and
optimally utilized the spectrum in locations not
used. The parameter EIRP max was input with
three parameters: offset (MHz), mask (dBm) and
reference bandwidth of DTT (kHz). The probability
density function (PDF) is shown in Fig. 4.


Fig. 4. Probability Density Function for fixed mobile links

4.2 Interference between Mobile Links


When the cognitive radio and the victim link are
mobile, the determination and capture of the signal
become more time consuming compared to fixed
links. The simulator was able to perform this
function satisfactorily, as demonstrated in Fig. 5.
As can be seen from the figure, not all simulated
WSDs were active, subject to the EIRP limit
specified in the spectrum sensing characteristics.
Similarly, the average EIRP per event can also be
displayed.
4.3 Interference to CDMA System
CDMA systems offer high capacities compared to
other competing digital (TDMA) and analog (FM)
technologies. The CDMA capacity is interference-limited and any reduction in multiple access
interference converts directly and linearly to an
increase in capacity. A scenario was simulated
wherein the victim link was considered to be a
downlink component of a CDMA-based mobile
communications system in the 1932.5 MHz band,
which was potentially interfered with by the
co-channel operation of multiple short range
devices deployed in the same geographic area. The
environment chosen was rural and the antennas
were outdoor, on rooftops. Provisions were made
to set the wall loss, both indoor-to-indoor and
indoor-to-outdoor. The simulation results include
external interference due to unwanted signals and
selectivity.

Fig. 5. Average number of active white space devices for
each frequency

The simulator was able to yield CDMA-specific
results in two forms: capacity (Fig. 6) and outage
(Fig. 7). While the capacity graph shows the
number of mobile users optimally served in the
reference cell before and after the introduction of
interference, the ratio of non-connected mobile
users to the total number of mobile users assigned
to that cell before and after the introduction of
external interference is shown in the outage plot.
Here, the reference cell is taken by default to be
the centre cell of the CDMA cluster, but any other
cell can also serve as reference. The excess outage
due to the impact of external interference is
reflected in the parameter average capacity loss
(in %), which also indicates whether the
interference to the CDMA system is tolerable or
not; a minimal sketch of this calculation is given
below. In interference-limited systems, outage
probability and capacity are fundamental
parameters used in system analysis.
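The Python sketch below shows one plausible way of
computing such an average capacity loss figure from
per-event capacities; the event data and the exact
averaging formula are assumptions for illustration,
not output of the tool.

# Hypothetical per-event reference-cell capacities from a Monte-Carlo run.
non_interfered = [52, 49, 51, 50, 48]  # users served without external interference
interfered = [47, 44, 49, 45, 42]      # users served with the WSD interference active

def average_capacity_loss_percent(before, after):
    # Mean, over all events, of the per-event percentage capacity loss.
    losses = [100.0 * (b - a) / b for b, a in zip(before, after)]
    return sum(losses) / len(losses)

print(f"Average capacity loss = "
      f"{average_capacity_loss_percent(non_interfered, interfered):.1f} %")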

Fig. 6. Results of CDMA interference; cell capacity

One important highlight of the CDMA simulation
is that the display is highly interactive: all the
pertinent details about an element, such as a
mobile station or a base station, as well as the
general statistics for the system as a whole
(namely, non-interfered capacity, number of
generated mobile users, etc.) can be obtained. The
data pertaining to external interferers can also be
extracted.


Fig. 7. Results of CDMA interference; outage (%)

4.4 Interference to OFDMA system


In OFDMA systems, due to the multiuser diversity
gains and better handling of frequency-selective
fading, overall performance is improved. OFDMA
can prevent multipath interference with better
robustness and achieve higher MIMO spectral
efficiency compared to CDMA. The results of the
OFDMA simulation are given in terms of the
capacity/throughput loss of the OFDMA victim.
Under this condition, the victim link was
considered to be an uplink component of an
OFDMA-based mobile communications system.
The OFDMA simulation involves an iterative
process of assigning a variable number of
sub-carriers and calculating the overall traffic per
base station, after the overall two-tier cellular
structure is built and populated with mobile
stations. The UEs (user equipment) are deployed
randomly in the whole network region according
to a uniform geographical distribution, and a
wrap-around technique is employed to remove
network deployment edge effects (a minimal
sketch is given below). The simulation can be
performed for different cell layouts, such as 2-tier
3GPP (which specifies standards encompassing
radio, core network and service architecture) and
2-tier 3GPP2 (which specifies standards based on
CDMA2000), with specific cell radii. The
simulation results for each case present the
evolution of the achieved bitrate (in kilobits per
second) in the reference cell per event and the
evolution of the achieved bitrate for the whole
system per event. Fig. 8, Fig. 9 and Fig. 10 show
the simulation results for three cell layouts, namely
2-tier 3GPP with a cell radius of 5 km, 2-tier 3GPP
with a cell radius of 0.5 km, and 2-tier 3GPP2 with
a cell radius of 0.5 km, respectively.
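The following minimal Python sketch illustrates the
uniform UE deployment and the wrap-around distance
computation described above. The region size is an
assumed parameter; the tool's actual deployment code
is not reproduced here.

import random

REGION_SIZE_M = 10_000.0  # side of the square network region in metres (assumed)

def drop_ues(n, seed=1):
    # Uniform geographical deployment of n UEs over the network region.
    rng = random.Random(seed)
    return [(rng.uniform(0, REGION_SIZE_M), rng.uniform(0, REGION_SIZE_M))
            for _ in range(n)]

def wraparound_distance(a, b, size=REGION_SIZE_M):
    # Torus distance used to remove edge effects: per axis, take the
    # shorter of the direct and the wrapped separation.
    dx = min(abs(a[0] - b[0]), size - abs(a[0] - b[0]))
    dy = min(abs(a[1] - b[1]), size - abs(a[1] - b[1]))
    return (dx * dx + dy * dy) ** 0.5

ues = drop_ues(100)
print(f"UE0-UE1 wrap-around distance: {wraparound_distance(ues[0], ues[1]):.1f} m")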

Fig. 8. Achieved bitrate (kbps) for 2-tier 3GPP system with a
cell radius of 5 km; reference cell (top plot), whole system
(bottom plot)

In addition, simulations can be done and the user
can extract various vectors for post-analysis. These
vectors are for the analysis of the achieved bitrate
(with or without external interference) and the cell
capacity (i.e. the number of active users per cell)
with or without interference. Vectors such as
average interfered bitrate, average non-interfered


bitrate, coupling loss percentile, interfered
capacity, non-interfered capacity and signal to
interference plus noise ratio (SINR) can be
generated for the reference cell in particular or for
the whole OFDMA system. The achieved bit rate
is calculated as shown in (2):

Bitrate (kbps) = x * B / 1000    (2)

Here, x is the spectral efficiency (in bps per Hz)
corresponding to the calculated SINR, B is the
channel bandwidth in Hz, and the division by 1000
converts bps to kbps.
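A minimal Python sketch of equation (2) follows.
Since the paper does not specify the
SINR-to-spectral-efficiency mapping, the Shannon
bound log2(1 + SINR) is used below purely as an
illustrative stand-in, and the 180 kHz bandwidth is
an assumed example value.

import math

def achieved_bitrate_kbps(sinr_db, bandwidth_hz):
    # Equation (2): spectral efficiency x times bandwidth B, in kbps.
    sinr_linear = 10.0 ** (sinr_db / 10.0)
    x = math.log2(1.0 + sinr_linear)   # assumed SINR-to-x mapping (Shannon bound)
    return x * bandwidth_hz / 1000.0   # bps -> kbps

print(f"{achieved_bitrate_kbps(0.0, 180_000):.0f} kbps")  # prints: 180 kbps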

Thus, a summary of the average capacity and
average bitrate loss, expressed in percentages for
both the reference cell and the entire OFDMA
network (i.e. the whole system), can be obtained.
The percentage calculation is performed for each
event and the mean of the percentages over all
events is deduced. The results for the three cases
are shown in Table 1.

Table 1. Average bitrate loss (in %) for different cell layouts
and cell radii

Cell layout             Cell Radius (km)   Reference Cell (%)   OFDMA system (%)
2 tiers, single sector         5                 100                 21.628
2 tiers, single sector         0.5                85.247              27.096
2 tiers, 3GPP                  5                 100                 11.731
2 tiers, 3GPP                  0.5                86.422               9.159
2 tiers, 3GPP2                 0.5                74.154               8.904

By observing the table, we can deduce the
relationship of the average bitrate loss to the cell
layout and cell radius. The lowest bitrate loss
corresponds to the 3GPP2 system with the smallest
cell radius, while the 3GPP and single sector
systems yield higher bitrate losses. Reducing the
cell radius further would further reduce the bitrate
loss.

Fig. 9. Achieved bitrate (kbps) for 2-tier 3GPP system with a
cell radius of 0.5 km; reference cell (top plot), whole system
(bottom plot)


Fig. 10. Achieved bitrate (kbps) for 2-tier 3GPP2 system with
a cell radius of 0.5 km; reference cell (top plot), whole system
(bottom plot)

Simulations were also performed for a system with
an OFDMA uplink victim and an OFDMA uplink
interferer. A 2-tier 3GPP2 cell layout was chosen
and a free-space propagation model was
considered. Fig. 11 shows the evolution of the
achieved bitrate in kbps for the reference cell and
the entire system. This particular simulation
resulted in an average bitrate loss of 0.382% for
the reference cell and 0.287% for the overall
system.


Fig. 11. Achieved bitrate (kbps) for the OFDMA UL victim -
OFDMA UL interferer system; reference cell (top plot),
whole system (bottom plot)

We can also analyze the
signal-to-interference-plus-noise ratio (SINR) of
the victim reference cell and the victim system
from the simulation results. Fig. 12 and Fig. 13
depict the probability density for different values
of SINR for the victim reference cell and the
victim system, respectively.

Fig. 13. Probability density function (%) (vs) SINR (dB) for
the victim system

4.5 Detection Threshold Analysis and Interference
Calculations
A scenario was set up involving a single victim
link and several white space devices. The
interferers were set as cognitive radios and the
parameters of the victim and the interfering links
were adjusted appropriately. The simulation was
performed for different values of the detection
threshold and the interference calculations were
obtained. The simulation results shown are for
three different values of the detection threshold,
namely 50, 0 and -100. Fig. 14 and Fig. 15 show
the results for a detection threshold of 50.

Fig. 12. Probability density function (%) (vs) SINR (dB) for
the victim reference cell

The figures show that the SINR levels due to
interference from the OFDMA UL interfering
links for this particular scenario are mainly around
-5 dB, with a highest value of around 0 dB, for
both the reference cell and the overall system.
Fig. 14. Probability (%) (vs) blocking response level/victim
link for detection threshold of 50

The type of interference signal chosen for
calculations was both unwanted and blocking. The
calculations were performed in both the
compatibility mode and translational mode to aid
in analysis.


The summary of the compatibility results, which
comprises a single interference probability value
for each detection threshold, is shown in Table 2.

Table 2. Interference probability in compatibility mode for
different values of detection threshold

Detection threshold    Interference probability (%)
50                     0.07
0                      0.06
-100                   0.0

Fig. 15. Probability (%) (vs) power supplied/CRS transmitter
for detection threshold of 50

Fig. 16 and Fig. 17 display the results for a
detection threshold of 0, and Fig. 18 and Fig. 19
those for a detection threshold of -100.

Fig. 18. Probability (%) (vs) blocking response level/victim
link for detection threshold of -100

Fig. 16. Probability (%) (vs) blocking response level/victim
link for detection threshold of 0

Figures 14, 16 and 18 indicate the probability of
interference as a function of the blocking response
level of the victim receiver, whereas figures 15, 17
and 19 indicate the probability of interference as a
function of the output power of the interfering
transmitter.

The simulation results showed that the probability
of interference reduces as the detection threshold is
reduced, and that a detection threshold of -100 or
less yields zero probability of interference for this
scenario. Instead of choosing a constant detection
threshold value, the threshold can be made
adaptive, or defined as a function of other
parameters, in order to yield better results.
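To make this relationship concrete, the following toy
Monte-Carlo sketch in Python mimics the
compatibility-mode calculation: a WSD transmits only
when its sensed level is below the detection
threshold, and an event counts as interference when
the resulting interference-to-noise ratio at the
victim exceeds a limit. Every distribution and
parameter here is an assumption for illustration, not
the tool's internal model.

import random

def interference_probability(threshold_dbm, events=20_000, i_n_limit_db=0.0, seed=7):
    # Fraction of events (in %) in which an interfering WSD both transmits
    # and pushes the victim's I/N ratio above the limit.
    rng = random.Random(seed)
    interfered = 0
    for _ in range(events):
        srss = rng.gauss(-80.0, 15.0)   # sensed victim signal at the WSD (assumed)
        if srss >= threshold_dbm:
            continue                    # WSD keeps silent: no interference
        i_n = rng.gauss(-10.0, 8.0)     # I/N at the victim if the WSD transmits (assumed)
        if i_n > i_n_limit_db:
            interfered += 1
    return 100.0 * interfered / events

for thr in (50, 0, -100):
    print(f"threshold {thr:>4} -> P(interference) = {interference_probability(thr):.2f} %")

As in Table 2, the estimated probability is
essentially unchanged between the two high thresholds
(the WSD always transmits) and collapses once the
threshold forces the WSD to stay silent in most events.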

Fig. 19. Probability (%) (vs) power supplied/CRS transmitter
for detection threshold of -100

5 DISCUSSION AND FUTURE WORK

Fig. 17. Probability (%) (vs) power supplied/CRS transmitter
for detection threshold of 0


The simulation model supported multiple radio
systems and compatibility analysis. The spectrum
sensing phenomenon, whereby the interfering
devices try to detect the presence of protected
services transmitting in each of the potentially
available channels, was incorporated in the model
through the inclusion of the detection threshold
and the selection
of operating frequency of the white space device.
The simulator model uses an effective algorithm
to detect the presence or absence of a protected
service's transmission. If it detects no emission
above this threshold in a channel, the white space
device is allowed to transmit; otherwise the white
space device keeps silent or looks into other
channels.
The simulator model took spatial and temporal
distributions of the received signals into account
and the statistical probability of interference was
examined for a wide variety of scenarios. The
various output features such as sensing received
signal strength, WSD frequency and its density,
CDF of victim frequency, EIRP, average EIRP per
frequency and average active WSDs per frequency
gave useful information about interference
assessment. In the case of an interferer being set as
a cognitive radio, the spectrum sensing
characteristics included EIRP max which is an
important parameter to see if a radio solution is
within the values allowed by local regulatory
bodies.
The model was able to simulate the precise mutual
positioning of the systems under consideration,
hence demonstrating the efficiency of use of the
radio spectrum. It was seen that the issues of a
complex range of spectrum engineering and the
corresponding radio compatibility could be dealt
with. Though the model exhibited
highly interactive features in simulating CDMA
and OFDMA systems including that of individual
element details and whole system characteristics
such as average capacity loss and average bitrate
loss, it lacked the ability to simulate
CDMA/OFDMA as victim and cognitive radio
interferer. This unconventional setting can be
included in the future algorithm. The time
complexity of calculations in the CDMA/
OFDMA module can also be improved with
enhancement in the detection algorithm.
In addition, interference analysis and compatibility
for different cases were explored and analyzed
with the help of the Interference Calculation
Engine. Adaptive detection threshold values can
be implemented and studied for better interference

management. A more accurate interference
assessment would also pave the way for the design
of efficient interference mitigation mechanisms.
6 CONCLUSION
Cognitive radio is capable of addressing the
challenges associated with the fixed spectrum
assignment policies in today's wireless domain. It
has to be able to change the waveform and
operational parameters in accordance with
environmental and user changes and to examine
sensory input, to learn and adjust inner operations.
The chief functions of cognitive radio are
spectrum sensing and management. While the
cognitive radio becomes aware of its environment
through spectrum sensing, the learning and
adaptability is attained by having artificial
intelligence capabilities.
Simulation tools have become an integral part of
design and analysis in the domain of wireless
systems. This article investigated a statistical
simulation model, its algorithm and output
features in detail and found that it provided
adequate universality including the capability to
simulate CDMA systems that have complex power
control mechanisms. In this paper, OFDMA
specific results were analyzed for different cell
layouts and cell radii and the average bitrate
losses, for the reference cell as well as the overall
system, for some of the cases were compared. The
model supported sufficient interference probability
calculation through its engine. This allowed the
simulation of different scenarios by varying the
detection threshold and monitoring the interference
probability using both the compatibility and
translational modes.
The practical implementation of dynamic
spectrum management must also conform to the
rules and regulations set out for radio spectrum
access in international law as well as the
legislation specified by respective countries. With more
research and refinement happening in spectrum
sensing, decision and management, the resulting
simulation models of cognitive networks will
shape the communication technologies and
protocols of the future.

7 REFERENCES
1. Akyildiz, I. F., Lee, W. Y., Vuran, M. C., Mohanty, S.: NeXt Generation/Dynamic Spectrum Access/Cognitive Radio Wireless Networks: A Survey. Computer Networks 50, 2127-2159 (2006).
2. Kang, X., Liang, Y., Garg, H. K., Zhang, L.: Sensing-Based Spectrum Sharing in Cognitive Radio Networks. IEEE Transactions on Vehicular Technology 58, 4649-4654 (2009).
3. Rondeau, T. W., Le, B., Rieser, C. J., Bostian, C. W.: Cognitive Radios with Genetic Algorithms: Intelligent Control of Software Defined Radios. In: Proc. SDR '04 Technical Conference and Product Exposition, Phoenix, USA (2004).
4. Brendel, J., Riess, S., Stoeckle, A., Rummel, R., Fischer, G.: A Spectrum Sensing Network for Cognitive PMSE Systems. Jondral, F. K., Mühlhaus, M. S. (eds.), Frequenz, vol. 66, no. 9-10, pp. 269-278 (2012).
5. Kim, K., Akbar, I. A., Bae, K. K., Urn, J., Spooner, C. M., Reed, J. H.: Cyclostationary Approaches to Signal Detection and Classification in Cognitive Radio. In: Proc. the 2nd IEEE International Symposium on New Frontiers in Dynamic Spectrum Access Networks (DySPAN), pp. 212-215, Dublin, Ireland (2007).
6. He, A., Bae, K., Newman, T., Gaeddert, J., Kim, K., Menon, R., Morales, L., Neel, J., Zhao, Y., Reed, J., Tranter, W.: A Survey of Artificial Intelligence for Cognitive Radios. IEEE Transactions on Vehicular Technology, Special Issue on Cognitive Radio 59, 1578-1592 (2010).
7. Lee, W., Akyildiz, I. F.: A Spectrum Decision Framework for Cognitive Radio Networks. IEEE Transactions on Mobile Computing 10, 161-174 (2011).
8. Sundman, D.: A Simulator for Cognitive Radio. Master's degree project, KTH Electrical Engineering, Stockholm, Sweden (2008).
9. Alghamdi, R., DeDourek, J., Pochec, P.: Evaluation and Improvements to Motion Generation in ns2 for Wireless Mobile Network Simulation. International Journal of Digital Information and Wireless Communications 2, 84-92 (2012).
10. Chigan, C.: Development of NS-2 Based Cognitive Radio Cognitive Network Simulator. Co-PI: Project Partially Supported by US Army Communications-Electronics Research Development and Engineering Center (CERDEC).
11. López-Benítez, M., Casadevall, F.: Spectrum Usage Models for the Analysis, Design and Simulation of Cognitive Radio Networks. In: Venkataraman, H., Muntean, G. M. (eds.), Cognitive Radio and its Application for Next Generation Cellular and Wireless Networks, Lecture Notes in Electrical Engineering, DOI: 10.1007/978-94-007-1827-2_2, Springer Science+Business Media, Dordrecht (2012).
12. Talepour, Z., Kondori, H., Barakati, M., Mehrjoo, M., Shokouh, J. A.: Received Signal Strength Estimation in Vehicle-to-Vehicle Communications Using Neural Networks. International Journal of Digital Information and Wireless Communications 3, 42-47 (2013).
13. Szilvási, S., Babják, B., Lédeczi, Á., Völgyesi, P.: Towards a Versatile Wireless Platform for Low-power Applications. International Journal of Digital Information and Wireless Communications 1, 401-414 (2011).
14. Noor, M. M., Hassan, W. H.: Wireless Networks: Developments, Threats and Countermeasures. International Journal of Digital Information and Wireless Communications 3, 119-134 (2013).
15. Philip, V. D., Yvon, G., Djamal, Z.: Infrastructure Sharing: A Cost Effective Alternative for Resiliency in 4G-LTE Mobile Networks. International Journal on New Computer Architectures and Their Applications 2, 113-126 (2012).
16. SEAMCAT Manual, http://seamcat.iprojects.dk/wiki/Manual


International Journal of Digital Information and Wireless Communications (IJDIWC) 4(2): 202-212
The Society of Digital Information and Wireless Communications, 2014 (ISSN: 2225-658X)

A License Management System for Content Separate Delivery over P2P Network
Masaki Inamura and Keiichi Iwamura
Department of Electrical Engineering, Graduate School of Engineering, Tokyo University of Science.
6-3-1 Niijuku, Katsushika-ku, Tokyo, 125-8585 Japan.
minamura@sec.ee.kagu.tus.ac.jp, iwamura@ee.kagu.tus.ac.jp

ABSTRACT
We propose a license management system for content
separate delivery over a P2P network. In typical content
delivery systems, a content file is delivered to a user
together with license data bound to a content-key.
However, the server needs to administer not only user
licenses but also content files, so the server's load is
heavy in these systems. In our proposed system, users
can obtain a content file not only from the content
delivery server but also from other users, and can be
licensed for the content-key separately from the content
delivery. Therefore, our proposed system is lightweight
and scalable. Furthermore, to show its feasibility, we
have implemented a prototype system. As the result of
the evaluation, we show that our proposed system is
secure and practical.

KEYWORDS
Digital rights management, Peer-to-peer, Content
delivery, Separate delivery, License Management

1 INTRODUCTION
Recently, the internet has become a common
technology and makes it possible for users to do
their work efficiently. Especially, as a result of the
development of broadband networks, cellular
phone networks and wireless networks, digital
content delivery services over computer networks
have become widespread.
Nowadays, most current content delivery systems
are based on the server-client model, e.g. iTunes
Store [1] and YouTube [2]. In this model, content
and license information are stored on the server of
a content provider. Therefore, the content provider
has to handle a huge amount of traffic in response
to users' requests. On the other hand, content
distribution over a P2P (Peer-to-Peer) network is a
promising model, because P2P communication
avoids concentrating traffic on the server and is
also an alternative business model for marketing
rich content. Accordingly, P2P content distribution
systems such as BitTorrent [3] and SkeedCast [4]
have been highlighted. However, current P2P
content distribution systems cannot handle digital
rights bound to the content. Eventually, piracy
caused by illegal use of P2P content distribution
systems has become a serious problem. For the
further diffusion of content distribution services
over P2P networks, it is urgently necessary to
construct a method of prevention against illegal
content distribution.
One of problems about copyright protection is
illegal copy from content file. Current DRM
(Digital Rights Management) methods need to
manage content/content-key on a server and to
send encrypted content-key with the user-key to
the user for prohibiting other user from using this
content file without content-key. In this method,
one server (or one group of server administered by
same license administrator) administers all users'
license and content files and protects copyrights
against illegal content distribution. So this method
is adapted to P2P content distribution systems
directly, because a user can obtain content files
from other users. Up to now, there is no DRM
mechanism that is suitable for P2P content
distribution systems. For the purpose to deploy
P2P content distribution systems securely, DRM is
necessary function to be provided.
Taking into account of the above background, we
propose a secure content delivery system that is
optimal for the P2P network, where multimedia
content can be securely delivered by means of P2P
communication while license administrator issues
the license with a small amount of network
resource and computational power. More precisely,
our proposed system is based on so-called separate
delivery model, namely a user can send encrypted
content to another user over the P2P network, and that user can decrypt the received content using license information obtained from the license administrator. An outstanding feature of the proposed method is that the license administrator only manages one master secret and only re-binds a content-key to the target user on request. Therefore, our proposed system is lightweight and scalable. Furthermore, to show its feasibility, we have implemented a prototype system. The evaluation results show that our proposed system is secure and practical.
2 RELATED WORKS
One of the fundamental methods in DRM systems is Super-distribution [5]. The Super-distribution model permits users to copy content, and the content provider collects a charge for each content item. As a result, a content provider can realize a content distribution service over many kinds of networks, and recent content distribution services tend to use the Super-distribution model. In this model, however, each user has to connect to the license administration server whenever the user enjoys one or more contents. For this reason, the license issuing burden on the license administrator is higher than in other content distribution models, and the license administration server is easily overloaded. No efficient mechanism to solve this problem has been proposed.
The Super-distribution model is also based on separate delivery, where a license is issued separately from content distribution. Therefore, the model is applicable to P2P networks: content can be delivered by means of peer-to-peer communication while the license is issued over a direct connection between the license administrator and a user. When content providers distribute content to users over a P2P network, the distribution traffic is smoothed over the whole P2P network, so the total number of content distribution transactions will increase. Eventually, the number of license issuing transactions is also expected to increase. For this reason, we think it is important to reduce the license issuing load on the license administration server.
To reduce the server overload, methods in which the license administrator delegates the license issuing process to users' terminals have been proposed [6-8]. In these methods, a user who has an encrypted content and its license can re-encrypt and distribute this content to other users on behalf of the content providers; that is, they are based on a combined model. As a result, the content providers do not need to distribute content from a particular server. However, because a user can distribute content without a license notification, the license administrator must provide other means of license control.
Meanwhile, methods in which a content provider can specify a limited transfer period, after which a user cannot use the content, have been proposed [9-15]. In these methods, a trusted server that can re-encrypt content for another user for license verification is introduced, and a content-key can expire after a limited date due to re-encryption of the content. Therefore, a content provider can control indefinite distribution of content. A user needs to receive a license from the license administrator for continued content use, so the administrator can receive information used for charging. However, a user can use the content until the expiry date, and the license administrator or content providers cannot know user information within this term. Furthermore, these methods need an extra third party besides the user terminals and the servers under the license administrator's control.
3 PROPOSED SYSTEM
3.1 Service Model
We show the proposed service model in Figure 1. We suppose that our model has a license administration class and a user class over a computer network.

Figure 1. Service model.

(1) License Administration Class
The license administration class is a group of servers for license administration. Each server within this class has one of three functions: providing content; issuing content-keys to users and charging users; or administering content on behalf of content providers, collecting charges from content-key issue servers and dividing the charges among content providers. We call the server for each function the Content provider, the Content-key issue server and the Content administration server, respectively. No P2P network is structured among these servers.
Furthermore, Content-key issue servers hold only the common master-key shared among entities in the license administration class, and do not hold or administer content-keys or user-keys. Therefore, the license administrator can increase the number of content-key issue servers according to the scale of the service.
(2) User Class
The user class has plural users on the network, and these users connect to each other over the P2P network. Furthermore, users can connect to servers in the license administration class over direct communication.
(3) Communication and Transactions in/between the License Administration Class and the User Class
A user can obtain a Content-File, defined in section 3.2, from a content provider directly or from another user over a P2P connection. In this paper, we suppose that any user can obtain a Content-File and request content-key issuance without a service entry procedure.
If a Content-File has not been distributed yet, or a user cannot find the Content-File in the user class, the user obtains it from a content provider directly. If the Content-File has already been distributed and a user can find it in the user class, the user obtains it from another user. After a new user obtains a Content-File, the user picks out of the Content-File the content-key encrypted with the former user's user-key, and the former user's user-key (and metadata) encrypted with the master-key. The new user then sends his/her user-key and the picked data to a content-key issue server. The content-key issue server re-encrypts the content-key with the new user's user-key and the new user-key (and metadata) with the master-key, and sends the created data to the new user. At the same time, the new user pays a charge to the content-key issue server. The content-key issue server sends this charge and the content information in the metadata to a content administration server, which divides the charge according to the content information. In this paper, however, charging is beyond the scope of our method, so we omit the charging protocol (including charge collection and division).
3.2 Outline of the Content-File Format
In present content distribution systems, the license administrator stores all content-keys and administers licenses. Therefore, users need to request content-key or license issuance from this administrator directly. Furthermore, the license administrator controls content distribution and denies illegal distribution or use. However, the license administrator needs to search for a content-key in a huge content-key database on the administrator's server and to issue a license whenever the administrator receives a request from a user. This concentrates the content-key issuing process on specified servers, and we expect it to overload these servers if the number of contents increases greatly and/or many users request content-key issuance within a short period. Therefore, present DRM is not realistic under large-scale content distribution over a P2P network.

Concerning this problem, we propose new key management and authentication mechanisms for an efficient content distribution service. In the mechanism, we define a file set including the content-key, license data (user-key and metadata), content and other data, which we call the Content-File. In the Content-File, each data item is encrypted with an encryption key such that only an entity that has the master-key can decrypt and obtain all data in the Content-File. If the license administrator has the master-key, the administrator can get the content-key and license data from the Content-File without any searching process. Therefore, we can decrease the burden on the license administrator.
We show the proposed Content-File format in Figure 2.

Figure 2. Content-File format.

The license administrator manages the servers that hold the master-key, which we call content-key issue servers. When a user obtains the Content-File, the user decrypts the content-key with his/her user-key and decrypts the content with the decrypted content-key.
To distribute content, a user sends the Content-File to other users over the P2P network after re-encrypting it with a temporal user-key. Another user who receives the Content-File requests the content-key issue server to re-encrypt the content-key in the Content-File with his/her user-key in order to use the content. After the receiving user receives the re-encrypted content-key from the server, the user can use the content.
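To make the layering concrete, the following Python sketch builds a Content-File under the format just described. The toy SENC (a SHA-256 counter-keystream XOR) merely stands in for the AES-128 symmetric cipher used later in the evaluation, and all keys and data here are hypothetical:

import hashlib, os

def senc(message: bytes, key: bytes) -> bytes:
    # Toy SENC(m, k): XOR with a SHA-256 counter keystream.
    # Stand-in for the AES-128 symmetric encryption used in the paper.
    out = bytearray()
    for block in range(0, len(message), 32):
        stream = hashlib.sha256(key + block.to_bytes(8, "big")).digest()
        out.extend(b ^ s for b, s in zip(message[block:block + 32], stream))
    return bytes(out)

sdec = senc  # XOR keystream: decryption is the same operation

Km, Kc, Kud1 = os.urandom(16), os.urandom(16), os.urandom(16)  # hypothetical keys
Cont = b"...multimedia content..."
Meta = b"CID=42"

# Content-File: every part is encrypted; only a holder of the
# master-key Km can recover Kud1 (hence Kc) without any key search.
content_file = {
    "license": senc(Meta + b"||" + Kud1, Km),  # SENC(Meta||Kud1, Km)
    "key":     senc(Kc, Kud1),                 # SENC(Kc, Kud1)
    "body":    senc(Cont, Kc),                 # SENC(Cont, Kc)
}
assert sdec(content_file["body"], Kc) == Cont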
3.3 Protocol
In this chapter, we propose a protocol realizing our service model. The protocol consists of three parts: content direct distribution, content-key issuance, and content distribution through P2P connection. In the content direct distribution protocol, a user obtains a Content-File from a content provider directly when the Content-File has not yet been distributed among users or the user cannot find it over P2P. In the content distribution protocol through P2P connection, a user obtains a Content-File through a P2P connection when the user can find the Content-File over P2P. In the content-key issuance protocol, a user obtains his/her own Content-File from a content-key issue server by re-encryption of the content-key with his/her user-key. Generation of the user-key is based on our previous method, which binds content to the user terminal by the terminal ID.
We define the assumptions and notations of this protocol as follows:
Assumptions:
- Servers in the license administration class are honest and trusted.
- The network in the license administration class is secure.
- All content-key issue servers have a common master-key.
- Each user terminal has its own user-key.
Notations:
Ku_1st, Ku_2nd, Ku_for: user-keys (1st is a user obtaining content from a content provider directly, 2nd is a user obtaining the content from a former user over a P2P connection, and for is a former user who wants to distribute content over a P2P connection).
Kt_n: shared temporal key for a connection between a server in the license administration class and a user in the user class, or among user terminals (n is a numeral).
Km: master-key.
Kud_n: temporal user-key for content distribution.
Kc: content-key.
SENC(m, k): message m encrypted with symmetric encryption using key k.
Cont: content data.
Meta: metadata regarding the content data.
CID: content ID.
p_x, g_x, s_x: prime number (p), generator (g) and random value (s) used in the Diffie-Hellman key exchange algorithm for generating Kt_n (x is a numeral).
pu_x, gu_x, su_x: prime number (pu), generator (gu) and random value (su) used in the Diffie-Hellman key exchange algorithm for generating a user-key (x is a numeral).
ID_1st, ID_2nd: terminal ID of a user terminal (1st is a user obtaining content from a content provider directly, and 2nd is a user obtaining the content from a former user over a P2P connection).
f(*): transforming function of a terminal ID.
||: concatenation.
(1) Content Direct Distribution:
The content direct distribution protocol consists of two phases: setup of the Content-File for distribution, and distribution of the Content-File to a user. We show the content direct distribution protocol in Figure 3.

Figure 3. Content Direct Distribution.

[Phase 1] Setup of the Content-File for distribution:
1. Content Provider -> Content-key Issue Server: SENC(Cont, Kc) and Kc with concatenation.
2. Content-key Issue Server -> Content Provider: SENC(Kc, Kud_1) and SENC(Meta||Kud_1, Km) with concatenation.
3. In Content Provider: setup of SENC(Meta||Kud_1, Km), SENC(Kc, Kud_1) and SENC(Cont, Kc).
[Phase 2] Distribution of the Content-File:
1. User -> Content Provider: content request.
2. Content Provider -> User: r_0, g_0, p_0 and g_0^{s_0} mod p_0 with concatenation.
3. User -> Content Provider: r_1, g_0^{s_1} mod p_0 and SENC(r_0, Kt_0) with concatenation. (Kt_0 is g_0^{s_0 s_1} mod p_0.)
4. Content Provider -> User: SENC(r_1, Kt_0) (after verifying that r_0 is decrypted correctly from SENC(r_0, Kt_0)).
5. User -> Content Provider: SENC(CID, Kt_0) (after verifying that r_1 is decrypted correctly from SENC(r_1, Kt_0)).
6. Content Provider -> User: SENC(SENC(Meta||Kud_1, Km) || SENC(Kc, Kud_1) || SENC(Cont, Kc), Kt_0)
7. In User: setup of SENC(Meta||Kud_1, Km) and SENC(Kc, Kud_1). (SENC(Cont, Kc) has already been set up.)
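Steps 2-5 of Phase 2 form a Diffie-Hellman key agreement followed by a challenge-response confirmation. A minimal Python sketch of the key agreement, using small demonstration parameters (a real deployment would use a much larger prime):

import hashlib
from secrets import randbelow

p0, g0 = 4294967291, 5      # demonstration prime and generator (hypothetical values)

s0 = randbelow(p0 - 2) + 1  # content provider's secret exponent
s1 = randbelow(p0 - 2) + 1  # user's secret exponent

A = pow(g0, s0, p0)         # sent by the provider in step 2 (with r0)
B = pow(g0, s1, p0)         # sent by the user in step 3 (with r1)

# Both sides obtain Kt0 = g0^(s0*s1) mod p0.
assert pow(B, s0, p0) == pow(A, s1, p0)

# Derive a byte key for SENC from Kt0; the r0/r1 round trips of
# steps 3-5 then confirm that both sides computed the same Kt0.
Kt0 = hashlib.sha256(str(pow(A, s1, p0)).encode()).digest()[:16]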

(2) Content-key Issuance:
The user cannot know Kud_1 from the data of steps 2-7 in (1), and so the user cannot yet use this Content-File. The content-key issuance protocol is for using this file. The protocol consists of one phase. We show this protocol in Figure 4.

Figure 4. Content-key Issuance.

In this section, as an example, a content-key issue server sends the content-key encrypted with Ku_1st to the 1st user.
[Phase] Setup of the seeds for the user-key and the Content-File for the user:
1. User -> Content-key Issue Server: content-key request.
2. Content-key Issue Server -> User: r_2, g_1, p_1 and g_1^{s_2} mod p_1 with concatenation.
3. User -> Content-key Issue Server: r_3, g_1^{s_3} mod p_1 and SENC(r_2, Kt_1) with concatenation. (Kt_1 is g_1^{s_2 s_3} mod p_1.)
4. Content-key Issue Server -> User: SENC(r_3, Kt_1) (after verifying that r_2 is decrypted correctly from SENC(r_2, Kt_1)).
5. User -> Content-key Issue Server: Ack (after verifying that r_3 is decrypted correctly from SENC(r_3, Kt_1)).
6. Content-key Issue Server -> User: SENC(gu_x || pu_x || gu_x^{su_x} mod pu_x, Kt_1)
7. In User: setup of gu_x, pu_x and gu_x^{su_x} mod pu_x.
8. In User: generation of gu_x^{f(ID_1st)} mod pu_x and Ku_1st. (Ku_1st is gu_x^{su_x f(ID_1st)} mod pu_x.)
9. User -> Content-key Issue Server: SENC(SENC(Meta||Kud_1, Km) || SENC(Kc, Kud_1) || gu_x^{f(ID_1st)} mod pu_x, Kt_1)
10. Content-key Issue Server -> User: SENC(SENC(Meta||Ku_1st, Km) || SENC(Kc, Ku_1st), Kt_1)
11. In User: setup of SENC(Meta||Ku_1st, Km) and SENC(Kc, Ku_1st). (SENC(Cont, Kc) has already been set up.)
When using the Content-File, the user dynamically generates Ku_1st from gu_x^{su_x} mod pu_x in the seeds for the user-key and ID_1st.
If a user needs to pay a charge for content use, the user pays the charge to the content-key issue server. The content administration server collects the metadata and the charges for content use from the content-key issue servers at a fixed pace, and divides these charges among the content providers according to the metadata.
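The privacy property of this user-key generation (the server only ever sees gu_x^{f(ID)} mod pu_x, never the terminal ID itself) can be sketched in Python as follows; here f is assumed to hash the ID to an exponent, and the parameters are demonstration values:

import hashlib
from secrets import randbelow

pu, gu = 4294967291, 5               # hypothetical DH seed parameters

def f(terminal_id: bytes) -> int:
    # Transforming function of the terminal ID (assumed: hash to exponent).
    return int.from_bytes(hashlib.sha256(terminal_id).digest(), "big") % (pu - 1)

su = randbelow(pu - 2) + 1           # server's random value for the seed

# Steps 6/7: the server sends gu, pu and gu^su mod pu to the user.
seed = pow(gu, su, pu)

# Step 8 (user side): Ku_1st = gu^(su * f(ID_1st)) mod pu.
ID_1st = b"TERMINAL-0001"
Ku_1st_user = pow(seed, f(ID_1st), pu)

# Steps 9/10 (server side): the server receives only gu^f(ID_1st) mod pu,
# yet derives the same key by raising it to su -- the raw ID stays private.
blinded_id = pow(gu, f(ID_1st), pu)
Ku_1st_server = pow(blinded_id, su, pu)
assert Ku_1st_user == Ku_1st_server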
(3) Content Distribution through P2P Connection:
The content distribution through P2P connection protocol consists of two phases: setup of the Content-File for distribution, and distribution of the Content-File to a user. We show this protocol in Figure 5.
[Phase 1] Setup of the Content-File for distribution:
1. Former User -> Content-key Issue Server: distribution request.
2. Content-key Issue Server -> Former User: r_4, g_2, p_2 and g_2^{s_4} mod p_2 with concatenation.
3. Former User -> Content-key Issue Server: r_5, g_2^{s_5} mod p_2 and SENC(r_4, Kt_2) with concatenation. (Kt_2 is g_2^{s_4 s_5} mod p_2.)
4. Content-key Issue Server -> Former User: SENC(r_5, Kt_2) (after verifying that r_4 is decrypted correctly from SENC(r_4, Kt_2)).
5. Former User -> Content-key Issue Server: SENC(SENC(Meta||Ku_1st, Km) || SENC(Kc, Ku_1st), Kt_2) (after verifying that r_5 is decrypted correctly from SENC(r_5, Kt_2)).
6. Content-key Issue Server -> Former User: SENC(SENC(Meta||Kud_2, Km) || SENC(Kc, Kud_2), Kt_2)
7. In Former User: setup of SENC(Meta||Kud_2, Km) and SENC(Kc, Kud_2). (SENC(Cont, Kc) has already been set up.)

[Phase 2] Distribution of the Content-File to a user:
A) New User -> Former User: content request.
B) Former User -> New User: r_6, g_3, p_3 and g_3^{s_6} mod p_3 with concatenation.
C) New User -> Former User: r_7, g_3^{s_7} mod p_3 and SENC(r_6, Kt_3) with concatenation. (Kt_3 is g_3^{s_6 s_7} mod p_3.)
D) Former User -> New User: SENC(r_7, Kt_3) (after verifying that r_6 is decrypted correctly from SENC(r_6, Kt_3)).
E) New User -> Former User: SENC(CID, Kt_3) (after verifying that r_7 is decrypted correctly from SENC(r_7, Kt_3)).
F) Former User -> New User: SENC(SENC(Meta||Kud_2, Km) || SENC(Kc, Kud_2) || SENC(Cont, Kc), Kt_3)
G) In New User: setup of SENC(Meta||Kud_2, Km), SENC(Kc, Kud_2) and SENC(Cont, Kc).

If the new user wants to use the received Content-File, the user follows the same procedure as in (2) using ID_2nd, and so receives the re-encrypted data SENC(Meta||Ku_2nd, Km) and SENC(Kc, Ku_2nd).
4 DISCUSSION
4.1 Security
In our proposed method, to protect content distributed in this model against illegal use, the following properties are required:

Figure 5. Content Distribution through P2P Connection.

A) Wiretapping impossibility: content obtained through wiretapping cannot be used on any terminal other than a permitted terminal.
B) Illegal distribution process impossibility: even if users or third parties receive content without running the legal process, they cannot use it.
C) Replay attack impossibility: the content cannot be used with license information issued for other content.
D) Privacy protection: entities of the license administration class and third parties cannot know the channel through which a user obtains content.
We discuss these in the following:
A) Even if Content-Files are obtained through wiretapping, a malicious person cannot obtain the content-key, user-key or master-key, so illegal use of the content is prevented.
B) Even if Content-Files are obtained, no one can use the content without the procedures in section 3.3 (2), because of the encryption; illegal use of the content is prevented.
C) A Content-File that a user obtains from a content provider directly, or from a former user over P2P, is encrypted with a temporal user-key for content distribution, so this user cannot decrypt the Content-File with any user-key for another Content-File, because it differs from the temporal user-key.
D) The ID of a user terminal is used in the user-key, and a content-key issue server can know gu_x^{f(ID_1st)} mod pu_x. However, the server cannot obtain the original ID. No information identifying a user is sent except the seed of this user-key in our protocol, so user privacy regarding the channel through which a user obtains content is protected against entities of the license administration class.

4.2 Evaluation of Re-encryption with User-key
In our protocol, the throughput of re-encryption with the user-key (the procedure in section 3.3 (2)) affects the performance. We explain the simulation program built on the proposed system and evaluate the throughput of this re-encryption. The simulation environment is as follows:
Content-key issue server: Pentium 4 2.4 GHz, 1 GB memory, Windows 2000.
User terminal: Pentium 4 2.26 GHz, 1 GB memory, Windows 2000.
Network: 100BASE-TX.
Security algorithms: symmetric encryption AES-128; one-way hash function SHA-1.
We show the throughput of re-encryption in Table 1. The values are averages over ten evaluations.

Table 1. Throughput of re-encryption with user-key.
Generating the user-key seeds each time: 1.93 [sec]
Preparing the user-key seeds in advance: 0.11 [sec]

In this simulation, if a user and a content-key issue server generate the user-key seeds each time a user starts the procedure in section 3.3 (2), the re-encryption throughput is 1.93 seconds. Considering this result, we think this throughput is fast enough for practical use. Furthermore, if a user and a content-key issue server prepare the user-key seeds by generating them in advance, the re-encryption is about 10 times faster than with per-session generation.
In a general content distribution system, intensive key issuance is expected to reduce throughput when many sessions are connected. In our method, however, content-key issue servers store only the master-key and do not need to administer user-keys or content-keys, so additional content-key issue servers can easily be established according to the scale of the user base. Therefore, this method can share key issuance among several content-key issue servers and limit the performance loss.
Given this evaluation, we are confident that the proposed method is practical.

5 CONCLUSION
We proposed a new content distribution method over P2P networks. In this method, entities of the license administration class do not need to administer user-keys and content-keys, so we think our method can distribute a large amount of content over P2P. Furthermore, content is encrypted with a user-key as a Content-File, so a user needs to have the Content-File re-encrypted with his/her own user-key before using the content. Therefore, illegal content distribution is denied, and content providers can securely distribute content over P2P.
Furthermore, we made a simulation program and evaluated the throughput of re-encryption with the user-key. As a result, we confirmed that our method is realistic for content distribution over P2P.
In the future, we will examine the possibility of realizing the entities of the license administration class over P2P. Furthermore, we will build a content distribution system over P2P and confirm that the proposed system is of practical use. For the purpose of simple implementation, we will also revise our program.

REFERENCES
1. iTunes Store, http://www.apple.com/jp/itunes/store/
2. YouTube, http://www.youtube.com/
3. BitTorrent, http://www.bittorrent.com/
4. SkeedCast, http://www.skeedtools.com/
5. Mori, R., Kawahara, M.: Superdistribution: The Concept and the Architecture. IEICE Trans., Vol. E73, No. 7, pp. 1133-1146 (1990).
6. Mambo, M., Okamoto, E.: Proxy Cryptosystems: Delegation of the Power to Decrypt Ciphertexts. IEICE Trans. Fundamentals of Electronics, Communications and Computer Sciences, Vol. E80-A, No. 1, pp. 54-63 (1997).
7. Blaze, M., Bleumer, G., Strauss, M.: Divertible Protocols and Atomic Proxy Cryptography. Advances in Cryptology - EUROCRYPT '98, LNCS 1403, pp. 127-144, Springer-Verlag (1998).
8. Jakobsson, M.: On Quorum Controlled Asymmetric Proxy Re-encryption. Public Key Cryptography - PKC '99, LNCS 1560, pp. 112-121, Springer-Verlag (1999).
9. Nishikawa, H., Miyaji, A., Soshi, M., Omote, T.: A Secure and Flexible Digital Contents Building System. Proc. International Symposium on Information Theory and Applications - ISITA 2002, S6-1-4, pp. 223-226 (2002).
10. Cheng, S., Rambhia, A.: DRM and Standardization - Can DRM Be Standardized?. Digital Rights Management - Technological, Economic, Legal and Political Aspects, LNCS 2770, pp. 162-177, Springer-Verlag (2003).
11. Watanabe, Y., Masayuki, N.: Access Control for Encrypted Data in P2P Data Sharing. IPSJ Journal, Vol. 44, No. 10, pp. 2437-2443 (2003).
12. Josephson, W. K., Sirer, E. G., Schneider, F. B.: Peer-to-Peer Authentication with a Distributed Single Sign-On Service. Peer-to-Peer Systems III - IPTPS 2004, LNCS 3279, pp. 250-258, Springer-Verlag (2004).
13. Serrao, C., Dias, M. S., Delgado, J.: Digital Object Rights Management - Interoperable Client-side DRM Middleware. Security and Cryptography - SECRYPT 2006, pp. 229-236, INSTICC (2006).
14. Serrao, C., Dias, M. S., Delgado, J.: Secure License Management - Management of Digital Object Licenses in a DRM Environment. Security and Cryptography - SECRYPT 2007, pp. 251-256, INSTICC (2007).
15. Inamura, M., Tanaka, T.: Implementation and Evaluation of New Illegal Copy Protection - Protection Against Making an Illegal Copy of a Copy. Security and Cryptography - SECRYPT 2007, pp. 427-432, INSTICC (2007).

International Journal of Digital Information and Wireless Communications (IJDIWC) 4(2): 213-225
The Society of Digital Information and Wireless Communications, 2014 (ISSN: 2225-658X)

Robust Nonlinear Composite Adaptive Control of Quadrotor

Bara J. Emran1 and Aydin Yesildirek2
1 With the Mechatronics Graduate Program
2 Associate Professor with the Department of Electrical Engineering
1,2 American University of Sharjah, UAE
1 Bara.Emran@yahoo.com, 2 Ayesildirek@aus.edu

ABSTRACT
A robust nonlinear composite adaptive control algorithm is developed for a 6-DOF quadrotor system. The system is considered to suffer from parametric uncertainty and measurement noise. The under-actuated system is split into two subsystems using dynamic inversion: a sliding mode control handles the internal dynamics, while an adaptive control handles the fully actuated subsystem. All the plant parameters, such as mass, system inertia, and thrust and drag factors, are considered fully unknown and time-varying. The composite adaptive control is driven using the information from two errors: the tracking error and the prediction error. The adaptive control is further enhanced with a robust technique to reject the noise. The stability of the closed-loop system is derived in the flight region of interest. The performance of the designed controller is illustrated by tracking a desired position, velocity, acceleration and heading angle of the quadrotor despite the fully unknown parameters and noisy measurements.
KEYWORDS
Nonlinear Quadrotor Control, Under-Actuated System, Composite Adaptive Control, Unknown Parameters, Robust Control.
1 INTRODUCTION
Interest in developing control algorithms for quadrotors has grown considerably of late, because of their low cost, high maneuverability, and vertical take-off and landing capability, which make them very popular as research platforms, especially for indoor applications [1]-[5].

However, quadrotors, like many other dynamic systems to be controlled, have unknown or slowly varying parameters. This can make such systems unstable and harder to control. As an example, firefighting aircraft suffer considerable mass changes as they load or unload large quantities of water.
Adaptive control is an excellent candidate for this type of system because of its capability of tracking a desired output signal in the presence of parametric uncertainties. If the plant parameters are exactly known, the controller should make the plant output match that of the reference model; if they are not known, the adaptive law adjusts the controller parameters to achieve acceptable convergence to the desired output tracking. It is important to mention that convergence of the parameters to their exact values depends on the richness of the input signal. In this paper, a so-called model reference adaptive control (MRAC) method is used.
Another main problem related to parametric uncertainties is parameter drift, which is mainly caused by measurement noise. Since the adaptive law uses the measured output signal as information to adjust the control parameters, the presence of a noise signal affects the adaptation mechanism. Thus, the adaptive control has been enhanced to make it more robust against the presence of the noise signal.
Different parameters can be estimated using an adaptive control scheme. Several proposed methods estimate only the mass, while others estimate the mass and the inertia matrix; very few works take further parameters into account. For mass estimation, a backstepping approach is used in [6], while an enhancement using adaptive integral backstepping is presented in [7]. An adaptive robust control is used in [8], and a model reference adaptive control in [9]. For mass and inertia estimation, a comparison between model reference adaptive control and model identification adaptive control is presented in [10]. Moreover, Lyapunov-based robust adaptive control is used in [11], [12] and [13], and a composite adaptive controller is shown in [14]. Furthermore, estimation of extra parameters such as aerodynamic coefficients is shown in [15] using Lyapunov-based robust adaptive control, in [16] using adaptive sliding mode control, and in [17] using an adaptive integral backstepping method.
The quadrotor is an under-actuated system: it has more degrees of freedom than actuators. Thus, the introduced control method divides the whole system into two subsystems. The first subsystem controls the internal dynamics using sliding mode control. The second subsystem, which is fully actuated, is controlled using a robust adaptive control, to remove the effects of the parameter uncertainty and to reject the noise while tracking the desired output.
This paper is organized as follows. In Section 2, the problem statement is defined. The dynamic equations of the quadrotor model and the reference frames are introduced in Section 3. The proposed adaptive control scheme is described in Section 4, followed by the enhancement of the adaptive law using a robust technique in Section 5. Finally, the proposed control is validated by simulation in the last section.
2 PROBLEM DEFINITION
The quadrotor, like many other systems, suffers from parametric uncertainty: parameters are either totally unknown or vary with time. In addition, the presence of measurement noise adds extra difficulty to controlling the system. There are different ways to control such systems, such as robust control and adaptive control. Robust control ensures that the closed-loop control system remains stable in the presence of disturbances, while adaptive control deals with parametric uncertainty without any prior information about the parameters.
In this paper, we aim at nonlinear control of a 6-DOF quadrotor to follow a desired position, velocity, acceleration and heading angle despite the parameter uncertainty and measurement noise. The main plant parameters, such as mass, system inertia, and thrust and drag factors, are considered fully unknown and time-varying. The quadrotor is an under-actuated system; thus, to control it in 6 DOF, a dynamic inversion technique is used to split the system into two subsystems.
3 QUADROTOR MODEL
The quadrotor has 6 DOF and four actuators placed in a cross configuration. A symmetrical design of the quadrotor allows centralization of the control systems and the payload. Each of the four rotors is connected to a propeller, and all the propeller axes of rotation are parallel to each other. All propellers have fixed-pitch blades, and their airflow points downwards to produce an upward lift. The left and right propellers rotate clockwise, while the front and rear ones rotate counter-clockwise. Using opposite pairs of rotation directions balances the quadrotor and removes the need for a tail rotor. Consequently, the movements of the quadrotor are directly related to the propeller velocities.
3.1 Reference Frames:
Two main reference frames are defined to state the motion of a 6-DOF rigid body:
1) Earth inertial reference (E-frame)
2) Body-fixed reference (B-frame)
The earth-frame is defined as an inertial right-hand reference denoted by (oE, xE, yE, zE). Using this frame, the linear position (Gamma [m]) of the quadrotor and the Euler angles (Theta [rad]) are defined:

\Gamma = [x \; y \; z]^T \qquad (1)
\Theta = [\phi \; \theta \; \psi]^T \qquad (2)

where x, y and z represent the position of the center of gravity of the quadrotor in the E-frame, while phi, theta and psi represent the Euler angles in the E-frame, denoted roll, pitch and yaw respectively.
The body-frame is a right-hand reference denoted by (oB, xB, yB, zB) and is attached to the body of the quadrotor. The torques (tau^B [Nm]) and the forces (F^B [N]) are defined in this frame (see Figure 1):

F^B = [0 \; 0 \; U_1]^T, \qquad \tau^B = [U_2 \; U_3 \; U_4]^T \qquad (3)

where the inputs are defined as:

U_1 = b(\Omega_1^2 + \Omega_2^2 + \Omega_3^2 + \Omega_4^2), \quad U_2 = b\,l(\Omega_4^2 - \Omega_2^2), \quad U_3 = b\,l(\Omega_3^2 - \Omega_1^2), \quad U_4 = d(\Omega_1^2 - \Omega_2^2 + \Omega_3^2 - \Omega_4^2) \qquad (4)
Figure 1: Quadrotor Reference Frames

To map the orientation of a vector from the B-frame to the E-frame and vice versa, a rotation matrix is needed; it is described as follows:

R =
\begin{bmatrix}
c_\psi c_\theta & c_\psi s_\theta s_\phi - s_\psi c_\phi & c_\psi s_\theta c_\phi + s_\psi s_\phi \\
s_\psi c_\theta & s_\psi s_\theta s_\phi + c_\psi c_\phi & s_\psi s_\theta c_\phi - c_\psi s_\phi \\
-s_\theta & c_\theta s_\phi & c_\theta c_\phi
\end{bmatrix}

where c_x means cos(x) and s_x means sin(x).
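A small Python sketch of this rotation matrix, useful for checking the orthogonality property R R^T = I exploited throughout the derivation (the angle values are illustrative):

import numpy as np

def rotation_matrix(phi: float, theta: float, psi: float) -> np.ndarray:
    # ZYX Euler rotation from B-frame to E-frame (c = cos, s = sin).
    c, s = np.cos, np.sin
    return np.array([
        [c(psi)*c(theta), c(psi)*s(theta)*s(phi) - s(psi)*c(phi), c(psi)*s(theta)*c(phi) + s(psi)*s(phi)],
        [s(psi)*c(theta), s(psi)*s(theta)*s(phi) + c(psi)*c(phi), s(psi)*s(theta)*c(phi) - c(psi)*s(phi)],
        [-s(theta),       c(theta)*s(phi),                        c(theta)*c(phi)],
    ])

R = rotation_matrix(0.1, 0.05, 0.3)
assert np.allclose(R @ R.T, np.eye(3))   # rotation matrices are orthogonal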

3.2 Dynamic Equation:
Using the rotation matrix to map the forces and torques from the body frame to the earth frame, and using the Euler-Lagrange approach, the dynamic equation of the quadrotor is derived as:

\ddot{x} = (\cos\phi \sin\theta \cos\psi + \sin\phi \sin\psi)\, U_1/m
\ddot{y} = (\cos\phi \sin\theta \sin\psi - \sin\phi \cos\psi)\, U_1/m
\ddot{z} = -g + (\cos\phi \cos\theta)\, U_1/m
\ddot{\phi} = [(I_{yy} - I_{zz})\dot{\theta}\dot{\psi} - J_{tp}\dot{\theta}\Omega + U_2]/I_{xx}
\ddot{\theta} = [(I_{zz} - I_{xx})\dot{\phi}\dot{\psi} + J_{tp}\dot{\phi}\Omega + U_3]/I_{yy}
\ddot{\psi} = [(I_{xx} - I_{yy})\dot{\phi}\dot{\theta} + U_4]/I_{zz}

where I_{xx}, I_{yy}, I_{zz} are the quadrotor inertias around the x, y and z axes respectively in [Nms2], m is the quadrotor mass in [kg], Omega is the overall propeller speed in [rads-1], J_{tp} is the total rotational moment of inertia around the propeller axis in [Nms2], b is the thrust factor in [Ns2], d is the drag factor in [Nms2], l is the distance between the center of the quadrotor and the center of a propeller in [m], [U1, U2, U3, U4] are the inputs of the quadrotor representing the collective force and the roll, pitch and yaw torques respectively, and Omega_i is the speed of the ith motor in [rads-1].
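Since equation (4) is only partially recoverable from the extraction, the following Python sketch uses the common cross-configuration mapping from propeller speeds to the inputs [U1, U2, U3, U4]; the numeric values of b, d and l are placeholders, not the paper's values:

import numpy as np

def control_inputs(omega: np.ndarray, b: float, d: float, l: float) -> np.ndarray:
    # Map the four propeller speeds to [U1, U2, U3, U4] =
    # [collective thrust, roll torque, pitch torque, yaw torque]
    # under the common cross-configuration parameterization.
    w2 = omega**2                             # omega = [front, right, rear, left]
    U1 = b * w2.sum()                         # total lift
    U2 = b * l * (w2[3] - w2[1])              # roll: left vs. right rotor
    U3 = b * l * (w2[2] - w2[0])              # pitch: rear vs. front rotor
    U4 = d * (w2[0] - w2[1] + w2[2] - w2[3])  # yaw: drag imbalance
    return np.array([U1, U2, U3, U4])

U = control_inputs(np.array([400.0, 410.0, 395.0, 405.0]), b=3.1e-5, d=7.5e-7, l=0.23)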
4 CONTROL SCHEME
Because of the under-actuation, reaching an arbitrary desired set-point in space is not directly possible for the quadrotor. Thus, to achieve tracking control of the desired command (x_d, y_d, z_d, psi_d), a dynamic inversion method is used to divide the system into two subsystems [18].
The internal dynamics that result from the feedback linearization are considered the first subsystem, given by:

\ddot{x} = (\cos\phi \sin\theta \cos\psi + \sin\phi \sin\psi)\, U_1/m, \qquad \ddot{y} = (\cos\phi \sin\theta \sin\psi - \sin\phi \cos\psi)\, U_1/m \qquad (5)

A sliding mode control is used to control the first subsystem and to generate the command signals (phi_d, theta_d). The second subsystem is considered a fully actuated system and is defined as:   (6)

An adaptive control is developed to control the second subsystem and to overcome the unknown parameters; it is designed to achieve attitude and altitude control of the quadrotor (z, phi, theta, psi). Furthermore, the adaptive control is improved using robust techniques, introduced to improve the parameter estimation and the noise rejection. The block diagram in Figure 2 shows the overall control scheme.

Figure 2: Control Scheme Block Diagram

4.1 Sliding Mode Control for Internal Dynamics:
To guarantee the stability of the dynamic inversion technique, it is essential to stabilize the internal dynamics of the system. As a result, proper commands phi_d and theta_d for the roll and pitch signals are selected such that tracking control of x and y is achieved; the internal dynamics are then guaranteed to be stable. The block diagram of the sliding control is shown in Figure 3.

Figure 3: Sliding Control Block Diagram for the Internal Dynamics

The states of the first subsystem are defined as:   (7)

Let us rewrite the internal dynamics equation (5) as:   (8)

where the yaw angle (psi) and the roll angle (phi) in the matrix are the current angles of the system. It is important to note that this matrix is invertible in the region of interest:

-\pi/2 < \phi, \theta < \pi/2 \qquad (9)

4.1.1 Reference Model:
The desired trajectory for the first subsystem is defined using the following reference model in state-space form:


(10)

where x_m represents the command inputs of the first subsystem and the reference model defines the desired system performance.

4.1.2 Tracking and Sliding Mode Error:
Let the tracking error on the XY-plane be calculated as:   (11)

and the associated sliding mode error be defined as:   (12)

where the design constants are positive and represent a stable Hurwitz polynomial.

4.1.3 Control Law:
A relation between the input force (U_1) and the gravitational acceleration (g) can be made using the dynamic equation to simplify the control law; it follows from the dynamic equation of the z-axis:   (13)

When the system moves in the XY-plane, the acceleration along the z-axis equals zero:   (14)

Thus the control law is defined in the form:   (15)

where the constants are positive design values and represent a stable Hurwitz polynomial. Substituting the control law into (5) yields:   (16), (17)

Therefore, this gives exponential convergence of the sliding mode error, which guarantees the convergence of the XY-plane tracking error.
4.2 Composite Adaptive Controller
Adaptive control of nonlinear systems has been well studied in [19]-[21]. Here, adaptive control is applied to the second subsystem to guarantee the convergence of the attitude and altitude tracking (z, phi, theta, psi). In other words, the objective of using the adaptive control is to make the output asymptotically track the desired output despite the presence of parametric uncertainty.
The composite adaptive law uses information from both the tracking error and the prediction error to extract and estimate the parameters [22]. The block diagram of the composite adaptive control is shown in Figure 4.
4.3 Parameterization:
Let us define the states for the second subsystem as:   (18)


Figure 4: Composite Adaptive Control Block Diagram

By rewriting (6) in linear-in-parameter form, we obtain:   (19)

4.3.1 Reference Model:
Similar to the previous subsystem, let the desired trajectory for the second subsystem be defined using the following reference model in state-space form:   (20)

where x_m represents the command inputs of the second subsystem and the model defines the desired system performance.

4.3.2 Tracking and Sliding Mode Error:
Let the tracking error for the second subsystem be calculated as:   (21)

The associated sliding error is defined as:   (22)

where the gain is a positive definite diagonal matrix defined as:   (23)

whose entries are positive design constants, representing a stable Hurwitz polynomial.

4.3.3 Prediction Error:
The prediction error is defined as the difference between the actual input and the estimated control:   (24)

where the parameter error is defined by:   (25)

Still, the presence of the immeasurable acceleration prevents us from using this definition of the prediction error directly for the parameter estimation. Thus, to avoid the appearance of the acceleration, a first-order filter is used as follows:   (26)

Now, let us define a filtered prediction error as:   (27)

where the filtered version of the actual input is defined, with a stable filter, as:   (28), (29)

4.3.4 Control Law:
Let us define the control law as:   (30)

where the gains are positive design control gains. By substituting the control law into (6) and using the following simplification:   (31), (32)

the substitution of the control law into (6) becomes:   (33)

Therefore, the sliding error converges exponentially to a region defined by the parameter error.

4.3.5 Adaptation Law:
To design a suitable adaptation law, let us define the Lyapunov function as:   (34)

where the adaptation gain matrices are symmetric positive definite. The derivative of the Lyapunov function can be calculated as:   (35)

Using equations (17) and (33) yields:   (36)

Assuming constant or slowly time-varying parameters:   (37)

Let us choose the composite adaptation law as:   (38)

Thus, the derivative of the Lyapunov function (36) becomes:   (39)

Since the adaptation gain matrices are positive definite diagonal matrices, the above expression guarantees global stability and global tracking convergence of both the sliding control and the adaptive control systems. By using Barbalat's lemma:   (40), (41)
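A sketch of one integration step of a composite adaptation law of this type, assuming the generic form theta_dot = -P (Y^T s + W^T e1) with tracking regressor Y, filtered prediction regressor W, sliding error s and filtered prediction error e1; gains and dimensions are illustrative, not the paper's:

import numpy as np

def composite_update(theta_hat, Y, s, W, e1, P, dt):
    # One Euler step of a composite adaptation law: the estimate is
    # driven by both the tracking (sliding) error s and the filtered
    # prediction error e1, so it inherits the fast convergence of
    # tracking-error adaptation and the smoothness of prediction-error
    # adaptation.
    theta_dot = -P @ (Y.T @ s + W.T @ e1)
    return theta_hat + dt * theta_dot

theta_hat = np.zeros(3)
Y = np.random.randn(2, 3)   # regressor from the tracking loop
W = np.random.randn(2, 3)   # filtered regressor from the prediction loop
theta_hat = composite_update(theta_hat, Y, np.array([0.1, -0.2]),
                             W, np.array([0.05, 0.0]), P=0.5*np.eye(3), dt=0.01)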

5 PARAMETER DRIFT
One of the major problems associated with parametric uncertainties is parameter drift, which is mainly caused by measurement noise. To explain the parameter drift problem, consider a constant reference input. A constant reference input contains insufficient parameter information, so the parameter adaptation mechanism has difficulty distinguishing parameter information from noise. The parameters then drift in a direction along which the tracking error remains small. In addition, the whole system can become unstable when an estimated parameter drifts to a point where the closed-loop poles enter the right half of the complex plane.
In the presence of a measurement noise signal, the functions defined in (19) and (25) can be rewritten with the noise terms included. Taking the first part of the adaptation law (38), it then splits into three terms: the first term contains the parameter information, the second term tends to average out, but the third term causes parameter drift, which makes the parameter estimates drift away from their true values.
5.1 Reducing Parameter Drift:
Parameter drift can be reduced by replacing the functions that contain the noise with the signals from the desired model, which are independent of the noise. This replacement must be done after the tracking error and the prediction error have converged well and their values have been reduced within a certain bound. The adaptation law defined in (38) is therefore adjusted according to the error values: for both the tracking error and the prediction error, once the error magnitude falls below its threshold, the noisy signals in the corresponding term are replaced by the noise-free model signals.
SIMULATION RESULTS
A simulation platform has been developed using the Simulink software. The platform has been used to validate the proposed control algorithm and to illustrate the difference in behavior between the robust and the non-robust composite adaptive control. Figure 5 shows the Simulink platform used.

In the simulation, all the plant parameters are considered to vary with time; they are generated using a square-wave function with a common duty cycle of 50% and a common frequency, while each parameter has a different amplitude and offset, as shown in Table 1.

Table 1: Plant Parameters
Offset: 1.5, 0.01, 0.0125, 0.02, 0.1, 0.002
Amplitude: 1, 0.005, 0.005, 0.01, 0.09, 0.001

The command signals (x_d, y_d, z_d, psi_d) are shown in Table 2. They are generated using a square-wave function with a common duty cycle of 50%, an amplitude of one and zero offset, but with different frequencies.

Table 2: Command Signal
Frequency in Hz: 0.015, 0.01, 0.012, 0.009
Offset: 0, 0, 0, 0

Table 3 lists the control parameters, which have been selected to satisfy the desired performance.

Table 3: Control Parameters

Table 4 shows the mean square error (MSE) of the parameter estimation, where the robust adaptive control achieves the lower MSE in all parameters.

Uniformly distributed random signals representing the noise have been added to the states of the system, generated by Simulink blocks with zero mean and different seeds. The noise represents the error in the sensor measurements. The following figures show the results of the simulation test, where the robust composite adaptive control rejects the noise better than the non-robust one. In the figures, the blue line represents the robust composite adaptive control and the red one the non-robust control.


Figure 5: Simulation Platform

In Figure 6, the tracking error of both schemes is illustrated. The robust control, represented by the blue line, has the smaller tracking error.

Figure 6: Tracking Error

In Figure 7, the parameter estimation is plotted. The thick black line represents the true parameter values. The robust control has the better ability to estimate the actual parameters.

Figure 7: Parameter Estimation

Figure 8 shows that the robust control rejects the noise effectively.

Figure 8: Parameters Error

Table 4: Parameter Estimation Mean Square Error (MSE)
Robust: 0.10981, 0.00367, 0.01143, 22.8098, 0.06901, 2.51034, 0.10981
Non-robust: 0.036170, 2.539e-05, 8.021e-05, 0.271159, 0.001622, 0.094554, 0.036170

CONCLUSION
A robust nonlinear composite adaptive control algorithm has been developed for a 6-DOF quadrotor system. The proposed controller forces the quadrotor to follow a desired position, velocity, acceleration and heading angle despite the parameter uncertainty and noisy signals.
The composite adaptive controller is driven using information from both the tracking error and the prediction error, and has been enhanced with a robust technique to reject the noise. The stability of the closed-loop system is shown in the flight region of interest, and comparisons between both schemes are illustrated.
REFERENCES
[1] Puri, A. "A Survey of Unmanned Aerial Vehicles (UAV) for Traffic Surveillance." Technical Report, Tampa, 2005.
[2] Li, B., Mu, C. and Wu, B. "A Survey of Vision Based Autonomous Aerial Refueling for Unmanned Aerial Vehicles." Third International Conference on Intelligent Control and Information Processing. Dalian, China: IEEE, 2012.
[3] Saggiani, G. M., and Teodorani. "Rotary wing UAV potential applications: an analytical study through a matrix method", Aircraft Engineering and Aerospace Technology, Vol. 76, Iss. 1, pp. 6-14, 2004.
[4] Chowdhary, G., Sobers, M., Pravitra, C.,
Christmann, C., Wu, A., Hashimoto, H., Ong,

C., Alghatgi, R. and Johnson, E. Integrated


Guidance Navigation and Control for a Fully
Autonomous Indoor UAS. AIAA Guidance
Navigation and Control Conference, Portland,
OR, August, 2011.
[5] Stevens, B. L. and Lewis, F. L. "Aircraft
Control and Simulation. " Hoboken, New
Jersey: John Wiley & Sons, Inc., 2nd ed.,
2003.
[6] Huang, M., Xian, B., Diao, C., Yang, K. and
Feng Y., "Adaptive Tracking Control Of
Under-actuated Quadrotor Unmanned Aerial
Vehicles Via Backstepping", In Proc.
American Control Conference, Baltimore,
USA, 2010., pp. 2076.
[7] Fang, Z. and Gao, W. 2011, "Adaptive integral
backstepping control of a Micro-Quadrotor",
IEEE, pp. 910.
[8] Min, B.-C., Hong, J.-H., and Matson, E.,
Adaptive Robust Control (ARC) For An
Altitude Control Of A Quadrotor Type UAV
Carrying An Unknown Payloads, in 2011
11th International Conference on Control,
Automation and Systems (ICCAS), Oct. 2011,
pp. 1147 1151.
[9] Mohammadi, M. and Shahri, A. M., "Decentralized adaptive stabilization control for a quadrotor UAV," Robotics and Mechatronics (ICRoM), 2013 First RSI/ISM International Conference on, pp. 288-292, 13-15 Feb. 2013.
[10] Schreier, M. 2012, "Modeling and adaptive
control of a quadrotor", IEEE, , pp. 383.
[11] Fernando, T., Chandiramani, J., Lee, T. &
Gutierrez, H. 2011, "Robust adaptive
geometric tracking controls on SO(3) with an
application to the attitude dynamics of a
quadrotor UAV", IEEE, , pp. 7380.
[12] Imran Rashid, M. and Akhtar, S. 2012,
"Adaptive control of a quadrotor with
unknown model parameters", IEEE, pp. 8.
[13] Diao, C., Xian, B., Yin, Q., Zeng, W., Li, H.
and Yang, Y., 2011, "A nonlinear adaptive
control approach for quadrotor UAVs", IEEE,
pp. 223.
[14] Dydek, Z. T., Annaswamy, A. M. and Lavretsky, E. 2013, "Adaptive Control of Quadrotor UAVs: A Design Trade Study With Flight Evaluations", IEEE Transactions on Control Systems Technology, vol. 21, no. 4, pp. 1400-1406.
[15] Bialy, B.J., Klotz, J., Brink, K. & Dixon,
W.E. 2013, "Lyapunov-based robust adaptive
control of a quadrotor UAV in the presence of
modeling uncertainties", IEEE, , pp. 13.
[16] Bouadi, H., Simoes Cunha, S., Drouin, A. &
Mora-Camino, F. 2011, "Adaptive sliding
mode control for quadrotor attitude
stabilization and altitude tracking", IEEE, , pp.
449.
[17] Lee, D., Nataraj, C., Burg, T.C. and
Dawson, D.M. 2011, "Adaptive tracking
control of an underactuated aerial vehicle",
IEEE, , pp. 2326.
[18] Das, A., Lewis, F. L., and Subbarao, K., "Sliding Mode Approach to Control Quadrotor Using Dynamic Inversion", in Challenges and Paradigms in Applied Robust Control, ISBN: 978-953-307-338-5, InTech, DOI: 10.5772/16599. Available from: http://www.intechopen.com/books/challenges-and-paradigms-in-applied-robust-control/sliding-mode-approach-to-control-quadrotor-using-dynamic-inversion
[19] Sastry, S. S. and Isidori, A., "Adaptive Control of Linearizable Systems", IEEE Transactions on Automatic Control, 1989.
[20] Slotine, J.-J. E. and Coetsee, J. A., "Adaptive Sliding Controller Synthesis for Nonlinear Systems", Int. J. Control, 43(4), 1986.
[21] Slotine, J.-J. E. and Li, W. Applied Nonlinear Control. Englewood Cliffs, N.J.: Prentice Hall, 1991.
[22] Emran, B. J. and Yesildirek, A. "Nonlinear Composite Adaptive Control for Quadrotor." The International Conference on Electrical and Electronics Engineering, Clean Energy and Green Computing. UAE: SDIWC, pp. 220-231, 2013.

International Journal of Digital Information and Wireless Communications (IJDIWC) 4(2): 226-235
The Society of Digital Information and Wireless Communications, 2014 (ISSN: 2225-658X)

A New Orthogonal Cryptographic System for Database Security Based on Cellular Automata and Hash Algorithm

Dr. Mohammad V. Malakooti1, Ebrahim Akhavan Bazofti2
1 Faculty and Head of Department of Computer Engineering, Islamic Azad University (IAU), UAE Branch, Dubai, UAE
2 Department of Computer Engineering, Islamic Azad University (IAU), UAE Branch, Dubai, UAE
1 malakooti@iau.ae, 2 e_bazofti@yahoo.com

Abstract- In this paper, we have developed a new orthogonal cryptographic system for database security that uses both cellular automata and a hash algorithm. Our algorithm consists of two parts: encryption/decryption of database tables, and generation of an authentication tag that activates an attack alarm when a database table in protected mode is accessed by illegal users.
Our proposed orthogonal cryptosystem is a symmetric algorithm and uses a common key for both the encryption and decryption processes, as opposed to an asymmetric one that requires two keys, a private and a public key. Since our transformation matrix is orthogonal, we use the property of orthogonal matrices to calculate the inverse from the matrix transpose rather than by direct matrix inversion, saving calculation time during the decryption process.
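The time saving follows because the inverse of an orthogonal matrix is its transpose; a small numpy sketch with a random orthogonal key matrix (illustrative only, not the Malakooti Transform itself):

import numpy as np

Q, _ = np.linalg.qr(np.random.randn(4, 4))    # a random orthogonal key matrix

plain  = np.array([[ord(c) for c in "DATA"]], dtype=float)  # ASCII codes of a record
cipher = plain @ Q                            # encrypt: multiply by the key matrix

recovered = cipher @ Q.T                      # decrypt with Q^T instead of inv(Q)
assert np.allclose(recovered, plain)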
We also generate secret keys by applying the internal rules of cellular automata to the Malakooti Transform (M-T) to obtain a secret key matrix, which is multiplied with the matrix of ASCII codes obtained from the records of the database. To apply another level of security to the resulting encrypted code, the hash values obtained from each record are multiplied by the elements of the secret key matrix, and the XOR operation is performed on the resulting values and the elements of the encrypted codes.
In addition, we propose a robust and fast algorithm for database security and authentication that automatically and accurately generates the hash values over all rows of the database tables to obtain a unique hash value for each table. This unique hash value can be used to check the validity of the data inside the database and to guarantee the authentication of all information in each database.

Our proposed method is capable of detecting any slight


change that might be occurs on the database while it is in
the protected mode. The generated Hash value will be
calculated from the records elements of the database
periodically to be compared with the value of the Hash
value stored outside database for the authentication.
Should the generated Hash value be different from the
stored Hash value, the alarm flag would be activated to
inform the administrator about unauthorized change of
database while in protected mode via SMS or Email.
Keywords: Cryptosystem, Authentication, database, Tag,
Malakooti Transform, Hash Value, Cellular Automata,
Decryption, Encryption.

I. INTRODUCTION
Database security is the most important issue in cloud computing, information sharing, and processing, where many users are assigned to share the same information stored on distributed networks or database servers. Although there are many security checks and authentication techniques to identify users before they are allowed to access the data centers or database servers, unauthorized users and hackers can always find a way to bypass these security checks and illegally access the database. Multilayer encryption is one of the best ways to secure the contents of database tables and prevent illegal users from reaching the original data even if they have penetrated to the core of the data centers.
There are several different encryption techniques that have been used for database security, but none of them has used three levels of security as we have applied. In addition, we have applied another algorithm, called the
Malakooti-Bazofti Hash algorithm, to generate a unique Tag or Hash pattern for the entire database content, to be compared with the previously stored Tag. If the new Tag is different from the previous Tag, the system activates the attack alarm and informs the administrator via SMS or Email.
The idea of securing databases using a cryptographic system is not new, and several researchers have proposed techniques for database encryption. Gudes, Koch and Stahl [1] presented a method for database encryption based on substitution, transposition, reduction, and expansion of data items that preserves data structures. Davida, Wells and Kam [2] proposed an encryption technique based on the Chinese Remainder Theorem, in which algebraic operations are performed on the content of the database and each record is encrypted by applying some mathematical operations. The encrypted data can also be decrypted by the inverse operation using the Chinese Remainder Theorem. Several other researchers, including [3-6], have used the Data Encryption Standard (DES) and the RSA algorithm to perform encryption and decryption based on public and private keys, as well as authentication based on checksums of the data elements, where the checksums are obtained from the identifiers and the database key.
Our proposed method for encrypting the database contents is based on the Malakooti Orthogonal Transform, whose orthogonality allows the transformation matrix to be inverted by a matrix transpose operation instead of direct calculation of the matrix inverse during the decryption process. To increase the level of security of the database contents, we have obtained the matrix of secret keys by applying the cellular automata rule to the elements of the M-T matrix. Once the secret key matrix is calculated, it is multiplied with the matrix of coded data elements, and finally the XOR operation is applied to the resulting elements and the Hash values derived from the Hash function applied to the corresponding database records.
In addition to applying three levels of security to the database, based on the M-T transform, cellular automata, and the Hash algorithm, we have generated a unique Hash pattern, or Authentication Tag, based on the Malakooti-Bazofti (M-B) Hash Algorithm, in which the generated Tag can be compared with the stored Tag; any difference between these two Tags will activate the attack alarm to inform the administrator via SMS or Email. The proposed Attack Alarm Tag makes our cryptographic system unique, robust, reliable, and fully alert, so that it can be applied to highly secure databases used over distributed systems, cloud computing environments, and the internet.
II. MALAKOOTI TRANSFORM ALGORITHM
The Malakooti Transform, M-T, is an orthogonal transform, similar to the Hadamard Transform, that was developed by Mohammad V. Malakooti in 1987 for data compression, encryption, and watermarking. Once this transform is applied to a data matrix, the resulting coefficients contain useful information about the spectral characteristics of the underlying data matrix and can be used for data transmission, encryption, and compression. Many database elements are highly redundant, and this transform can be applied to reduce the redundancy and increase the data storage capacity. An optimal selection of M-T coefficients can be used to reconstruct or represent the desired database elements with fewer coefficients, resulting in savings in transmission time and memory [11].
III. GENERATION OF M-T MATRIX
Assume that the initial value of the first-order M-T matrix, M0, is equal to one; thus

    M0 = 1,                                        (3.1)

and the elements of the second-order M-T matrix, M1, are formed according to the following equation:

    M1 = [  a   b ]
         [ -b   a ],                               (3.2)

where a and b are two constant parameters that change the content of the M-T matrices.
The matrix M1 is a 2 x 2 anti-symmetric unitary matrix:

    M1 M1^t = c I,                                 (3.3)

where the matrix I is a 2 x 2 identity matrix and the constant parameter c is equal to the determinant of M1. Thus,

    c = det(M1) = a^2 + b^2.                       (3.4)

Thus, its inverse is given as

    M1^(-1) = M1^t / c.                            (3.5)

Similarly, the fourth-order M-T matrix, M2, can be obtained according to (3.6):

    M2 = [  a M1   b M1 ]
         [ -b M1   a M1 ].                         (3.6)

The matrix M2 is a 4 x 4 anti-symmetric unitary matrix:

    M2 M2^t = c^2 I,                               (3.7)

where the matrix I is a 4 x 4 identity matrix, c is given in (3.4), and the inverse of M2 is calculated according to

    M2^(-1) = M2^t / c^2.                          (3.8)

Without loss of generality, the k-th order M-T matrix, Mk, can be obtained from

    Mk = [  a Mk-1   b Mk-1 ]
         [ -b Mk-1   a Mk-1 ],                     (3.9)

and its inverse is given according to (3.10):

    Mk^(-1) = Mk^t / c^k.                          (3.10)

Using the Kronecker product notation, where the symbol (x) denotes the Kronecker product and M1^((x)2) is the Kronecker power 2 of M1,

    M2 = M1 (x) M1 = M1^((x)2).                    (3.11)-(3.13)

Similarly,

    M3 = M1 (x) M2 = M1^((x)3),
    M4 = M1 (x) M3 = M1^((x)4),
    ...                                            (3.14)-(3.15)

We can easily generate the elements of the M-T matrices by assuming that a = 1, b = 2, and expand the idea to get M-T matrices of size 2, 4, 8, 16, ... recursively, as follows:

    M1 = [  1   2 ]
         [ -2   1 ],                               (3.16)

    M2 = [  1   2   2   4 ]
         [ -2   1  -4   2 ]
         [ -2  -4   1   2 ]
         [  4  -2  -2   1 ].                       (3.17)
The generation of an M-T matrix of any size can be obtained easily by a recursive equation, and it can be multiplied by a data matrix of the same size to obtain the encrypted database with high speed and accuracy due to its orthogonal property. In addition, we can take advantage of the orthogonal property of the M-T transform matrix and calculate its inverse using its matrix transpose rather than direct calculation of the inverse matrix. One can easily see that the M-T matrix has special features that can be used to encrypt the content of database tables with low calculation cost and high accuracy. The process of decryption is similar to encryption, but the inverse of the transformation matrix is obtained via the matrix transpose rather than direct inverse calculation. This high-speed inverse transformation decreases the calculation cost and increases the speed of the decryption process [12].
IV. CELLULAR AUTOMATA BASICS
A Cellular Automaton (CA) is a discrete model consisting of a regular grid of cells, where each cell has a finite number of states; usually a cell holds one of two states, on and off. The grids are usually one-dimensional or two-dimensional, but higher-dimensional grids are also possible. CA find application in several fields of science and technology, including mathematics, physics, biology, and other branches of science. In a two-dimensional CA grid, each cell has a few neighboring cells. If the neighboring cells are located to the right, left, top, or bottom of the specified cell, they are called the Von Neumann Neighborhood, in honor of John von Neumann, who worked with Stanislaw Ulam at Los Alamos National Laboratory, New Mexico, USA, in the 1940s.
An initial state of each cell at time t=0 is given, but the new state of each cell at later times, t>0, is calculated from the current state of the cell and the states of the cells in its neighborhood. The mathematical rule for updating the state of all cells is the same for every cell and does not change over time [7].
Given the rule, anyone can easily calculate future states, but it appears to be very difficult to calculate previous states. However, the designer of the rule can create it in such a way as to be able to easily invert it. Therefore, we can say that it is clearly a trapdoor function and can be used as a public-key cryptosystem. The security of such systems is not currently known [8].
The idea of Cellular Automata is intuitive and simple: it consists of a regular grid of cells, each of which may be in one of a predetermined number of states. The next cell value, Cell a(i+1), is computed with the following rule:

    Cell a(i+1) = Cell a(i-1) + Cell a(i)          (4.1)

Suppose that we have the string 11010010 and we want to use the above rule as our cellular automaton; the generated string will then be 10101011. Table 1 shows the string generated by the internal cellular automata rule [9].

Table 1: An Example of Cellular Automata
Input string:           1 1 0 1 0 0 1 0
Internal cellular rule: Cell a(i+1) = Cell a(i-1) + Cell a(i)
Generated string:       1 0 1 0 1 0 1 1
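As a sketch, the internal rule can be applied to the example bit string as follows; the addition is assumed to be taken modulo 2 and the string is treated as circular, since the paper does not fix the boundary convention, so the exact output may differ from Table 1 depending on indexing.

using System;
using System.Linq;

class CellularAutomatonSketch
{
    // One step of the rule Cell a(i+1) = Cell a(i-1) + Cell a(i), with the
    // addition taken mod 2 and a circular boundary (both are assumptions).
    static int[] Step(int[] cells)
    {
        int n = cells.Length;
        var next = new int[n];
        for (int i = 0; i < n; i++)
            next[i] = (cells[(i - 1 + n) % n] + cells[i]) % 2;
        return next;
    }

    static void Main()
    {
        int[] state = "11010010".Select(ch => ch - '0').ToArray();
        Console.WriteLine(string.Join("", Step(state)));   // one application of the rule
    }
}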

V. PROPOSED METHOD
In addition to our suggested method of encrypting the database using the Malakooti Orthogonal Transform, applied to the matrix of ASCII values representing the elements of each record in the database, the idea of Cellular Automata is also used. We have taken the mathematical rule of Cellular Automata and applied it to the elements of the M-T matrix to generate the elements of the secret key matrix, Kt, which is multiplied by each row of the coded matrix, Mc, to obtain the elements of the encrypted data matrix, Me, which represents the encrypted version of the corresponding data table.
We have also proposed an entirely fast, secure, and irreversible Hash algorithm to obtain the Hash streams of each database stored on the server. We call this method the Log2 Algorithm, because the logarithm base 2 of the number of records in the database is calculated to divide the records into N different groups. Once the number of rows or records in each group is obtained, our proposed Hash Algorithm, based on consecutive XOR and NOR operations at each stage, is calculated.
This approach is repeated so that the hashed keys, rows, and columns of every database table are turned into a set of characters called the Authentication Tag. This Tag can be saved in a safe place, totally different from the places where the databases are stored, and it is used for database protection and authentication.
VI. HASHING TECHNIQUES
Databases are the most valuable resources stored on servers; they can be accessed and shared by several clients with different levels of privilege, authentication, and security. Thus, the confidentiality, protection, and maintenance of these valuable resources are essential, and we have proposed and applied several levels of security to the database to obtain the required security on the database servers.
One effective technique to protect the database and prevent unauthorized access and illegal manipulation of its contents by hackers and attackers is to apply a fast, efficient, and robust encryption algorithm that provides several levels of security on the databases and, finally, to use the hash algorithm to obtain fixed-size hash values and store them on servers or send them through a distributed network to other servers. Our proposed methods are based on the Encryption Algorithm, Cellular Automata, the hash function operation, and finally the calculation of the Hash Value to generate the Authentication Tag.

Hash Value Generation Method
Due to the significance of data accuracy and the originality of the information available in each database table, we have proposed a Hash Value (HV) generation algorithm to be applied to each table of the database. This algorithm provides the final hash value as the Authentication Tag, which is able to detect any unauthorized access and database manipulation while the database is unlocked and in protected mode. The newly generated hash value is compared with the old hash value stored in a safe location. Once any slight difference between these two hashes is detected, the attack alarm flag is activated and the network administrator is informed via SMS or Email.
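A minimal sketch of this periodic comparison follows; the tag values and the notification step are illustrative placeholders.

using System;

class TagCheckSketch
{
    static void Main()
    {
        byte storedTag = 0xA7;     // tag previously saved in the safe location
        byte currentTag = 0xA7;    // tag just recomputed from the database records

        if (currentTag != storedTag)
            Console.WriteLine("ATTACK ALARM: notify the administrator via SMS/Email");
        else
            Console.WriteLine("Database content authenticated");
    }
}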
Hash Algorithm
1- Calculate the ASCII value of each record and save it into the data matrix, Ma.
2- Count the number of records in each database table, N.
3- Compute the logarithm base 2 of N to get M.
4- Divide the records of each table into M sections.
5- Perform the XOR & NOR operations on the selected records in each group to get the level-1 hash operation.
6- Replace the value of N with M, N=M.
7- Perform the operations of steps 3-6 to generate the level-2 hash operation.
8- Repeat the operations of steps 3-6 to reduce all records to one hash value, Hv, required for the authentication.

Figure 1: Calculation of the Hash Value for Authentication

The Log2 Algorithm for calculating the Hash Value:

Input: a(1,1), a(1,2), ..., a(1,n)
For i = 2 : log2(n) + 1
    T = n / 2^(i-2)
    S = n / 2^(i-1)
    If T is even Then
        For j = 1 : S
            If i is even Then
                a(i,j) = a(i-1,j) XOR a(i-1, n-(j-1))
            Else
                a(i,j) = a(i-1,j) NOR a(i-1, n-(j-1))
            End
        End
    Else (T is odd)
        For j = 1 : S
            If i is even Then
                a(i,j) = a(i-1,j) XOR a(i-1, n-(j-1))
                a(i,S) = a(i-1,S)
            Else
                a(i,j) = a(i-1,j) NOR a(i-1, n-(j-1))
            End
        End
    End
End
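A simplified C# reading of this folding process is sketched below. The mirror pairing of a(j) with a(n-(j-1)) and the alternation between XOR and NOR per level follow the pseudocode above; treating each record hash as a single byte is our simplification, and the byte values are illustrative.

using System;

class Log2HashSketch
{
    // Fold an array of record hashes into one value by repeated pairwise
    // XOR / NOR rounds, roughly halving the number of values at each level.
    static byte Fold(byte[] values)
    {
        var level = (byte[])values.Clone();
        for (int round = 0; level.Length > 1; round++)
        {
            int half = (level.Length + 1) / 2;
            var next = new byte[half];
            for (int j = 0; j < half; j++)
            {
                byte left = level[j];
                byte right = level[level.Length - 1 - j];    // mirror element
                next[j] = (round % 2 == 0)
                    ? (byte)(left ^ right)                   // XOR level
                    : (byte)~(left | right);                 // NOR level
            }
            level = next;
        }
        return level[0];
    }

    static void Main()
    {
        // Toy "records": one byte per record, e.g. a checksum of its ASCII codes.
        byte[] recordHashes = { 0x39, 0x35, 0x20, 0x65, 0x6E, 0x30, 0x38, 0x42 };
        Console.WriteLine($"Authentication tag: 0x{Fold(recordHashes):X2}");
    }
}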


VII. KEY GENERATION ALGORITHM
The generation of the secret key matrix is based on the Malakooti-Bazofti (M-B) Algorithm, as follows:
1- Set the size of the M-T input matrix equal to the number of records in the database, i.e., N = 4, and generate the M-T matrix:

    M2 = [  1   2   2   4 ]
         [ -2   1  -4   2 ]
         [ -2  -4   1   2 ]
         [  4  -2  -2   1 ]

2- Specify the rule of Cellular Automata for the key generation algorithm. For example, we used this rule:

    Cell a(i+1) = Cell a(i-1) + Cell a(i)          (7-1)

3- Apply the internal rule of cellular automata to all elements of the M-T matrix to get the secret key matrix:

    Kt = [  3   4   6   1 ]
         [  2   1   3   2 ]
         [  2   6   3   3 ]
         [  2   4   1   4 ]

4- Compute the determinant of the key matrix to make sure that the key matrix is invertible; its inverse is required for the decryption process.
5- Generate a new key matrix if the determinant of the key matrix is equal to zero.
The inverse of the matrix Kt must exist, otherwise the decryption process cannot be performed. One can easily show that the probability of the determinant of Kt being zero is very low, because the rows of the database table have different values; it is very unlikely, but not impossible, for the determinant to be zero.
The Source Code of the Key Generation Algorithm:

int u = 0, g;

public void Key_Gen_Cellula()
{
    if (u == 0)
    {
        // First call: apply the cellular automata rule to the M-T matrix.
        for (int i = 0; i <= 7; i++)
        {
            for (int j = 0; j <= 7; j++)
            {
                g = j;
                for (int k = j + 1; k <= j + 1; k++)
                {
                    if (k == 8) { }                                            // k never exceeds 8 here
                    if (j == 0)
                        Matrix_KeyGen_Cellula_Data[i, j] = M_Transfer[i, j] + 1;   // first column
                    else if (j == 7)
                        Matrix_KeyGen_Cellula_Data[i, j] = 1;                      // last column
                    else
                        Matrix_KeyGen_Cellula_Data[i, j] =
                            M_Transfer[i, g - 1] + M_Transfer[i, k];               // internal rule
                }
            }
        }
        // Keep a copy of the generated key matrix for the next call.
        for (int i = 0; i <= 7; i++)
            for (int j = 0; j <= 7; j++)
                Matrix_KeyGen_Cellula_Copy_Data[i, j] = Matrix_KeyGen_Cellula_Data[i, j];
    }
    else
    {
        // Later calls: re-apply the rule to the previously generated key matrix.
        for (int i = 0; i <= 7; i++)
            for (int j = 0; j <= 7; j++)
                Matrix_KeyGen_Cellula_Data[i, j] = Matrix_KeyGen_Cellula_Copy_Data[i, j];

        for (int i = 0; i <= 7; i++)
        {
            for (int j = 0; j <= 7; j++)
            {
                g = j;
                for (int k = j + 1; k <= j + 1; k++)
                {
                    if (k == 8) { }
                    if (j == 0)
                        Matrix_KeyGen_Cellula_Data[i, j] =
                            Matrix_KeyGen_Cellula_Copy_Data[i, j] + 1;
                    else if (j == 7)
                        Matrix_KeyGen_Cellula_Data[i, j] = 1;
                    else
                        Matrix_KeyGen_Cellula_Data[i, j] =
                            Matrix_KeyGen_Cellula_Copy_Data[i, g - 1] +
                            Matrix_KeyGen_Cellula_Copy_Data[i, k];
                }
            }
        }
        for (int i = 0; i <= 7; i++)
            for (int j = 0; j <= 7; j++)
                Matrix_KeyGen_Cellula_Copy_Data[i, j] = Matrix_KeyGen_Cellula_Data[i, j];
    }
    u++;
}

VIII. DATABASE ENCRYPTION ALGORITHM
The encryption algorithm for the records of the database tables is given as follows:
1- Read the entire database table, or just the important columns (fields) of the database table.
2- Calculate the ASCII code of each record.
3- Insert the ASCII codes of all records into a matrix called the ASCII code matrix, Ma.
Table 2: The content of the original database table

Ma = (8 x 8 matrix of the ASCII codes of the records, zero-padded; its entries include the codes 57, 53, 32, 101, 110, 114, 115, 111, 50, 48, 56, 66, 65, 49, 79, 55 and 100 for the characters of the stored fields)

We have obtained an 8 x 8 matrix for the three rows of the database table.
4- Generate the elements of the M-T matrix according to the size of the ASCII matrix:

    M3 = [  1   2   2   4   2   4   4   8 ]
         [ -2   1  -4   2  -4   2  -8   4 ]
         [ -2  -4   1   2  -4  -8   2   4 ]
         [  4  -2  -2   1   8  -4  -4   2 ]
         [ -2  -4  -4  -8   1   2   2   4 ]
         [  4  -2   8  -4  -2   1  -4   2 ]
         [  4   8  -2  -4  -2  -4   1   2 ]
         [ -8   4   4  -2   4  -2  -2   1 ]

5- Multiply the M-T matrix with the ASCII code matrix:

    Cd = MT * Ma                                   (8-1)
6- Set the size of the M-T input matrix equal to the number of records in the database.
7- Generate the secret key matrix, Kt, by applying the rule of Cellular Automata. For the running example, Kt is the 8 x 8 matrix produced by the key generation algorithm of Section VII, with entries such as 10, 22, 17, 34, 17, 4, 1, 14 in its first row. Then

    MC = Kt * Cd                                   (8-2)
8- Multiply the Hash value of each row, obtained from the Hash function Hf, into the corresponding row of the secret key matrix, Kt, to get complex secret key values. Apply these keys to each row of the MC matrix to obtain the encrypted data matrix, Me, which represents the encrypted version of the corresponding data table:

    Me = MC XOR (Hf * Kt)                          (8-3)
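A toy C# sketch of the pipeline (8-1)-(8-3) on 2 x 2 integer matrices follows; the record block, key matrix and per-row hash values are illustrative only. The final loop checks that XOR is self-inverse, which is exactly what step 2 of the decryption algorithm in the next section relies on.

using System;

class EncryptionSketch
{
    // Plain integer matrix product.
    static long[,] Mul(long[,] A, long[,] B)
    {
        int n = A.GetLength(0);
        var R = new long[n, n];
        for (int i = 0; i < n; i++)
            for (int j = 0; j < n; j++)
                for (int k = 0; k < n; k++)
                    R[i, j] += A[i, k] * B[k, j];
        return R;
    }

    static void Main()
    {
        long[,] Ma = { { 57, 53 }, { 32, 101 } };   // ASCII codes of a toy record block
        long[,] MT = { { 1, 2 }, { -2, 1 } };       // second-order M-T matrix (a=1, b=2)
        long[,] Kt = { { 3, 4 }, { 2, 1 } };        // toy secret key matrix
        long[] Hf = { 11, 7 };                      // illustrative per-row hash values

        long[,] Cd = Mul(MT, Ma);                   // (8-1) coded data
        long[,] MC = Mul(Kt, Cd);                   // (8-2) M-coding

        int n = 2;
        var Me = new long[n, n];
        for (int i = 0; i < n; i++)
            for (int j = 0; j < n; j++)
                Me[i, j] = MC[i, j] ^ (Hf[i] * Kt[i, j]);   // (8-3) final XOR step

        // XOR is self-inverse, so XOR-ing again recovers MC exactly.
        for (int i = 0; i < n; i++)
            for (int j = 0; j < n; j++)
                Console.Write((Me[i, j] ^ (Hf[i] * Kt[i, j])) == MC[i, j] ? "ok " : "FAIL ");
    }
}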

Figure 2: Final encryption process using the XOR

Table 3: The Hash Table

IX. DATABASE DECRYPTION ALGORITHM
The decryption algorithm for the records of the database tables is given as follows:
1- Read the content of the encrypted data matrix, Me, that was transformed, encrypted, and hashed during the encryption process.
2- Apply the XOR operation on Me and (Hf * Kt) to get the M-coding matrix, MC:

    MC = Me XOR (Hf * Kt)
       = MC XOR (Hf * Kt) XOR (Hf * Kt)            (9-1)

3- The above process transforms the encrypted, hashed matrix back into the M-coding.

Figure 3: Final calculation of the M-coding

4- Apply the inverse of Kt to the MC matrix to obtain the matrix of coded data:

    Cd = (Kt)^(-1) * MC = (Kt)^(-1) * Kt * Cd      (9-2)

For the running example, the recovered intermediate matrix contains entries such as 4944, 5835, 9981, 13197, 13302 and 1098, with zero rows where the records were padded.
5- Apply the inverse of MT to the Cd matrix to obtain the matrix of ASCII codes:

    Ma = (MT)^(-1) * Cd = (MT)^(-1) * MT * Ma      (9-3)

Table 4: Decrypted table of the database
X. GENERATION OF AUTHENTICATION TAG
In this section we generate an Authentication Tag value based on the Malakooti-Bazofti (M-B) Algorithm, in which multiple levels of XOR and NOR operations are applied to each group of hash values obtained from the database records. All hash values obtained from the records are thus combined to obtain the unique hash value required for database security and authentication.
We have proposed a robust and fast algorithm for database security and authentication that automatically and accurately generates the Hash values for all rows of the database tables to obtain a unique Hash value for each table. This unique hash value can be used to check the validity of the data inside the database and guarantee the authentication of all information in each database. Should any slight change be made to the database while it is in protected mode, the generated hash value will be totally different from the stored hash value, and the software system will automatically activate the alarm flag to inform the administrator, via SMS or Email, about the unauthorized change of the database.
In our proposed method, we have divided each database table into N different records and used the concept of parallel algorithms to calculate the Hash values of all records of each table, as well as the Hash values of all columns, accurately and efficiently. Once the hash values are calculated, fast XOR and NOR operations are applied to the generated Hash values to obtain the unique hash value for the Authentication Tag.

Figure 4: Calculated Hash Value for each Row

XI. CONCLUSION AND FUTURE WORK
The objective of this research is to apply multiple levels of security to a database and protect the contents of the database tables from unauthorized users and hackers who try to access the database illegally and perform read, write, or change operations on the database tables. To achieve this objective, we transform the contents of the database records into ASCII codes and then apply the M-T transform to the ASCII code matrix to calculate the matrix of coded data. The matrix of coded data is then multiplied by the matrix of secret keys, obtained by applying the rule of cellular automata to the elements of the M-T matrix, to calculate the matrix of M-coding. To add a further level of security to the encrypted data, the hash value of each record of the database is calculated and then multiplied by the matrix of secret keys to obtain the bit patterns used in an XOR operation with the matrix of M-coding, yielding the highly secure encrypted values for the records of the database.
We have proposed a robust and fast algorithm for database security and authentication that automatically and accurately generates the Hash values for all rows of the database tables to obtain a unique Hash value for each table. This unique hash value can be used to check the validity of the data inside the database and guarantee the authentication of all information in each database.
Should any slight change be made to the database while it is in protected mode, the generated Hash value will be totally different from the stored hash value, and the software system will automatically activate the alarm flag to inform the administrator, via SMS or Email, about the unauthorized change.
Our proposed algorithm applies three levels of security to the database contents and guarantees the security of the database tables. It also uses the orthogonal property of the M-T matrix to obtain its inverse from its transpose rather than by direct inverse calculation, as required for the decryption process. More work needs to be done to obtain a faster algorithm for applying the cellular automata to generate the matrix of secret keys.

Figure 5: Calculated Hash Value for a Table

XII. REFERENCES
[1] E. Gudes, H.S. Koch and F.A. Stahl, "The application of cryptography for data base security", Proceedings of AFIPS National Computer Conference, 1976, pp. 97-107.
[2] G.I. Davida, D.L. Wells and J.B. Kam, "A database encryption system with subkeys", ACM Trans. on Database Systems, Vol. 6, No. 2, June 1981, pp. 312-328.
[3] A. Afyoni, Database Security and Auditing, 2005.
[4] L. Bouganim, Y. Guo, Database Encryption, Le Chesnay, France, 2009.
[5] L. M. Batten, Public Key Cryptography: Applications and Attacks, John Wiley & Sons, 2013.
[6] C. Peikari and S. Fogie, Maximum Wireless Security, Sams, 2002, ISBN 0-6723-2488-1.
[7] M. Thomas, Cellular Automata, Nova Science, 2010, ISBN 978-1-62100-148-5 (eBook).
[8] T. Ceccherini-Silberstein, M. Coornaert, Cellular Automata and Groups, Springer, 2010, ISBN 978-3-642-14033-4.
[9] J. L. Schiff, Cellular Automata, John Wiley & Sons, 2008, ISBN 978-0-470-10879-0.
[10] A. Mousa, O. S. Faragallah, S. El-Rabaie and E. M. Nigm, "Security Analysis of Reverse Encryption Algorithm for Databases", IJCA Journal, 66(14):19-27, March 2013, New York, USA.
[11] S. Kulkarni, S. Urolagin, "Databases and Database Security Techniques", IJETAE, ISSN 2250-2459, Vol. 2, Issue 11, November 2012.
[12] M. V. Malakooti, M. Raeisi Nejad Dobuneh, "Developing a Lossless Digital Encryption System for Multimedia Using Orthogonal Transforms", Malaysia, 2011.


International Journal of Digital Information and Wireless Communications (IJDIWC) 4(2): 236-247
The Society of Digital Information and Wireless Communications, 2014 (ISSN: 2225-658X)

Designing and Implementing Bi-Lingual Mobile Dictionary to be used in Machine Translation

Hassanin M. Al-Barhamtoshy
Faculty of Computing and Information Technology, King Abdulaziz University (KAU), Jeddah, Saudi Arabia
hassanin@kau.edu.sa

Fatimah M. Mujallid
Computer Science Dept., Faculty of Computing, King Abdulaziz University (KAU), Jeddah, Saudi Arabia
f.mujallid@gmail.com

ABSTRACT
This paper describes the multistage process of building an Arabic WordNet (ArWn) to be used on mobile devices. The goal of this paper is to show how to create a corpus, starting with selecting an annotation task, designing the data and the annotation process, and finally evaluating the results for a particular goal. The paper therefore presents the design and implementation of a bi-lingual lexicon to be used in machine translation and language processing.
Consequently, the paper takes into consideration language characteristics in both directions, Arabic and English. The proposed system is based on the WordNet lexical database with semantic and commonsense knowledge. The proposed dictionary is implemented for mobile devices; therefore, cloud computing is used in the implementation. SQL Azure is used as the cloud database to solve both the scalability challenge, scaling terabytes of data to millions of mobile users, and the interoperability challenge, for both the Arabic and English languages. The system dictionary is developed and tested on the Android mobile platform. The proposed system works in two versions, offline and online; the online approach uses mobile computing in the cloud to reduce the storage requirements on the mobile device. Real-time tests are used to evaluate the system's access and response times to display results.

KEYWORDS
MT, dictionary, Arabic, NLP, lexical, and
commonsense.

1 INTRODUCTION
Machine Translation (MT) is an important area
of Natural Language Processing (NLP)
applications and technologies in this domain are
highly required. Machine Translation applications
translate source language text (SL) into target

language text (TL) [1] [2]. Multilingual chat


applications, emails translation, and real-time
translation of web sites are typical examples of
machine translation.
In
multilingual
applications,
machine
translation (MT) is an essential component, and it
is highly-demanded technology in its own right.
Multilingual chatting, talking translators, and realtime translation of emails and websites are some
examples of the modern commercial applications
of machine translation.
Typically, dictionaries have been used in
human translation, and have also been used for
dictionary-based machine translation.
The main challenges that machine translation systems encounter include missing words, translation variants, and deciding whether or not to translate a name (or part of it).
Conventionally, semantic resources and lexicons have been used as core components for building different applications in NLP. Recently, researchers and developers have been using lexical databases in NLP applications [3] [4]. Semantic resources can be derived from a lexical database within several domains. Morphological, syntactic and semantic features are needed to derive individual lexical items. Bilingual and multilingual dictionaries are lexical databases, and they depend on the types of languages involved [5]. Semantic and commonsense knowledge, and further semantic information about a specific word, can be produced from a lexical database. One of the most widely known commonsense knowledge bases is WordNet1,2 [6] [7].

1 Wikipedia lexical resource: http://en.wikipedia.org/wiki/Lexical_resource
2 What is WordNet? http://WordNet.princeton.edu/WordNet/

Arabic is one of the most widely spoken languages in the group called the Semitic languages; 422 million people around the world speak it, making it one of the most widely spoken and distributed languages around the globe [8] [9] [10] [11] [12]. The Arabic language was ranked sixth among the ten most influential languages, with an estimated 186 million native speakers. In 2010 [12] the number of Arabic native speakers increased to 239 million people, and the rank of Arabic in the list rose to fifth3. Arabic speakers are increasing and the Arabic language is expanding in the world; therefore, the number of Arabic documents and articles has increased. This shows the importance of the Arabic language in the world.
Currently, linguistic and lexical resources for the Arabic language are growing but are still few, especially efforts for mobile devices. However, the last decade has seen a number of attempts at offering electronic resources for the Arabic NLP community. One of these attempts is the Arabic WordNet project [12] [13] [14] [15] [16], whose objective was to construct and develop a freely available lexical database for standard Arabic. Arabic WordNet has very low coverage and a limited number of words.
Nowadays, people use their mobiles for many purposes, and most users have replaced desktop and laptop computers with them. By 2012 there were about 6 billion mobile users in the world3. This big number shows what the future will be: mobile computing. There are successful attempts to build English smart mobile dictionaries, but such dictionaries are rare for the Arabic language. The need for an Arabic lexical database mobile application has led to the creation of the mobile dictionary system. This paper presents the design and implementation of a bilingual (Arabic-English) mobile dictionary using WordNet as the lexical database.
In this section, the key terminology and formulations used throughout this paper are introduced. Section 2 gives an overview of all the relevant areas, most notably the related work upon which this work is founded. Section 3 describes the mobile dictionary framework, where the system architecture is presented and illustrated; the system database is also explained and the system workflow is introduced. Section 4 discusses evaluation and system performance. We also examine the evaluation procedure undertaken in this paper, and the difficulties that arise with the non-standard evaluation methodologies often used in the translation area. The last section gives the conclusion and future work.

3 http://newsfeed.time.com/2013/03/25/more-people-have-cellphones-than-toilets-u-n-study-shows/
2 LITERATURE REVIEW
Many attempts have been made to create a dictionary based on WordNet in different languages. The first attempt was the Princeton WordNet (PWN)4,5,6. The Princeton WordNet has been developed since 1985; it is a large lexical database for the English language. The word structure of the PWN is organized according to conceptual similarity with other words, to represent a semantic dictionary. Therefore, words that have the same meaning are grouped together in a group called a Synset, and the words are classified into four parts of speech (POS): nouns, verbs, adjectives and adverbs. Synsets are connected by semantic and lexical relations.
After the PWN appeared, many attempts emerged to create WordNets for other languages; Euro WordNet (EWN) was a step towards a multilingual WordNet [17] [18]. The first release of the EWN was for Dutch, Spanish, Italian, German, French, Czech and Estonian. The structure for each language in the EWN is like that of the PWN. All the EWN languages are connected by an inter-lingual index (ILI), which connects the Synsets that are the same in different languages. Another project, called the Balkan WordNet (BalkaNet), was created following the EWN and added more languages, such as Bulgarian, Greek, Romanian, Serbian, and Turkish.
After that, the Global WordNet Association (GWNA)5 [22] was created in 2000, and WordNets for many other languages have been built, such as Chinese, Hindi6 and Korean.

4 Euro WordNet (Wikipedia, the free encyclopedia), http://en.wikipedia.org/wiki/EuroWordNet
5 The Global WordNet Association, http://www.globalWordNet.org/
6 Hindi WordNet: http://www.cfilt.iib.ac.in/WordNet/webhwn/

For the Arabic language, there is the Arabic WordNet (AWN), which is a multilingual lexical

database, and it is linked to the PWN using an ontology inter-lingual mechanism. The structure of AWN consists of four entity types: item, word, form and link. An item has information about the synsets, ontology classes and instances. A word has information about word senses. A form represents a root or a plural-form derivation. A link is used to connect two items, and it also connects a PWN synset to an AWN synset. Another WordNet created for Arabic is a master's thesis written in 2010 [20]. This thesis presents an easy-to-use Arabic-interface WordNet dictionary, developed in the same way as the EWN [21]. It is a monolingual dictionary for the Arabic language and is not connected to the EWN or PWN, although it is built following them [21].
All these previous studies were built to work as desktop applications. However, there are few attempts to build a lexical database on mobile platforms based on lexical knowledge and commonsense. One of these attempts created a WordNet mobile base to work with the PWN on the Pocket PC platform (Windows Mobile), called WordNetCE [22]; there is also a smartphone version (WordNetCE-SP) [23] [24]. Another successful attempt is the Dubsar project [24], a simple web-based dictionary application based on the PWN. Dubsar is a work in progress; it is available for free worldwide on the iTunes App Store for many mobile devices, and also in the Android Market.
There are other non-free dictionaries and thesauruses based on the PWN for mobile platforms, such as the English WordNet dictionary by Konstantin Klyatskin7, the Advanced English Dictionary and Thesaurus by Mobile System Company8, the LinkedWord Dictionary & Thesaurus by Taisuke Fujita, and Blends by Leonel Martins9.
From this literature review, the authors observe that there have been no attempts to create an Arabic dictionary for mobile platforms using a lexical database. So the goal is to construct a dictionary that is organized by meaning, has common-sense, semantic and lexical relations, and forms a network of meaningfully related terms and concepts. It is composed of the most common and concise English/Arabic words and corresponding explanations, and it has quick and dynamic search and works offline and online.

7 http://filedir.com/company/konstantin-klyatskin/
8 http://appworld.blackberry.com/webstore/content/314/?countrycode=SA&lang=en
9 https://itunes.apple.com/us/app/linkedword-dictionary-thesaurus/id326103984?mt=8
3 FRAMEWORK FORMULATION
To enable consistent explanation of the systems throughout this paper, we define a framework for the proposed translation model and the systems that follow this model. The formulation of the translation process applies primarily to the generative transformation method over a bilingual translation corpus, and the evaluation applies to both generative and extractive translation approaches.
Therefore, a framework for the translation model is defined in this section. A bilingual dictionary, lexicon and corpus are used for the generative and extractive translation approaches. The generative translation process uses two stages: a training stage and a generation stage. The two stages run on a bilingual corpus, BC = { (DS , DT) }, and the generation stage produces one or more words WT for each source word WS; see Figure 1.

Figure 1. Translation Model Framework

The training stage of the proposed model is composed of three sub-modules: alignment between source and target; segmentation using graphemes or phonemes (in the case of speech); and transformation rules to generate the model built on the bilingual corpus.
Statistical machine translation (SMT) is used in the alignment; such an SMT model can be considered as a function of faithfulness to the source language and fluency in the target language [2] [3]. The fundamental model of SMT is defined based

on faithfulness (translation model) and fluency (language model) as follows:

    P( S , T ) = argmax_T P( S | T ) P( T )        (1)

where S and T represent the sentences (words) in the source and target languages, P( S | T ) represents the translation model, and P( T ) indicates the target language model. Therefore, we need a decoder that, given the sentence (or word) S, produces the most probable sentence (or word) T.
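A minimal C# sketch of such a decoder over a toy candidate list follows; all words and probabilities are illustrative.

using System;
using System.Collections.Generic;
using System.Linq;

class NoisyChannelSketch
{
    static void Main()
    {
        // Candidate target words T for one source word S, each with a
        // translation model score P(S|T) and a language model score P(T).
        var candidates = new List<(string T, double PSgivenT, double PT)>
        {
            ("lion",  0.60, 0.02),
            ("tiger", 0.25, 0.03),
            ("cat",   0.15, 0.20),
        };

        // Equation (1): choose the T maximizing P(S|T) * P(T).
        var best = candidates.OrderByDescending(c => c.PSgivenT * c.PT).First();
        Console.WriteLine($"decoder output: {best.T}");
    }
}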
3.1 Alignment
Word alignment is important as a component in machine translation, especially in statistical machine translation; it is defined as a mapping between the words of a pair of sentences that are a translation of each other. Alignments can be one-to-one, one-to-many, and many-to-many relations. However, it is possible to generate multiple target variants for a word, where some translators may add extra vowels to make variants easier to understand.
3.2 Transformation Rules
A transformation rule can be defined as S -> (T, p), where S is the source word, T is the target word, and p is the probability of translating S to T. Consequently, for any S that has n rules,

    S -> ( Tk , pk ),  k = 1, ..., n,  such that SUM(k) pk = 1        (2)

Another transformation rule, representing a model M, is defined as follows: the model M takes a source word S and outputs a list of tuples with ( Tj , Pj ) as elements:

    S -> ( Tj , Pj )                                                  (3)

where Tj represents the tuple of the jth rule of the source word, generated with the jth highest probability Pj.
3.3 Bilingual Corpus
A bilingual corpus BC is defined as a set of transformation pairs { ( DS , DT ) }, where DS = ws1, ws2, ..., wsl and DT = { WTk } with WTk = wt1, wt2, ..., wtm; wsi is a word in the source language, and wtj is a word in the target language. Such a corpus will be implemented as a computerized resource.
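A minimal sketch of a rule set satisfying equation (2) follows; the source word, its target variants and the probabilities are illustrative.

using System;
using System.Collections.Generic;
using System.Linq;

class TransformationRuleSketch
{
    static void Main()
    {
        // Rules for one source word: S -> { (Tk, pk) } with the pk summing to 1.
        var rules = new Dictionary<string, List<(string T, double p)>>
        {
            ["asad"] = new List<(string T, double p)> { ("lion", 0.8), ("Asad", 0.2) }
        };

        var variants = rules["asad"].OrderByDescending(r => r.p).ToList();
        Console.WriteLine($"sum of probabilities: {variants.Sum(r => r.p)}"); // must be 1
        Console.WriteLine($"most probable target: {variants[0].T}");
    }
}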

3.4 Evaluation Measures
One of the evaluation measures for machine translation is word accuracy. Other metrics are also used in the literature [3]. Such evaluation schemes can be classified into two categories: single-variant and multi-variant metrics.
3.4.1 Single Variant
Word accuracy is one of the standard measures used to evaluate machine translation. Word accuracy, or transformation accuracy (A), can be calculated as the ratio of correctly translated words to the total number of words:

    A = (correctly translated words) / (total words)        (4)

The appropriate cut-off value depends on the target word(s) that can be regarded as equivalent to the source word. Therefore, it is important that the generated list of target words contains the most probable words in the corpus. In this case, a metric that counts the number of translation variants (Tk) that appear in the system-generated list L might be appropriate.
3.4.2 Multi-Variant Metrics
The corpus can be created using multiple translations, so multiple variants can be taken into account [2].
Uniform word accuracy (UWA) values equally all of the translation variants provided for a source word. For example, consider (S, T) to represent a word pair between source and target, where T = {Tk} and |T| > 1. Then any of the Tk variants in T counts as a successful translation for the system.
Majority word accuracy (MWA) treats only one translation as valid; the selected value, the preferred variant, must be suggested by the majority of human translators.
Weighted word accuracy (WWA) assigns a weight to each of the translations based on the number of times that they have been suggested.

3.5 MATTER Description
The annotation process can be summarized in terms of the MATTER cycle processes [4]: Model, Annotate, Train, Test, Evaluate and Revise. Figure 2 shows the MATTER development life cycle [31].

The proposed translation method uses an extended Markov window. This method takes an Arabic/English word, uses a set of rules, and maps it into the English/Arabic target. An alignment method may be used to assign probabilities to the set of mapping rules (training stage). The translation model is based on a Markov formula derived from P( S , T ) = P(S) P(T|S) as:

    T = argmax_T P(S) P(T|S)

Choi et al. [19] presented an English-Korean transliteration system based on pronunciation and correspondence rules; in this system, prefixes and postfixes were used to separate English words of Greek origin. They also designed an English-Chinese transliteration frame-based model and used a direct model, as in the following source-language equations:

    T = argmax_T P(S|T) P(T),                      (6)
    T = argmax_T P(T|S)                            (7)

They also incorporated the target language model into the direct transformation equation as:

    T = argmax_T P(T|S) P(T)                       (8)

Figure 2: The MATTER Development Life Cycle

The development cycle provides theoretically informed attributes derived from empirical observations of the data. The model can be described by a vocabulary of terms T, the relations between these terms, R, and their interpretation, I. Therefore, the model M can be described by:

    M = < T, R, I >                                (5)

3.6 Generative Translation
Generative translation is the process of translating a word or phrase from a source language to a target language [3]. Many different generative transliteration methods have been proposed in the literature, with associated methodologies and supported languages [3]. Automatic transliteration has been studied between English and Arabic [21]. A general diagram of generative translation is shown in Figure 3. Generative-based methods identify the source word S, and then employ the translation evaluation algorithm (single or multi-variant) to generate the target word(s) T:

    S -> ( T , P )

Figure 3: A Graphical Representation Approach

To build their underlying model [3], they evaluated it on 46,306 English-Chinese pairs extracted from the Linguistic Data Consortium (LDC) named-entity corpus, using word accuracy metrics.
As shown in Figure 2, the number of steps in the transformation process is reduced from two or three to one. This transformation relies on statistical information using an HMM. The following general formula is used:

    P(T) = p(t1) PROD(i=2..n) p(ti | t(i-1))       (9)

Technologies based on NLP are becoming increasingly widespread [18]. Mobile phones and handheld computers support predictive text, lexicon and dictionary building, speech processing and handwriting recognition. Machine translation allows us to retrieve texts written in one language and read them in another language. Consequently, language processing has come to play a central role in the multilingual information society. For a long time now, machine translation (MT) has been the holy grail of language understanding [5]. Today, practical translation systems exist for specific domains and for particular pairs of languages. Accordingly, the Natural Language Toolkit (NLTK) has been published and is used to support such translation. Many NLP topics are covered in more detail in [4] [5]. Consequently, a simple translator can be made using NLTK by employing source language (e.g. English) and target language (e.g. French) pairs, and then converting each to a dictionary.
There are many online language translation APIs (e.g. provided by Google and Yahoo). Using such translation APIs, we can translate text in a source language to a target language. NLTK comes with a simple interface for using them [6]. Therefore, internet access is required for the translation function. Consequently, to translate text, two things need to be known:
1. The language of the text, or source language.
2. The language to translate into, or target language.
4 MOBILE DICTIONARY FRAMEWORK
4.1 Principles
The proposed dictionary is a cloud mobile application providing English-English, English-Arabic and Arabic-Arabic dictionaries. The first phase collects and downloads the data from an online English dictionary, The Project Gutenberg Etext of Webster's Unabridged Dictionary10, and uses it to create the database file (Figure 4).

Figure 4: Dictionary Structure Layout

First, the authors classified the dictionary by creating a list of meaning expressions and classifying these meanings in order of their concepts. To classify these expressions, the authors needed to specify the concepts in the language and define the relations between the words in each concept. The most reasonable classification is the one suggested by Hadel and Hassanin [20] [21]. It is composed of four main classes: abstracts, entities, events and relations. There are subclasses under each main class, and each subclass may have further subclasses, and so on.
Semantic and lexical relations present a suitable way to organize huge amounts of lexical data in ontologies and other concepts in lexical resources.

10 http://www.gutenberg.org/cache/epub/673/
4.2 Computing of the Mobile Dictionary
It is known that the size of a dictionary database is large, while mobile device storage is small and does not accommodate large amounts of data. The solution to this problem is to use cloud technology. Cloud computing is the use of computing resources, such as hardware and software, that exist in a remote location and are accessed as services over a network. Cloud computing services can be divided into three main categories: infrastructure as a service (IaaS), platform as a service (PaaS) and software as a service (SaaS) [25] [26] [27].
There is another category, which falls under the three main categories above and which this paper is interested in: data as a service (DaaS). DaaS [28] is a service that makes information and data, such as text, images, video and sound, reachable for clients through a global network. DaaS has many advantages, including: reduced overall cost of data delivery and maintenance, data integrity, privacy, ease of administration and collaboration, compatibility among diverse platforms, and global accessibility. The cloud technology DaaS is used to provide the mobile database for the English and Arabic WordNets.
By using cloud technology, the main logical design structure that the mobile dictionary uses becomes a five-tier (layer) structure. The proposed architecture is a client/server framework consisting of four layers, each running on a different platform or in a different process space on the same system. These layers do not have to be physically in different locations on different computers on a network, but can be logically divided as layers of an application [28] [29]. In the four-tier structure, three layers are hidden: the presentation layer, the process management layer and the database management layer. Figure 5 illustrates these layers. For dictionary-scale semantic processing, the cloud computing services Software as a Service (SaaS), Platform as a Service (PaaS), Infrastructure as a Service (IaaS) [29] and Data as a Service (DaaS) are employed, as illustrated in Figure 5.
The SaaS layer introduces software applications; PaaS presents a host operating system and cloud development tools; while IaaS delivers virtual machines or processors, supports storage memory or auxiliary space, and provides network resources to the clients. Finally, DaaS includes large quantities of available data in significant volumes (petabytes or more). Such data may come from online activities like social media, mobile computing, scientific activities and the collation of language sources (surveys, forms, etc.).
Therefore, cloud clients can use any of the previous web browsers or a thin client with the ability to remotely access any service from the cloud.
4.3 Arabic WordNet Database Design
Arabic WordNet is identical to the standard English WordNet (PWN and EWN) in structure. Therefore, Arabic words are organized into four types of POS: nouns, verbs, adjectives and adverbs. Each word is grouped with other words that have the same meaning in a group called a Synset. Each Synset is organized under a concept, and it is related to other synsets by lexical or semantic relations. Nouns and verbs are arranged in a structured way based on the hypernymy/hyponymy relations. Adjectives are categorized in groups consisting of head and satellite synsets. Nearly all head synsets have one or more synsets with the same meaning; these are called satellite synsets. Every adjective is organized based on antonym pairs. The antonym pairs are in the head synsets of a group.

Figure 5: Proposed Cloud Service Layers

The proposed database is too big for a mobile device (a mobile application can hold only a database of up to 2MB). There are two methods for working with the mobile database: the first is local, using SQLite (offline), and the second uses the SQL Azure database (online). The two databases have the same structure, but they differ in the size of the data they hold. The SQLite database can hold only a small part of the database and can be accessed quickly. The SQL Azure database holds the whole database, and it is accessed through the internet [30].
4.4 Inter-Lingua in the Mobile Dictionary
The proposed system architecture of this paper is based on the interlingua approach to machine translation (MT). This approach extracts the meaning of the word from the source language (SL) (English or Arabic) and then renders it in the target language (TL) (English or Arabic). The mechanism can be classified into three main components: an Arabic language dependent module, an English language dependent module, and a language independent (interlingua) module. Figure 6 explains the proposed mechanism.
The system description includes:
- Bi-lingual dependent modules, one for the English and the other for the Arabic WordNet.
- A domain ontology language independent module to map between the Arabic and English WordNets.

Figure 6: Mobile Dictionary Mechanism

The language dependent modules contain:
1) English language dependent module.
- English WordNet: contains the language vocabularies.
- Lexical database: this database is described and illustrated in the Princeton WordNet (PWN) [6] [7], which contains approximately most English words with their meanings.
- Relation rules: consisting of 16 relations [30].
2) Arabic language dependent module.
- Arabic WordNet: contains the language vocabularies.
- The Arabic lexical database, which contains the tables that the Arabic database needs.
- Arabic relation rules: including 23 relation types between the synsets: hypernymy, hyponymy, antonym, cause, derived, derived related from, entails, member meronym, part meronym, subset meronym, attribute between adjective and noun, participle, pertainym, see also, similar, troponym, instance holonym, subset holonym, part holonym, instance hyponym, disharmonies, class member and verb group [30].
The language independent module contains:
- Domain ontology: concepts grouped into topics. The main goal of the domain ontology is to present a common sense for the most important concepts in all the WordNets.
- The inter-lingual index (ILI): the goal of the ILI is mapping between the Synsets of the Arabic and English WordNets.

4.5 Arabic Mobile WordNet (ArWn) Workflow
A RESTful web service is used to send and receive data between client and server. The data can be sent and received as JavaScript Object Notation (JSON), XML, or even as text. The data of the proposed dictionary is handled as JSON, because it is compact and widely supported.
The RESTful web services are hosted in Windows Azure, which is used to solve both the interoperability and the scalability issues in mobile applications. Figure 7 shows the system workflow using a RESTful web service with the JSON data format11. This workflow is designed around the hypertext transfer protocol (HTTP), so any client mobile application that supports this protocol is capable of communicating with the services; i.e., interoperability is satisfied. In the other direction, Windows Azure supports scalability to fit any degree of data demand without difficulty12.

Figure 7: Mobile Dictionary Workflow [30]

3.6 ArWn Implementation Scenario


Implementation steps are divided into four parts:
1. Create an account in Windows Azure.
2. Build a Windows Azure Cloud Project.
3. Deploy the RESTful Web Service.
4. Build a bilingual mobile application (ArWn).
The WCF REST¹³ programming model, shown in
Figure 8, permits customization of the URIs for all
procedures. The model is illustrated in the following:
1. A message request containing an HTTP verb and a
URL is sent from the mobile device using standard
HTTP.
¹¹ http://www.slideshare.net/rmaclean/json-and-rest
¹² http://shop.oreilly.com/product/9780596529260.do
¹³ http://msdn.microsoft.com/en-us/magazine/dd315413.aspx
2. The RESTful Web service receives the mobile
application's message request, invokes the
corresponding call, and passes $filter=synset_id
as a parameter.
3. The Windows SQL Azure database returns the
records that match the synset_id.
4. The returned data are converted to JSON format
(automatically) and sent back to the mobile device.
5. The data are then available to the mobile
application.
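As a rough illustration of steps 1-5, the following minimal Python sketch plays the role of the mobile client; the endpoint URL and the exact $filter syntax are assumptions for illustration only, since the paper does not publish the service URI:

    import requests  # third-party HTTP client

    # Hypothetical Azure-hosted endpoint; not the authors' real URI.
    BASE_URL = "https://example-arwn.cloudapp.net/ArWnService.svc/Senses"

    def fetch_senses(synset_id):
        # Steps 1-2: send an HTTP GET whose $filter parameter carries the synset id.
        response = requests.get(BASE_URL, params={"$filter": "synset_id eq %d" % synset_id})
        response.raise_for_status()
        # Steps 4-5: the service answers with JSON, which the client consumes directly.
        return response.json()

Any HTTP-capable client can issue the same request, which is exactly the interoperability property claimed above.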

5 EVALUATION
The proposed ArWn is made up of Arabic
words and related English words; the complete
synset collection includes five parts of speech: nouns
(6,438), verbs (2,536), adjectives (456), adjective
satellites (158), and adverbs (110).

Three of the most widely used mobile operating
systems are Apple iOS, Android and Windows
Phone. The authors decided to develop the
proposed dictionary on the Android and Windows
Phone platforms because, according to Gartner¹⁴
and IDC¹⁵, Android is now the most popular and
most used mobile operating system in the world.
Figure 9. Screenshots of the Mobile Dictionary System.

Figure 8. Workflow for Mobile Application Requesting [30]

4.7 Mobile Interface Testing


The authors tried to make the interface of the
mobile dictionary user friendly and easy to use,
as shown in Figure 9. The first screen appears
when the application is launched. The second
screen shows all the senses of the word "cat",
and the other two screens show the English word
"lion" and the equivalent Arabic word.

¹⁴ http://www.gartner.com/newsroom/id/2335616
¹⁵ http://www.idc.com/getdoc.jsp?containerId=prUS23638712#.USKkKmcV-gM

5.1 Performance
The response time is important for evaluating the
performance of the mobile dictionary system.
Response time is defined as the duration that a
system or application takes to respond to the
client. To calculate this time for mobile, we need
to know: the network bandwidth (speed), the number of
users (clients), the client processing time, the server
processing time, and the network latency time.
Therefore, the mobile dictionary system response
time, i.e. the time to return the results to the user
(client), can be defined using all the variables above
as follows:

  Time = T_client + T_network_latency + T_server        (10)

where:

  T_network_latency = (Word meanings * N) / Net_speed   (11)

and N represents the number of clients.
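A minimal sketch of this response-time model, with unit conventions assumed for illustration (the paper does not fix them), could read:

    def network_latency(word_meanings, n_clients, net_speed):
        # Eq. (11): latency grows with the payload size and the number of
        # clients, and shrinks with the available bandwidth.
        return word_meanings * n_clients / net_speed

    def response_time(t_client, t_server, word_meanings, n_clients, net_speed):
        # Eq. (10): client processing + network latency + server processing.
        return t_client + network_latency(word_meanings, n_clients, net_speed) + t_server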
Real-time testing of the mobile dictionary was used
to evaluate the system access time and the time
needed to respond and show the results. The
testing was done by connecting to the Azure
cloud over a Wi-Fi connection with 2 MB/s speed.

The service response times are illustrated in Table 1;
all times are in seconds.
Table 1. Mobile Dictionary Service Response Time (seconds)

SQL Azure (Online)
Word      | English word details | Equivalence Arabic details | Arabic word details | Equivalence English details
Rodent    | 5.9049 | 5.9066 | 5.1384 | 5.2448
Stimulant | 3.8422 | 3.4754 | 3.1556 | 4.6711
Bruise    | 4.3392 | 3.9299 | 3.0172 | 4.1127
Man       | 5.4584 | 3.7155 | 4.9957 | 5.0872
Cat       | 5.1174 | 3.0083 | 4.3919 | 5.7743

SQL Azure (Offline)
Word      | English word details | Equivalence Arabic details | Arabic word details | Equivalence English details
Rodent    | 0.5215 | 0.9635 | 0.8978 | 0.5862
Stimulant | 0.3156 | 0.4134 | 0.4084 | 0.3038
Bruise    | 0.3123 | 0.6646 | 0.5485 | 0.3079
Man       | 0.7523 | 0.6333 | 0.6399 | 0.6863
Cat       | 0.3130 | 0.4660 | 0.5285 | 0.4818

Figures 10 and 11 illustrate the differences between
online and offline performance; both charts compare
the English-to-Arabic and Arabic-to-English response times.

Figure 10. Bi-Lingual Online Translation
Figure 11. Bi-Lingual Offline Translation

Putting the database in SQL Azure (online) has its
advantages and disadvantages. As the charts show,
extracting the data from SQL Azure takes a longer
time than extracting it locally from SQLite. Putting
the database in a cloud database therefore helps to
solve the scalability and fixed-storage problems of
mobile devices, but it takes more time to connect
to the data.
The proposed ArWn for mobile can also be evaluated
using its semantic relation features. This evaluation
was carried out by a linguistic expert; Table 2
illustrates the evaluation results.

Table 2. ArWn Evaluation Features

Semantic Relation       | No. of Relations | Correct Relations | Error (%) | Precision (%)
Attribute               | 13  | 11  | 15.385 | 84.62
Cause                   | 11  | 9   | 18.182 | 81.82
Class member: Category  | 10  | 8   | 20.000 | 80.00
Class member: Region    | 6   | 4   | 33.333 | 66.67
Class member: Usage     | 3   | 2   | 33.333 | 66.67
Pertainym               | 12  | 8   | 33.333 | 66.67
Substance holonym       | 11  | 8   | 27.273 | 72.73
Substance meronym       | 11  | 8   | 27.273 | 72.73
Member meronym          | 21  | 15  | 28.571 | 71.43
Member holonym          | 21  | 19  | 9.524  | 90.48
Part meronym            | 6   | 4   | 33.333 | 66.67
Part holonym            | 6   | 4   | 33.333 | 66.67
Hyponym                 | 138 | 99  | 28.261 | 71.74
Instance hyponym        | 6   | 4   | 33.333 | 66.67
Entails                 | 6   | 4   | 33.333 | 66.67
Antonym                 | 6   | 5   | 16.667 | 83.33
Similar                 | 5   | 4   | 20.000 | 80.00
Derived                 | 26  | 20  | 23.077 | 76.92
See also                | 6   | 5   | 16.667 | 83.33
Verb group              | 5   | 5   | 0.000  | 100.00
Participle              | 3   | 2   | 33.333 | 66.67
Hypernym                | 37  | 31  | 16.216 | 83.78
Troponym                | 2   | 2   | 0.000  | 100.00
Disharmonies            | 2   | 2   | 0.000  | 100.00
Derived related form    | 2   | 2   | 0.000  | 100.00
Total                   | 375 | 285 | 21.35  | 100

This evaluation illustrates that the value of precision
varies from one relation to another, owing to the
limited size of the dictionary and to Arabic and
English morphological features.
5.2 Cost Evaluation
The proposed mobile dictionary system
requires an internet connection to access the Azure
cloud web server. A Wi-Fi connection is a good
choice given its free availability at many locations,
especially users' homes. The testing proves that the
proposed dictionary system displays good results
when the application is tested over a Wi-Fi
connection.

6 CONCLUSIONS
This paper described the building of a bilingual
dictionary with a lexical and commonsense
database. The dictionary uses cloud technology
and services to store its data. The authors proposed
an application for mobile devices with the Android
operating system. This application is a dictionary
that uses WordNet as a lexical concept and
commonsense database. The dictionary is
bilingual, from English to Arabic and vice versa.
The RESTful web service of Windows Azure has
been used to deal with the interoperability and
data-scaling problems of mobile application storage.
Moreover, the results of this paper open a new
approach to mobile computing in cloud systems,
using such technology to reduce the complexity of
mobile storage.
In the future, the authors plan to develop the
dictionary for other mobile operating systems. The
authors also intend to increase the Arabic language
coverage and to add advanced features, such as
visual content, to the Arabic WordNet dictionary.
The proposed system can also be extended with
special-needs technology, such as sign language,
speech recognition and speech synthesis, to allow
deaf and blind people to communicate.
REFERENCES
1. Jurafsky, D. and Martin, J.: Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics and Speech Recognition. Prentice Hall, Englewood Cliffs, NJ, 2008.
2. Karimi, S.: Machine Transliteration of Proper Names between English and Persian. Ph.D. dissertation, RMIT University, Melbourne, 2008.
3. Karimi, S., Scholer, F. and Turpin, A.: Machine Transliteration Survey. ACM Computing Surveys, Vol. 43, No. 3, Article 17, April 2011.
4. Pustejovsky, J. and Stubbs, A.: Natural Language Annotation for Machine Learning, 1st Edition. O'Reilly, October 2012.
5. Bird, S., Klein, E. and Loper, E.: Natural Language Processing with Python. O'Reilly Media, 2009.
6. Indurkhya, N. and Damerau, F.: Handbook of Natural Language Processing, Second Edition. Chapman and Hall/CRC, 2010.
7. Perkins, J.: Python Text Processing with NLTK 2.0 Cookbook. Packt Publishing, Birmingham-Mumbai, 2010.
8. Liddy, E.: Natural Language Processing. In: Encyclopedia of Library and Information Science, 2nd Ed. Marcel Dekker, Inc., 2003, pp. 2126-2136.
9. Hutchins, W.: Machine Translation: A Brief History. In: Concise History of the Language Sciences: From the Sumerians to the Cognitivists, edited by E.F.K. Koerner and R.E. Asher. Oxford: Pergamon Press, 1995, pp. 431-445.
10. Tze, L.: Multilingual Lexicons for Machine Translation. ACM, December 2009, pp. 14-16.
11. Dichy, J. and Farghaly, A.: Roots & Patterns vs. Stems plus Grammar-Lexis Specifications: On What Basis Should a Multilingual Lexical Database Centred on Arabic Be Built? In: Proc. of the IXth Machine Translation Summit, Workshop on Machine Translation for Semitic Languages: Issues and Approaches, New Orleans, USA, Sept. 23, 2003.
12. Weber, G.: Top Languages: The World's 10 Most Influential Languages. Language Today, Vol. 2, Dec. 1997.
13. Black, W., Elkateb, S., Rodriguez, H., Alkhalifa, M., Vossen, P., Pease, A. and Fellbaum, C.: Introducing the Arabic WordNet Project. In: Proc. of the 3rd Global WordNet Conf., Jeju Island, Korea, 2006, pp. 295-299.
14. Elkateb, S., Black, W., Rodriguez, H., Alkhalifa, M., Vossen, P., Pease, A. and Fellbaum, C.: Building a WordNet for Arabic. In: Proc. of the Fifth International Conf. on Language Resources and Evaluation, Genoa, Italy, 2006, pp. 29-34.
15. Rodriguez, H., Farwell, D., Farreres, J., Bertran, M., Alkhalifa, M., Marti, M., Black, W., Elkateb, S., Kirk, J., Pease, A., Vossen, P. and Fellbaum, C.: Arabic WordNet: Current State and Future Extensions. In: Proc. of the Fourth Global WordNet Conf., Szeged, Hungary, Jan. 22-25, 2008.
16. Rodriguez, H., Farwell, D., Farreres, J., Bertran, M., Alkhalifa, M. and Marti, M.: Arabic WordNet: Semi-automatic Extensions using Bayesian Inference. In: Proc. of the 6th Conf. on Language Resources and Evaluation, LREC-2008, Marrakech, Morocco, May 2008.
17. Vossen, P.: WordNet, EuroWordNet and Global WordNet. Revue française de linguistique appliquée, Vol. VII, pp. 27-38, 2002.
18. Vossen, P.: Introduction to EuroWordNet. Computers and the Humanities, Kluwer Academic Publishers, 32: 73-89, 1998.
19. Choi, K.: CoreNet: Chinese-Japanese-Korean WordNet with Shared Semantic Hierarchy. In: Proc. Int. Conf. on Natural Language Processing and Knowledge Engineering, Beijing, China, Oct. 26-29, 2003, pp. 767-770.
20. Al-Ahmadi, H.: Building Arabic WordNet Semantic-Based Dictionary. Master's thesis, Computer Science Dept., King Abdul Aziz Univ., Jeddah, SA, 2010.
21. Al-Barhamtoshy, H. and Al-Jideebi, W.: Designing and Implementing Arabic WordNet Semantic-Based. In: The 9th Conference on Language Engineering, Ain Shams University, Cairo, 23-24 December 2009.
22. B'Far, R.: Mobile Computing Principles: Designing and Developing Mobile Applications with UML and XML. Cambridge Univ. Press, 2005, 861 pp.
23. Talukder, A. and Yavagal, R.: Mobile Computing: Technology, Applications & Service Creation. Tata McGraw-Hill, Jan 1, 2005, 668 pp.
24. Arokiamary, V.: Mobile Computing. Technical Publications Pune, Jan 1, 2009, 556 pp.
25. Strowd, H. and Lewis, G.: T-Check in System-of-Systems Technologies: Cloud Computing. Software Engineering Institute, Carnegie Mellon, Pittsburgh, Pennsylvania, Technical Note CMU/SEI-2010-TN-009, 2010. http://www.sei.cmu.edu/library/abstracts/reports/10tn009.cfm
26. Lewis, G.: Basics about Cloud Computing. Software Engineering Institute, Carnegie Mellon Univ., 4500 Fifth Avenue, Pittsburgh, 2010. http://www.sei.cmu.edu/library/abstracts/whitepapers/cloudcomputingbasics.cfm
27. Huth, A. and Cebula, J.: The Basics of Cloud Computing. Carnegie Mellon Univ., produced for US-CERT, a government organization, 2011.
28. The ABCs of DaaS: Enabling Data as a Service for Application Delivery, Business Intelligence, and Compliance Reporting. Revision: 19 September 2011, Delphix Corp.
29. Sardis, F., Mapp, G., Loo, J., Aiash, M. and Vinel, A.: On the Investigation of Cloud-based Mobile Media Environments with Service-Populating and QoS-aware Mechanisms. IEEE Transactions on Multimedia, Issue 99, 2013.
30. Al-Barhamtoshy, H. and Mujallid, F.: Building Mobile Dictionary System. In: The International Conference on Digital Information Processing, E-Business and Cloud Computing (DIPECC 2013), The Society of Digital Information and Wireless Communication (SDIWC), October 23-25, 2013.


Group Decision Support System Based on Enhanced AHP for Tender Evaluation
Fadhilah Ahmad 1, M Yazid M Saman 2, Fatma Susilawati Mohamad 3, Zarina Mohamad 4, Wan Suryani Wan Awang 5

1,3,4,5 Faculty of Informatics and Computing, University Sultan Zainal Abidin, Tembila Campus, Besut, 22200, Terengganu, Malaysia
E-mail: {fad,fatma,zarina,suryani}@unisza.edu.my
2 Faculty of Science and Technology, Universiti Malaysia Terengganu (UMT), 21030 Mengabang Telipot, Kuala Terengganu, Malaysia
E-mail: yazid@umt.edu.my

ABSTRACT
The application of a model base in group decision
making, which makes up a Group Decision Support
System (GDSS), is of paramount importance. The
Analytic Hierarchy Process (AHP) is a multi-criteria
decision making (MCDM) method that has been applied
in GDSS. To be used effectively in GDSS, AHP needs
to be customized so that it is more user friendly, with
ease-of-use features. In this paper, we propose an
enhanced AHP model for GDSS tendering. The enhanced
AHP method used is the Guided Ranked AHP (GRAHP).
It is a technique in which decision matrix tables are
automatically filled in based on ranked data. However,
the generated values in the decision matrix tables can
still be altered by following the guidelines, which in
turn serve the purpose of improving the consistency of
the decision matrix table. This process is transparent to
the Decision Makers because the degree of data
inconsistency is visible. A prototype system based on
the tendering process has been developed to test the
GRAHP model in terms of its applicability and
robustness.

KEYWORDS
Group Decision Support System (GDSS), Multi-Criteria
Decision Making (MCDM), Analytic Hierarchy Process
(AHP), Tender Evaluation.

1 INTRODUCTION
A decision support system (DSS) is seen as a set of
building blocks that offers the best combination of
computational power and value for money, and
significantly improves efficiency in certain
decision-making problems [1,25]. Based on these
building blocks, modern DSS applications comprise
integrated resources working together: a model base,
a database or knowledge base, algorithms, a user
interface, and the control mechanisms used to support
a certain decision problem [2].
There are many application areas suitable for DSS,
including academic advising, water resource
planning, direct mailing decisions, e-sourcing,
tendering decisions and many more. DSS has a
vast field of research scopes, categorized as model
management, design, multi-criteria decision making
(MCDM), implementation, organization science,
cognitive science, and group DSS (GDSS). DSS also
has a direct relation with Human-Computer Interaction
(HCI) and Database Management Systems (DBMS).
MCDM constitutes an advanced field of research
[21-24] dedicated to the development and
implementation of DSS tools and methodologies
to handle complex decision problems involving
multiple criteria, goals or objectives of a conflicting
nature. MCDM is broadly classified into two
categories: Multiple Attribute Decision Making
(MADM) and Multiple Objective Decision Making
(MODM) [5]. MADM methods are used for selecting
the single most preferred alternative or short-listing a
limited number of alternatives, while MODM methods
are used for designing a problem involving an infinite
number of alternatives implicitly defined by mathematical
constraints. The evaluation of a problem in DSS can
be done either by a single decision maker (DM) or by
a group of decision makers (DMs). If it involves a
single DM, the DSS is called a Single DSS (SDSS);
if a group of DMs is involved, the term group DSS
(GDSS) is used. GDSS comprises a large body of
research and remains an active area of investigation.
A GDSS in a web-based environment is a computerized
system that makes use of a model base and a
database/knowledge base, and that delivers decision
support information or decision support tools to a
group of DMs/users through a web browser such as
Netscape Navigator or Internet Explorer [3].
There is a need to consider a group of DMs in order
to improve the productivity of decision making as
well as the quality of decision results. [4] states that
groups have an advantage in combining talents
and providing innovative solutions to possibly
unfamiliar problems, because groups possess a wider
range of skills and knowledge than an individual DM.
A well-known MCDM model that has been used in
GDSS is the Analytic Hierarchy Process (AHP)
[26-28]. [18] used the Group Analytic Network Process
(GANP) to support hazard planning and emergency
management under incomplete information, showing
that both AHP and GANP have great potential to be
deployed in a specified case involving a group of
DMs. Group fuzzy prioritization processes for
AHP/ANP have also been suggested for problems that
are tentative, imprecise, approximate and uncertain in
nature [11]. A group decision approach for evaluating
educational web sites using several soft computing
technologies, e.g. fuzzy theory, grey system and a
group decision method, has been proposed by [19].
A GDSS for the evaluation of tenders of ICT equipment,
based on multi-attribute group decision models and the
WINGDSS software, was developed by [12]. The
winner of a tender is the one who makes the best offer
after the prequalification and ranking processes.
Even though many GDSS have been developed
using various model bases, none of them provides
flexibility to DMs in terms of the following:
1) giving guidelines on how to enter data into
AHP decision matrices, 2) freedom to choose
other enhanced AHP versions, and 3) transparency
in data consistency checking, all in one generic
DSS.
Consequently, we have addressed all these issues
in the research presented in this paper. A
tendering case study was employed to demonstrate
the issues of conflicting evaluation criteria in
decision making, and a model was proposed to
solve the problem. The tendering problem has a finite
number of evaluation criteria, namely experience,
technical skills, previous work performance, and a
few others. In terms of alternatives, only a limited
number of choices are taken into consideration,
since some of the alternatives have already been
discarded in the prerequisite analysis.
One simple and flexible MADM model used by
many scholars [6], [8], [9], [10] in appraisal
evaluation is the Analytic Hierarchy Process
(AHP). AHP [15] has many advantages: it is easy
to use, well accepted by decision makers, usable in
both SDSS and GDSS, and has matured through
multiple revisions.
This paper is organized as follows. Section 2
reviews related GDSS work. Section 3 describes the
Analytic Hierarchy Process. Section 4 outlines the
Guided Ranked AHP (GRAHP) model, an enhanced
version of AHP. Section 5 presents the implementation
of the model in GDSS tendering and the results of the
implementation. Finally, Section 6 summarizes the paper.
2 GDSS RELATED WORKS
There are various models that can be used in
GDSS such as AHP, Fuzzy AHP, Group Analytic
Network Process (GANP), Delphi, Maximized
Agreement Heuristic (MAH), TOPSIS, nominal
group technique (NGT), and a few others. Table 1
describes some of the research carried out on
GDSS using specific models. [18] used GANP to
support hazard planning and emergency
management under incomplete information. They
showed that GANP has great potential for use in a
specified case involving a group of DMs. If the
problem involves tentative, imprecise, approximate
and uncertain judgments, then the suggested model
base is a group fuzzy prioritization process based
on AHP or GANP.
Another GDSS approach has been studied by [11]
for journal evaluation. The method used is an
integration of a subjective approach (e.g., experts'
judgments on journals) and an objective approach
(e.g., impact factors of journals). A fuzzy set
approach is used when dealing with imprecise or
missing information.
Gwo-Jen et al. (2004) have proposed a group
decision approach for evaluating educational web
sites. Several soft computing technologies have
been employed in the approach, such as fuzzy
theory, the grey system and a group decision method.
A computer-assisted website evaluation system,
EWSE (Educational Web Site Evaluator), has
been developed based on an experimental
approach. The system is capable of selecting the
proper criteria for an individual web site and
achieves greater accuracy when evaluating the
results.
There is also work on tender evaluation focusing on
the selection of a supplier for ICT equipment
(Rapcsak et al., 2000). Two multi-attribute group
decision models, known as the criterion tree and
the weight system, have been used. The tender is
awarded to the one who makes the best offer. The
ranking of the offers is based on the price and a
multitude of criteria. The tendering process
consists of two stages: the first round is the
prequalification process, followed by the final
ranking of alternatives, accomplished by the price
adjustment method. The arithmetic mean technique
is used to aggregate individual results to form the
group result.

Table 1. Summarization of studies on GDSS using particular model bases

Year | Authors | Model | Field | Issues Addressed
2014 | Kar | Fuzzy AHP and Fuzzy Goal Programming | Selection of supplier | Use of geometric mean in Fuzzy AHP
2014 | Taylan et al. | Fuzzy AHP and Fuzzy TOPSIS | Selection of construction projects | Creating weights using Fuzzy AHP for linguistic variables
2013 | Srdjevic and Srdjevic | AHP | Selection of wetland area | AHP synthesis of the best local priority vectors based on the most consistent decision makers
2007 | Levy and Taji | GANP | Hazard planning and emergency management | GANP DSS that used quadratic programming and interval information to cope with incomplete information
2007 | Saaty and Shang | AHP | Voting | Preference intensity using a cardinal approach; several-issues-at-a-time decision making
2006 | Ratnasabapathy and Rameezdeen | Statistical and Delphi | Procurement | Four rounds of Delphi surveys, several statistical methods, and interviews
2005 | Shih, Huang and Shyur | AHP, TOPSIS, Nominal Group Technique (NGT), Borda's function | Recruitment and selection | Enhancing consensus among DMs, GDSS framework
2005 | Kengpol and Tuominen | ANP, Delphi, MAH | Evaluation of information technology | Open evaluation criteria, uncertainty and incomplete information
2005 | Limayem, Banerjee and Ma | Adaptive Structured Theory (AST), Faithfulness of Appropriation (FOA) | GDSS process enhancement | Requirement of embedded decisional guidelines, tailored training and decisional guidelines
2005 | Turban, Zhou and Ma | Fuzzy set theory | Evaluation of journals | Integration of objective and subjective judgements using a fuzzy set approach to deal with imprecise and missing information
2004 | Gwo-Jen, Tony, and Judy | Fuzzy theory, grey system, AHP | Evaluation of educational web sites | Achieving consensus in quantitative and qualitative judgments
2002 | Mikhailov | Fuzzy AHP | Model base enhancement (AHP process enhancement) | Group prioritisation process using fuzzy programming optimization to deal with missing judgements
2000 | Rapcsak, Sagi, Toth and Ketszeri | Criterion tree, weight system, voting power vector, WINGDSS software | Evaluation of tenders in information technology | Model building, GDSS system development methodology, and methods of aggregating the scores given by each DM
1996 | Tavana, Kennedy and Joglekar | AHP, Delphi, Maximized Agreement Heuristic (MAH) | Recruitment and selection | Improving consistency among DMs

3 ANALYTIC HIERARCHY PROCESS


There are several well-known numeric discrete
techniques among MADM models: the Analytic
Hierarchy Process (AHP), the Weighted Sum Model
(WSM), sometimes called the Additive Value
Function (AVF), the Weighted Product Model
(WPM), the Technique for Order Preference by
Similarity to the Ideal Solution (TOPSIS),
ELimination Et Choix Traduisant la REalite,
meaning ELimination and Choice Expressing REality
(ELECTRE), and the Preference Ranking
Organization Method for Enrichment Evaluations
(PROMETHEE).
Comparing the models in order to choose the best
method for a particular problem is not an easy task
(Triantaphyllou, 2000; Zanakis et al., 1998). Each
of them has its own strengths and weaknesses. A
study by Zanakis et al. (1998) concluded that AHP
appears to perform closest to WSM and TOPSIS.
PROMETHEE and ELECTRE behave differently
because these methods embody a different ranking
philosophy and do not assume that a unique ranking
always exists in practice. The study by Triantaphyllou
(2000) recommended that, for most cases and for
certain evaluation criteria, AHP appears to be the
best decision making method. However, based on
the literature, we found that the selection of the
model depends on the type of problem to be
solved and the nature of the criteria used for the
evaluation of alternatives.

AHP was introduced by Thomas L. Saaty in 1980
(Saaty, 1980). It is a multi-attribute decision
making methodology for choosing the best among
a set of alternatives via a pairwise comparison process.
It uses a numeric technique to help DMs choose
among a discrete set of alternative decisions. The
AHP method is based on the following principles:
i. Build a hierarchy of criteria by decomposing
the problem into a hierarchy tree. The left end
of the tree represents the goal to be achieved and
the right end represents the alternatives among
which to decide the preferred one (Figure 1);
ii. Perform a sequence of pairwise comparisons
for the criteria on the same level of the hierarchy
for each node;
iii. Perform a sequence of pairwise comparisons
on the alternatives for each criterion;
iv. Establish weightings among the elements in the
hierarchy;
v. Synthesize the results in order to obtain the
overall ranking of alternatives with respect to the
goal;
vi. Evaluate the consistency of judgment to make
sure that the original preference ratings are
consistent.
Table 2 shows the pairwise comparison scheme
proposed by Saaty. The scheme can be used to
translate linguistic judgment comparisons into
numbers, which are then inserted into the decision
matrix A.


[Figure 1 shows the AHP hierarchy: the goal on one side, linked to Criteria 1..N (with their sub-criteria), linked in turn to Alternatives A..N.]

Figure 1. Hierarchical Diagram for AHP Approach


Table 2. Pairwise comparison scale

Intensity  | Definition
1          | Equal importance of both elements
3          | Slight importance of one element over another
5          | Moderate importance of one element over another
7          | Essential or strong importance of one element over another
9          | Extreme importance of one element over another
2, 4, 6, 8 | Intermediate values between two adjacent judgments
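Where the scale has to be applied in code, one minimal sketch is a lookup table from linguistic judgment to matrix entry; the phrase keys below paraphrase the table's definitions and are not the authors' exact wording:

    # Saaty's fundamental scale from Table 2; keys paraphrase the definitions.
    SAATY_SCALE = {
        "equal": 1,     # equal importance of both elements
        "slight": 3,    # slight importance of one element over another
        "moderate": 5,  # moderate importance
        "strong": 7,    # essential or strong importance
        "extreme": 9,   # extreme importance
    }
    # Intermediate judgments map to the even values 2, 4, 6 and 8.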

The decision matrix A for pairwise comparison in
the AHP method is as follows:

          A1      A2      A3     ...   An
   A1     1       a12     a13    ...   a1n
   A2     1/a12   1       a23    ...   a2n
   A3     1/a13   1/a23   1      ...   a3n
   ...
   Am     1/a1m   1/a2m   1/a3m  ...   1        (1)

The decision matrix A is an (m x n) matrix in which
element aij is a pairwise comparison between
alternative i (row) and alternative j (column) when it
is evaluated in terms of each decision criterion. The
diagonal is always 1, since aii = 1 (the criteria or
alternatives are being compared to themselves), and
the lower triangular matrix is filled using Equation 2:

   aji = 1/aij                                   (2)
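Where an implementation is needed, a minimal numpy sketch of principles iv-vi (deriving the priority vector from such a reciprocal matrix and checking consistency) could read as follows; the function names are illustrative, and the random-index values are Saaty's published ones:

    import numpy as np

    def priority_vector(A):
        # Principal eigenvector of the reciprocal matrix, normalized to sum to 1.
        eigvals, eigvecs = np.linalg.eig(A)
        k = np.argmax(eigvals.real)
        w = np.abs(eigvecs[:, k].real)
        return w / w.sum()

    def consistency_ratio(A):
        # Assumes n >= 3; a ratio below 0.1 is commonly treated as acceptable.
        n = A.shape[0]
        lam_max = np.max(np.linalg.eigvals(A).real)
        ci = (lam_max - n) / (n - 1)  # consistency index
        ri = [0.0, 0.0, 0.58, 0.90, 1.12, 1.24, 1.32, 1.41][n - 1]  # random index
        return ci / ri if ri else 0.0

    A = np.array([[1.0, 3.0, 5.0],
                  [1/3, 1.0, 2.0],
                  [1/5, 1/2, 1.0]])
    print(priority_vector(A), consistency_ratio(A))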

AHP has gone through multiple revisions over the
years. Some work has been done on improving
AHP itself in terms of decision matrix consistency.
Saaty (2003) investigated the quintessence of the
principal eigenvector in decision making and its
influence on the judgments of the AHP decision
matrix. Previous work by Harker (1987) in this area
is referred to and evaluated together with Saaty's
work in terms of how to improve the consistency
of judgments and transform an inconsistent matrix
into a near-consistent one. However, their works
do not provide a step-by-step guideline for DMs
on how to enter consistent data into the matrix.
Hence, a guideline is needed to assist DMs in
entering consistent or near-consistent data into
the decision matrix.
4 GUIDED RANKED AHP
The Guided Ranked Analytic Hierarchy Process
(GRAHP) model is a set of guidelines [20]
synthesized with Ranked AHP (RAHP). RAHP is
a decision analysis method introduced by Othman
[13] in 2004. This method can be used to fill in
AHP decision matrices based on the priority
ranking of each element value in each criterion or
sub-criterion for each pairwise element comparison.
The elements are all ranked according to their
priority values in hierarchical form. When assigning
the value to each compared element in the AHP
decision matrices, the pairwise comparison value
can be obtained using the following rules:
Assume that Pi (i = 1, 2, ..., 9) is the rank of the i-th element:
i.  If Pi = Pj, then aij = 1;
ii. If Pi < Pj, then aij = (Pj - Pi + 1) and aji = 1/aij.
This method adheres to the input range concept
proposed by Saaty, with the highest priority
element ranked 1 and the lowest ranked 9. The
maximum comparison value assigned to elements
whose rank difference is greater than 9 (10 and
above) is still 9.
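A minimal Python sketch of this rule, with illustrative function names not taken from the paper, is:

    def rahp_value(p_i, p_j):
        # Equal ranks compare as 1; otherwise the rank difference plus one
        # gives the Saaty-scale value, capped at 9, and the mirrored entry
        # receives the reciprocal (aji = 1/aij).
        if p_i == p_j:
            return 1.0
        if p_i < p_j:  # element i has the higher priority (smaller rank)
            return float(min(p_j - p_i + 1, 9))
        return 1.0 / rahp_value(p_j, p_i)

    def rahp_matrix(ranks):
        # Build the full reciprocal decision matrix from a list of ranks.
        n = len(ranks)
        return [[rahp_value(ranks[i], ranks[j]) for j in range(n)] for i in range(n)]

For example, rahp_matrix([1, 2, 2]) yields the entries a12 = 2 and a21 = 1/2.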
RAHP was claimed to be better than Saaty's
primitive pairwise comparison method because it
is always consistent and easy to use, simply by
following the above rules. This reduces the time
needed to get results, because RAHP provides a
formula for entering data into the decision matrices.
On the other hand, the RAHP model proposed by
Othman lacks a process for converting the item
values in each criterion or sub-criterion for each
alternative into ranked values. As an enhancement,
we propose the following conversion process:
Sort the criteria values from the most to the least
important, assuming that the most important has the
bigger original value. In runnable form (ties share a
rank, and the next distinct value takes the following rank):

    def to_ranks(values):
        # Order the indices by original value, most important (largest) first.
        order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
        ranks = [0] * len(values)
        rank_value = 1
        for pos, idx in enumerate(order):
            # A strictly smaller value receives the next rank; an equal
            # value shares the rank of the previous element.
            if pos > 0 and values[idx] < values[order[pos - 1]]:
                rank_value += 1
            ranks[idx] = rank_value
        return ranks

For example, to_ranks([80, 80, 65, 90]) yields [2, 2, 3, 1].

Applying GRAHP causes the AHP decision matrices
to be filled in automatically. However, DMs can
still alter these values at their own discretion, or
they can follow the given guidelines to reduce data
inconsistency. This process is transparent to DMs,
since the degree of data inconsistency is acknowledged.
In AHP, a group prioritization process is used.
Possible approaches to estimating the weights of
elements in AHP are: agreement of each group
member on the entries of the decision matrix table,
a voting process, or aggregation of individual
evaluations via the geometric mean or the arithmetic
mean. These approaches have their own challenges.
In our case, the arithmetic mean approach was chosen
for GRAHP owing to its simplicity.
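A minimal sketch of the arithmetic-mean aggregation, with assumed function and variable names, is:

    def aggregate_weights(dm_weights):
        # dm_weights: one priority vector per DM, all of equal length.
        n_dms = len(dm_weights)
        n_items = len(dm_weights[0])
        # Group weight of each item = arithmetic mean over the DMs.
        return [sum(w[i] for w in dm_weights) / n_dms for i in range(n_items)]

For example, aggregate_weights([[0.5, 0.3, 0.2], [0.4, 0.4, 0.2]]) returns [0.45, 0.35, 0.2].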
5 IMPLEMENTATION OF GRAHP IN GDSS
TENDERING
The process flow of GDSS tender evaluation in
Malaysia is depicted in Figure 2. At the beginning of
the process, a DM (the company management/group
leader) defines the number of contractors for a certain
project and the group features. Then the DMs (either
the company management/group leader or the group
members) rank the criteria (Figure 4, step 1) and
assign the scales for the decision matrices (Figure 4,
step 2) through the form interfaces. These data are
stored in a hybrid database and in the weight and
GRAHP model bases. These are the initial inputs
accepted by the GDSS tendering system after the
client's tendering evaluation request. Guidelines
regarding the decision matrices' input scales are
displayed on top of the matrices. This information
can be used by the DMs to define the degree of
importance of the strategic-level evaluation criteria.
This is a distinguishing part of our model, since many
AHP applications do not provide such guidelines and
alert messages to ensure consistent input scales.
These features, together with the automatic filling-in
of the matrix tables, enable the DMs to focus on the
evaluation of alternatives instead of on the decision
making problems themselves. GRAHP operations are
carried out automatically by the GDSS tendering
system. These operations include the calculation of
the priority vector and the weight
[15] for each criterion and contractor. Arithmetic
means are used to combine the judgments of
individual DMs [16] into group preferences in order
to produce the final ranking. The ranking is displayed
in tabular and graphical forms (Figure 4). All the
operations of the tendering process, from the initial
input to the final ranking, involve accessing the
database and various model bases, including the
statistic, weight and GRAHP bases. This is another
unique feature of our model: data about contractors
are stored in the tendering database, and the model
base operation results are stored in the specific model
base repositories. Hence, the properly structured data
in a few types of categories (database and model
bases) will ease the maintenance and programming
aspects, compared to keeping the data in an
unstructured way.
6 SUMMARY
This paper has discussed the GRAHP model as an
enhancement of AHP in handling data consistency.
Our findings suggest that GRAHP enables DMs to
be more intuitive in their decision making processes.
Furthermore, GRAHP guides the DMs in selecting
suitable input scales for the decision matrix tables.
In terms of system design and development, we
have produced flexible and user friendly interfaces.
At the top of each GRAHP model form there is a
brief description of the AHP scales. DMs can easily
select the AHP scales using the drop-down menus
provided in the forms. There is also a set of
guidelines on how to choose the scale values to
reduce the inconsistency of data entry if DMs are
not satisfied with the calculated input values
produced by the system. The use of the GRAHP
approach simplifies the evaluation process because,
most of the time, the DMs will not be bothered by
inconsistent data in the matrices. The DMs will be
alerted with warning messages if the problem
persists. The degree of inconsistency of the data is
also displayed, enabling the values in the decision
matrix tables to be re-adjusted and assisting the
DMs with re-evaluation.
7 ACKNOWLEDGEMENT
The authors are very grateful to the Ministry of
Higher Education Malaysia and University Sultan
Zainal Abidin, Malaysia for the grants and support.

REFERENCES
1. Kameshwaran, S. and Narahari, Y., 2003. e-Procurement Using Goal Programming. E-Commerce and Web Technologies Proceedings, Lecture Notes in Computer Science, Springer-Verlag, pp. 6-15.
2. Barkhi, R., Rolland, E., Butler, J. and Fan, W., 2005. Decision Support System Induced Guidance for Model Formulation and Solution. Decision Support Systems, 40(2): 269-281.
3. Power, D.J. and Sharda, R., 2005. Model-Driven Decision Support Systems: Concepts and Research Direction. Decision Support Systems.
4. Elfvengren, K., Karkkainen, H., Torkkeli, M. and Tuominen, M., 2002. A GDSS Based Approach for the Assessment of Customers' Needs in Industrial Markets. Proceedings of the 12th International Working Seminar on Production Economics, February 18-22, 2002, Igls/Innsbruck, Austria.
5. Yoon, K. and Hwang, C., 1995. Multiple Attribute Decision Making. A Sage University Paper.
6. Yi-mei, T., Yan-jie and Mi-fang, W., 2007. Evaluation and Optimization of Secondary Water Supply System Renovation. Journal of Zhejiang University SCIENCE A, Vol. 8, No. 9, pp. 1488-1494.
7. Bhargava, H.K. and Power, D.J. Decision Support Systems and Web Technologies: A Status Report. http://dssresources.com/papers/dsstrackoverview.pdf. Accessed on 3 January 2007.
8. Chen, C., 2006. Applying the Analytical Hierarchy Process (AHP) Approach to Convention Site Selection. Journal of Travel Research, Vol. 45, No. 2, pp. 167-174.
9. Bertolini, M., Braglia, M. and Carmignani, G., 2006. Application of the AHP Methodology in Making a Proposal for a Public Work Contract. International Journal of Project Management, 24(2006): 422-430.
10. Partovi, F.Y., 1992. Determining What to Benchmark: An Analytic Hierarchy Process Approach. International Journal of Operations & Production Management, 14(6): 25-39.
11. Turban, E., Jay, E.A. and Ting-Peng, L., 2005. Decision Support Systems and Intelligent Systems. United States: Pearson Prentice Hall.
12. Rapcsak, T., Sagi, Z., Toth, T. and Ketszeri, L., 2000. Evaluation of Tenders in Information Technology. Decision Support Systems, 30(2000): 1-10.
13. Othman, A., 2004. Pemilihan Bank dari Sudut Servis Menggunakan Kaedah Proses Hierarki Analisis (Bank Selection from a Service Perspective Using the Analytic Hierarchy Process Method). Master's Thesis.
14. Sakamon, A.B., 2006. Manual Pakej Penilaian Tender (Tender Evaluation Package Manual). Cawangan Kontrak dan Ukur Bahan, Ibu Pejabat JKR Malaysia.

15. Saaty, T.L., 1990. Decision Making for Leaders. RWS Publications, Pittsburgh.
16. Aczel and Saaty, 1983, in Assessing the Value of E-Learning Systems by Yair Levy (2006). Google Book Result. http://books.google.com/books. Accessed January 2007.
17. Kengpol, A. and Tuominen, M., 2005. A Framework for Group Decision Support Systems: An Application in the Evaluation of Information Technology for Logistics Firms. Int. J. Production Economics, 101(2006): 159-171.
18. Levy, J.K. and Taji, K., 2007. Group Decision Support System for Hazards Planning and Emergency Management: A Group Analytical Network Process (GANP) Approach. Mathematical and Computer Modelling.
19. Gwo-Jen, H., Tony, H.C.K. and Judy, T.C.R., 2004. A Group-Decision Approach for Evaluating Educational Web Sites. Elsevier.
20. Fadhilah Ahmad, M Yazid M Saman, N.M. Mohamad Noor and Aida Othman, 2007. DSS for Tendering Process: Integrating Statistical Single-Criteria Model with MCDM Models. The 7th IEEE International Symposium on Signal Processing and Information Technology, Cairo, Egypt.
21. Fadhilah Ahmad, Suhailan Safei, Md Yazid Mohd Saman and Hasni Hassan, 2010. Integrated Decision Support System Using Instant Messaging and Enhanced AHP for Human Resource Selection. The 1st International Symposium on Computing in Science & Engineering (ISCSE), June 3-5, 2010, Kusadasi, Aydin, Turkey.
22. Fadhilah Ahmad, M Yazid M Saman, N.M. Mohamad Noor and Aida Othman, 2007. DSS for Tendering Process: Integrating Statistical Single-Criteria Model with MCDM Models. The 7th IEEE International Symposium on Signal Processing and Information Technology, Cairo, Egypt.
23. Moreira, L.O., Sousa, F.R.C. and Machado, J.C., 2011. A Distributed Concurrency Control Mechanism for XML Data. Journal of Computer and System Sciences.
24. Celik, M., Kandakoglu, A. and Er, D., 2009. Structuring Fuzzy Integrated Multi-Stages Evaluation Model on Academic Personnel Recruitment in MET Institutions. Expert Systems with Applications, 36(3, Part 2): 6918-6927.
25. Siskos, E., Askounis, D. and Psarras, J., 2014. Multicriteria Decision Support for Global e-Government Evaluation. Omega, 46: 51-63.
26. Srdjevic, B. and Srdjevic, Z., 2013. Synthesis of Individual Best Local Priority Vectors in AHP-Group Decision Making. Applied Soft Computing, 13: 2045-2056.
27. Tavana, M. and Hatami-Marbini, A., 2011. A Group AHP-TOPSIS Framework for Human Spaceflight Mission Planning at NASA. Expert Systems with Applications, 38: 13588-13603.
28. Dong, Y., Zhang, G., Hong, W.-C. and Xu, Y., 2010. Consensus Models for AHP Group Decision Making under Row Geometric Mean Prioritization Method. Decision Support Systems, 49: 281-289.
29. Kar, A.K., 2014. Revisiting the Supplier Selection Problem: An Integrated Approach for Group Decision Support. Expert Systems with Applications, 41: 2762-2771.
30. Srdjevic, B. and Srdjevic, Z., 2013. Synthesis of Individual Best Local Priority Vectors in AHP-Group Decision Making. Applied Soft Computing, 13: 2045-2056.
31. Taylan, O., Bafail, A.O., Abdulaal, R.M.S. and Kabli, M.R., 2014. Construction Projects Selection and Risk Assessment by Fuzzy AHP and Fuzzy TOPSIS Methodologies. Applied Soft Computing, 17: 105-116.
32. Saaty, T.L., Peniwati, K. and Shang, J.S., 2007. The Analytic Hierarchy Process and Human Resource Allocation: Half the Story. Mathematical and Computer Modelling.
33. Ratnasabapathy, S. and Rameezdeen, R., 2006. A Multiple Decisive Factor Model for Construction Procurement System Selection. Proceedings of the Annual Research Conference of the Royal Institution of Chartered Surveyors. The RICS Publisher.
34. Tavana, M., Kennedy, D.T. and Joglekar, P., 1996. A Group Decision Support Framework for Consensus Ranking of Technical Managers. Int. J. Management Science, 24(5): 523-538.


[Figure 2 flowchart summary: an admin request selects the project and determines the number of contractors for evaluation, the number of DMs, and the DM/group leader; either the group leader or each DM in the group ranks the criteria and assigns scales to the GRAHP model; the integrated model operations calculate the priority vector and weight for each criterion and tenderer, and arithmetic means integrate the weights to produce and display the ranking of contractors, drawing on the statistical model, weighted model, database and GRAHP model bases.]

Figure 2. The Process Flow of GDSS Tendering using GRAHP Model


Figure 3. Group Characteristics of GDSS Tendering


[Figure 4 annotations: Step 1 assigns weights to the evaluation criteria; Step 2 evaluates the alternatives. The original values of the experience criterion for each tenderer are converted automatically into rank values; all values in the decision matrix are entered automatically using the RAHP technique; the DM can re-judge these values if the automatic final ranking produced is unsatisfactory; and the results are displayed in tabular and graphical forms.]

Figure 4. A two-step process for GRAHP


Building an Advance Domain Ontology Model of Information Science (OIS)


Ahlam F. Sawsaa & Joan Lu
School of Computing & Engineering
University of Huddersfield, UK
swsa2004@yahoo.com & J.lu@hud.ac.uk

ABSTRACT
The paper describes the process of modelling the domain
knowledge of Information Science (IS) by creating an
Ontology of the Information Science domain (OIS). It also
reports on the life cycle of the ontology building
process using Methontology, based on the IEEE
standard for the software development life cycle process,
which mainly consists of: specification,
conceptualization, formalization, implementation,
maintenance and evaluation. The information resources
used in acquisition and evaluation have been obtained
from Information Science. The conceptualization
consists of identifying IS concepts and grouping them
into a hierarchy tree based on a faceted classification
scheme. The OIS ontology is formalized by using the
ontology editor Protégé to generate the ontology code.
The achieved result is the OIS ontology, which has fourteen
facets: Actors, Method, Practice, Studies, Mediator,
Kinds, Domains, Resources, Legislation, Philosophy &
Theories, Societal, Tool, Time and Space. The model is
evaluated using ontology quality criteria to check the
ontology's usefulness, and how it could be transferred
into an application ontology for Information Science
education.

KEYWORDS
Ontology, Knowledge Representation, Semantic Web, Information Science, Web Ontology Language, Protégé.

1. INTRODUCTION
In recent years ontology has gained attention in
both academic and industrial fields. The word
ontology has been defined in different ways; it was
originally taken from philosophy, where it means
the basic characteristics of existence in the world.
Ontology is applied in various domains, such as
medicine, movies, cooking and management, to
provide a formal model that structures knowledge.
The Information Science (IS) domain has appeared as
an interesting research area because it is a
multidisciplinary science emerging from library
science, documentation and computer science.
Inconsistency in the structure of the IS domain has led
to the lack of a unified model of domain knowledge.
This lack makes data difficult to use and share at the
syntax and semantic levels.
Many technologies offer a good solution for data
sharing at the syntax level, for instance XML, but
they cannot solve the problem at the semantic level.
Ontology offers a good solution for data use and
sharing at the semantic level. Ontology is a modelling
tool that provides a formal description of concepts
and their relations as a foundation for semantic
integration and interoperability.
The lack of domain ontologies in computer-based
applications has led to a loss of knowledge in
specific domains. In this sense, the problem is vital
for scholars and researchers, who need to access
information in efficient ways to meet their interests.
The problem has been defined as one requiring an
artifact for its solution. Ontologies can lead to
solutions to this problem because they give some
notion of the meaning of terms. They have the
potential to overcome the problem and make the
conceptualization of the Information Science domain
explicit and understandable.
Information Science, as an interdisciplinary science,
needs to be defined; hence it became necessary to
develop the OIS ontology to represent the domain
knowledge. The ontology of Information Science is
discussed in this paper.
The goal of the paper is to study the terminology
of Information Science in order to create a domain
ontology. Many ontologies have been created and
published; however, an OIS ontology has been
missing. The OIS ontology is a new research
direction in the IS field. This study is devoted to
clarifying the basic concepts and framework of IS,
in order to develop a taxonomy of the IS domain
model. It presents a formal semantic explanation for
IS meta-data. This paper is organized as follows: in
section 2 we discuss the theoretical foundation of
ontology. In section 3 we discuss the method of
building the IS ontology and how it has been
constructed. Section 4 presents the development and
implementation of the ontology of Information
Science, followed by discussion and evaluation in
section 5. Finally, the conclusion and future work
are presented.
2. BACKGROUND
2.1. Ontology Definition
Ontology has different definitions in the available
literature [1,2,3]. Basically, ontologies are used in
different communities. Ontology emerged from
the philosophical field as an area of study
introduced by Aristotle. In recent years the term
has been borrowed by the computer science
community and used to represent the knowledge
required to understand the real world. Developers
have been building a conceptual base for
constructing knowledge components that are
reusable and sharable.
So, ontology has been defined from different
perspectives. The philosophical perspective
defines ontology as "the science or study of
being" [4], while the Artificial Intelligence (AI)
community defined it in 1991 as follows:
"Ontology defines the basic terms and relations
comprising the vocabulary of a topic area as well
as the rules for combining terms and relations to
define extensions to the vocabulary." [5]. This is a
brief definition, which indicates that ontology
provides definitions of terms that are explicitly
defined, together with the relations and rules that
unite them; but ontology is more than that: it can
provide inferred new terms using the rules. In
1993 Gruber defined ontology as: "An ontology is
a specification of a conceptualization." [6].
His definition was later refined into a more
accurate one: a "formal explicit specification of a
shared conceptualization". The definition can be
explained as follows:
Formal: the ontology should be machine readable
and processable by an Artificial Intelligence (AI)
system. It should not be merely a communication
device between people, or even between people
and machines; ontology should be formally
defined using a formal language [7].
Specification: means writing specifications in a
language syntax that satisfies certain criteria such
as precision, unambiguity, consistency,
completeness and implementation-independence
[8]; it should provide a communication device
enabling users to share knowledge in a consensual
mode.
Shared: means the ontology represents consensual
knowledge that has been arranged and agreed upon
by groups, typically as the result of a social
network rather than an individual's view.
Conceptualisation: an abstract model of domain
knowledge, driven by the application for users,
and representing the way it is committed to by
knowledge-based systems.
Based on the above we can formulate a definition
we can work with: an ontology should be formally
defined so that it can be processed by machine. The
ontology is a specific type of information object or
artifact. Ontology construction refers to clear
classes, relations and their instances, which play
the role of an explicit specification of a
conceptualisation. In other words, the backbone of
the ontology is composed of the specification of
concepts. However, an ontology is not software
and cannot run as a program, but it can be used by
programs.
A far more interesting question is what information
systems could learn from philosophical ontology.
It is a shared belief that there is an inherent
similarity between ontology from the philosophical
and the applied scientific perspectives.
Philosophical ontology describes the real world as
it exists, while computational ontology describes
the world as it should be [9].
According to Gruber's definition (1995), the OIS
ontology is the formal explanation of the shared
conceptualization of the IS domain. The concepts
in IS are represented by the ontology model. More
interestingly, the IS knowledge will be
conceptualized by defining classes and certain
relationships, to make it machine readable [10]. In
this paper we focus on the conceptual ontology

that is being used in the semantic web. The aims of
this study are:
1. Providing a visualisation of the Information
Science area.
2. Sharing a common understanding of Information
Science theory.
3. Describing the terminology of a conceptual
model of Information Science by describing the
concepts, their instances and their properties [11].
2.2. Domain Ontologies
The number of studies of ontology has been
growing rapidly in recent years. Gartner points out
that the integration of the semantic web could have
the greatest impact on technologies in the next few
years. Ontology is used as a basis for enabling
interoperability through the semantic web [12].
Bhatt presents an approach for extracting
sub-ontologies to meet user needs, based on the
Unified Medical Language System (UMLS), by
designing OntoMove to exploit the semantic web;
it uses the RDFS and OWL languages [13].
OntoCAPE is a large-scale ontology for chemical
processes for use in the industrial field [14]. Du et
al. have proposed OntoSpider, a novel approach
for extracting an ontology from HTML web pages;
nevertheless, lexical semantics and natural
language have a negative effect on the result,
owing to differing outcomes when words or links
are missed [15], [16], [17].
Domain ontology plays a vital role by defining
terms which can be used as meta-data. Sabou's
work concerns creating ontology from an OWL-S
file for describing a web service [18], particularly
in specific domains such as biomedical ontologies,
which play a fundamental role in accessing
heterogeneous sources of medical information and
in using and sharing patients' data. GALEN
(Generalised Architecture for Languages,
Encyclopaedias and Nomenclatures) provides
reusable terminology resources for clinical
systems. It contains 25,000 concepts used to
represent a complex structure of descriptions of
medical procedures [19]. Furthermore, GENE
ontology (GO) was developed by the National
Human Genome Research Institute in 1998. It
presents a controlled vocabulary of gene and gene
product attributes. It contains 30,000 concepts
and is organized as follows: biological process,
molecular function, and cellular component. The
GO ontology is regularly updated and is available
in several formats [20], [21], [22]. The Systematized
Nomenclature of Medicine - Clinical Terms
(SNOMED) is an ontology for health care
terminology. It contains 350,000 terms that
represent clinical meanings. Each concept has a
number, an ID and a fully specified name (FSN).
SNOMED has the ability to automate functions
related to medical record administration and to
facilitate data collection for research purposes
(Jepsen, 2009). The Toronto Virtual Enterprise
(TOVE) ontology was developed in the Enterprise
Integration Laboratory at the University of Toronto.
It provides a shared terminology that can be
understood and shared between commercial and
public enterprises. TOVE is implemented in C++,
with Prolog for the axioms. It covers activities,
time, parts and resources [23]. An economic
ontology has been constructed to define the
economic domain from economic documents. It
uses the OntoGen tool to semi-automatically
construct the ontology, based on machine learning
methods [24].
2.3. Methodology Employed
2.3.1. Theoretical Bases
The nature of the ontology is a concept model. The
concept model represents the relationships of
concepts within the domain; to gain a better
understanding of OIS ontology development and
its role in the semantic web, a framework is
established to describe the main theoretical base.
The methodology is based on Category Theory,
which is a foundational theory for mathematics. A
number of thinkers, such as Hartmann and Husserl,
asserted that ontology relies on category theory
[25], [26]. Thus, an ontology is a model of a domain:

   O = {C, R, A}

where C is a set of concepts; R ⊆ C × C is a set of
relations, so that for a relation r = (C1, C2) ∈ R we
may write R(C1) = C2; and A is a set of axioms on
O [27].

Here, concepts are the set of entities within a domain; relations are interactions between concepts in the domain; axioms are explicit rules that constrain the use of concepts; and instances are concrete examples of concepts in the domain [28], [29], [30].
2.3.2. Techniques and Tools
2.3.2.1. Web Ontology Language (OWL)
OWL is designed to represent information about objects and how these objects are organised and interrelated within specific domains. OWL is derived from description logic, which aims to bring reasoning and expressive power to the semantic web.
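As an illustration of how the formal model O = {C, R, A} from Section 2.3.1 maps onto OWL constructs, the following minimal fragment is a sketch (the class and property names here are invented for the example, not taken from the OIS file): concepts in C become owl:Class declarations, relations in R become owl:ObjectProperty declarations, and axioms in A become statements such as rdfs:subClassOf.

<!-- C: concepts declared as OWL classes -->
<owl:Class rdf:about="#Library"/>
<owl:Class rdf:about="#Resource"/>
<!-- R: a relation r = (Library, Resource), declared as an object property -->
<owl:ObjectProperty rdf:about="#holds">
  <rdfs:domain rdf:resource="#Library"/>
  <rdfs:range rdf:resource="#Resource"/>
</owl:ObjectProperty>
<!-- A: an axiom constraining the use of the concepts -->
<owl:Class rdf:about="#AcademicLibrary">
  <rdfs:subClassOf rdf:resource="#Library"/>
</owl:Class>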
2.3.2.2. Protégé
Protégé is an open source tool that was developed at Stanford University by Stanford Medical Informatics. The core of Protégé is an ontology editor, which provides a suite of tools to construct domain models using various formats. It can also be extended by using plug-ins to add more functions, such as importing and exporting ontology languages (XML, OIL, F-Logic). The Protégé platform supports two ways of modelling ontologies.
Building the Information Science ontology (OIS) follows Methontology, based on IEEE standard criteria to design an ontology life cycle process. IEEE 1074-2006 is a standard for developing a software project life cycle process [31], [32]; the methodology is used to capture the domain knowledge and to establish the creation of the glossary of domain knowledge [33].

Methontology is the methodology chosen to develop the Information Science ontology (OIS). This methodology uses an iterative approach which allows us to refine the ontology to create a more accurate model of the IS domain. The OIS ontology methodology is constructed as follows:
1. Building the conceptual model, which is established by:
a. Determining the domain's scope, interest, goals, strategy and boundary, which requires answering the following questions:
Q1. What are the general characteristics of the ontology of IS?
Q2. What is the purpose of the ontology of IS?
Q3. Will it cover the general domain or a specific one?
Q4. What about its size and the formalism used?
Q5. Does it use formal axioms or first-order logic?
To answer these questions we should describe the contents of the ontology. These contents include: the taxonomic structure, the concepts it will cover, the top-level division, and the internal structure of the concepts.
b. Acquiring domain knowledge and developing the glossary that contains the key concepts in the field. This requires the integration of all relevant terms, including concepts, instances, attributes and relations.
c. Creating a concepts dictionary to identify the terminological concepts and relations.
d. Modelling concepts in a hierarchical taxonomy with their relations (subclasses, superclasses).
2. Converting the conceptual model to a computational model, which starts from:
a. Formalising the ontology in the Protégé ontology editor.
b. Evaluation and maintenance of the computational model.
c. Documentation of the ontology life cycle, as shown diagrammatically in Figure 1.

The OIS ontology is encoded with the Protégé editor to formalize the OIS, due to the fact that Protégé provides a plug-and-play environment for developing prototypes and applications. Furthermore, an ontology in Protégé can be exported to different formats, as seen in List 1.

Figure 1 Methods of creating OIS (steps shown: determine domain scope and interest; acquire domain resources and build the domain glossary; model the concepts hierarchy, taxonomy and their relations; formalize the ontology by OWL; consistency checking with FaCT++).

3. IMPLEMENTATION
3.1. Conceptual Model of OIS Ontology
The conceptual model is reflected in the computational model, and it can serve as a communication device for experts in the domain. It shows the entity classes, attributes and their relationships within the OIS ontology. We develop the main relationships among the defined classes. The conceptual model is the base of the domain ontology, which helps to build the OIS ontology. Figure 2 shows the conceptual model of the domain ontology.

List (1) OIS ontology in OWL.

<rdf:RDF
  xmlns="http://www.semanticweb.org/ontologies/2011/1/Ontology1298894565306.owl#"
  xml:base="http://www.semanticweb.org/ontologies/2011/1/Ontology1298894565306.owl"
  xmlns:dc="http://purl.org/dc/elements/1.1/"
  xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
  xmlns:owl2xml="http://www.w3.org/2006/12/owl2xml#"
  xmlns:owl="http://www.w3.org/2002/07/owl#"
  xmlns:xsd="http://www.w3.org/2001/XMLSchema#"
  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  xmlns:Philosophy="&Ontology1298894565306;Philosophy"
  xmlns:Ontology1298894565306="http://www.semanticweb.org/ontologies/2011/1/Ontology1298894565306.owl#">
  <owl:Ontology rdf:about="http://www.semanticweb.org/ontologies/2011/1/Ontology1298894565306.owl#">
    <rdfs:comment>Information Science ontology that describes the domain of IS.</rdfs:comment>
    <dc:creator xml:lang="en">Ahlam Sawsaa 2011.</dc:creator>
  </owl:Ontology>
</rdf:RDF>
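The export to different formats can be illustrated with the same ontology header. As a sketch (assuming the standard prefixes used above), a Turtle serialization of List 1 would look roughly as follows:

@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix dc: <http://purl.org/dc/elements/1.1/> .

# The same ontology header as List 1, in Turtle rather than RDF/XML
<http://www.semanticweb.org/ontologies/2011/1/Ontology1298894565306.owl#>
    a owl:Ontology ;
    rdfs:comment "Information Science ontology that describes the domain of IS." ;
    dc:creator "Ahlam Sawsaa 2011."@en .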

The root class in OWL is owl:Thing, which is the root of all classes, just as rdfs:Resource is the root in RDF. The list below displays a simple hierarchy of the main classes of the OIS ontology in the OWL language, as also shown in Figure 2.
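As an illustrative sketch of such a hierarchy (not the actual OIS assertions), the upper-level class Mediator and two of the library classes named in Section 5 could be expressed in OWL as follows:

<!-- A chain of rdfs:subClassOf assertions builds the hierarchy under owl:Thing -->
<owl:Class rdf:about="#Mediator"/>
<owl:Class rdf:about="#Library">
  <rdfs:subClassOf rdf:resource="#Mediator"/>
</owl:Class>
<owl:Class rdf:about="#LibraryAcademic">
  <rdfs:subClassOf rdf:resource="#Library"/>
</owl:Class>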

3.2. Computational Model of OIS Ontology
The OIS ontology is structured in natural language to be suitable for data modelling and knowledge representation. It is intended to express an unambiguous and complete specification of the domain concepts. It provides a dictionary of concepts with the relations between them, and organises them into super-types and sub-types of the hierarchy.
Furthermore, the current version is defined by a large number of classes (about 687) and consists of approximately 170 assertions, including more than 67 rules and relations, which demonstrate the rich semantic expressive capability of the language.

Upper-level classes: the upper level of classes was created based on a taxonomy of IS. The OIS ontology is basically organized into several classes that correspond to the different kinds of things that describe the science. The first layer is a meta-class level with the concepts Actors, Domains, Kinds, Practice, Studies, Mediator, Method, Resources, Legislation, Philosophy & Theories, Tools, Societal, and Time & Space, as shown in Figure 4. Each subclass is grouped under a main upper class: for example, Education of Information Science, Education of Computer Science and Education of Library Science are all grouped under the Education class.
4. EVALUATION
Ontology evaluation takes into consideration whatever guarantees the stability and accuracy of the ontology. Evaluating the ontology avoids concept duplication, excessiveness and inconsistent relationships, and so creates a better understanding. In this study the evaluation process is based on interim and completion evaluation. Evaluation is used at the development stage to improve the design and implementation of the project. The OIS ontology was evaluated during the development process to ensure its completeness and consistency of meaning.

Figure 2 Ontology of OIS


The OIS was evaluated by the domain's experts to identify their level of satisfaction, based on predefined criteria. The first criterion was ontology consistency: 64% of respondents indicated satisfaction level 3, and others expressed levels 2 and 4 (20% and 12% respectively). The second criterion was consistency of is-a and part-of relationships: fourteen of the participants (56%) indicated their satisfaction with the consistency of ontology relations at level 3, while six of them (24%) pointed to level 2. For the third criterion, the majority of participants (48%) identified level 3 to indicate their satisfaction when assessing the completeness of the OIS ontology, in comparison with levels 1 and 5. The fourth criterion was the clarity of the OIS ontology, as illustrated in Table 1.

263

International Journal of Digital Information and Wireless Communications (IJDIWC) 4(2): 258-266
The Society of Digital Information and Wireless Communications, 2014 (ISSN: 2225-658X)
Table 1 Evaluation criteria of OIS ontology

Ontology Criteria                                Percentage
Consistency of ontology                          64%
Consistency of is_a and part_of relationships    56%
Completeness                                     48%
Clarity                                          40%
Generality                                       44%
Semantic data richness                           48%

FaCT++ was implemented to ensure that the ontology syntax was error-free. If a class is incorrectly classified it will appear in red, under a root class called Nothing; e.g. the classes Analytics, ArchitectureLibrary, Dissemination and DocumentationCenter appeared inconsistent in the class category. They appeared as main classes organised under the main root (Nothing).
The subclasses ElectronicDocumentdelivery and Information Diffusion are classified by the reasoner under Domain, while they are subclasses of Information Service, which is structured under the Practice class.
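For illustration, the following sketch shows the kind of modelling error that makes a reasoner such as FaCT++ reclassify a class under Nothing (the disjointness assertion here is invented for the example; the class names are taken from the text above): a class asserted under two disjoint parents is unsatisfiable, so the reasoner files it under owl:Nothing.

<owl:Class rdf:about="#Domain"/>
<owl:Class rdf:about="#Practice">
  <owl:disjointWith rdf:resource="#Domain"/>
</owl:Class>
<!-- Unsatisfiable: a subclass of two disjoint classes is reclassified under owl:Nothing -->
<owl:Class rdf:about="#Dissemination">
  <rdfs:subClassOf rdf:resource="#Domain"/>
  <rdfs:subClassOf rdf:resource="#Practice"/>
</owl:Class>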
The reasoner also ensures there are no confounding or contradictory concepts, and that terms have consistency of meaning and clarity. Ontology should provide mapping according to the meaning of its contents. In addition, the consistency and syntax of the generated OWL file can be verified using an OWL ontology validator; the OIS ontology was verified with OWL validation as well, for further testing and validation. Once the ontology was uploaded to the validator, the abstract syntax (OWL Full) form said 'Yes: Why', which means the ontology succeeded and the results are good. Figure 3 shows a segment of the verification results.

Figure 3 OWL validation

5. DISCUSSION AND ANALYSIS


The OIS ontology that represents the domain knowledge is introduced in this paper. It enables us to understand the domain knowledge and the conceptual relationships between its branches. The theoretical base of the model is a faceted analytic-synthetic system. The model is structured around the domain conceptualization based on Methontology. The OIS is structured from 14 meta-classes, based on facet classification, to define the key elements of the domain and possibly to be linked with other domains. This structure can be used for structuring IS and organising the sub-classes in the domain. For example, the meta-class Mediator structures all types of mediator in the IS domain, such as Archives, Libraries, Media Centres, Documentation Centres, Information Centres, Museums and Websites. Meanwhile, the class Library could be extended to offer sub-classes such as Library Academic, University Library, College Library, Higher Education Institutions, Department Library, Library International and Library School.

The current version of the OIS ontology contains 687 classes (about 44%) and 700 subclasses (about 45%). In addition, we note that classes and subclasses feature in the OIS more than other components: data properties account for 1% and object properties for 4%, while individuals make up 6%. This is because this model is a generic model that structures the IS domain as the base of application ontologies that will be developed. The model has data properties that indicate the semantic relations between classes and subclasses. The model has different relationships (object properties) such as hierarchical relationships (isPartOf, isA), inverse relationships (hasA, hasPart), and equivalency and associative relationships. These relationships represent the core relations between the concepts. In comparison to other relations, the hierarchical relations were used more than their functional equivalents, while the transitive relations were used less than the others. We describe some classes of the OIS ontology to clarify some of these relationships.
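As a sketch of how such relationships can be declared in OWL (the exact declarations in the OIS file may differ), an inverse pair is stated with owl:inverseOf and a transitive hierarchical relation with owl:TransitiveProperty:

<!-- hasPart is declared as the inverse of isPartOf; isPartOf is transitive -->
<owl:TransitiveProperty rdf:about="#isPartOf">
  <owl:inverseOf rdf:resource="#hasPart"/>
</owl:TransitiveProperty>
<owl:ObjectProperty rdf:about="#hasPart"/>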
6. CONCLUSION
6.1. Achievements
The following are the main achievements
presented through this study:
- Constructing an Ontology of Information
Science (OIS), and methods of building it.
- Demonstrating the strategy of building and designing the conceptual model in the domain using ontology techniques. The ontology of information science will help to identify the features of this science, which mainly consists of overlapping sets of sciences that make it difficult to determine its boundaries.
- The resulting ontology covers three main areas of the domain knowledge: library science, archival science and computing science. The vocabularies of these branches are formalized in a class hierarchy with relations which interconnect concepts from all these areas, in order to define a sufficient model of the Information Science domain.
- The phases of the methodology were specification, knowledge acquisition, conceptualization, formalization and evaluation, all of which are essential in order to attain the results. The domain knowledge was formalized using the Protégé ontology editor, which can also be used to automatically generate the ontology code. The ontology was evaluated and validated using the FaCT++ reasoner; the evaluation report was used to check that the ontology was consistent and satisfied needs.
6.2. Future Works
Reusing, sharing and maintaining the ontology are future issues that need to be considered. In the OIS module there is always space for improvement, at the least by adding new or missing concepts and adding new classifications based on different criteria and perspectives. Although most Information Science concepts were considered, there are concepts that still need to be added. Another, more interesting, possibility would be to link this general model with others related to the domain, in order to integrate it with other ontologies in an ontology library for use in specific applications.
7. REFERENCES
1. NOY, N. F. & MCGUINNESS, D. L. Ontology Development 101: A Guide to Creating Your First Ontology. Protégé 2000 (2001).
2. FENSEL, D. Ontologies: Silver Bullet for Knowledge Management and Electronic Commerce. Springer (2001).
3. STARLAB. Systems Technology and Application Research Laboratory home page. Faculty of Science, Department of Computer Science, Vrije Universiteit Brussel (2003).
4. WEBSTER. Definition of Knowledge. Webster's Third New International Dictionary. http://www.merriam-webster.com/dictionary/knowledge?show=0&t=1316553888 (2011).
5. NECHES, R., FIKES, R., FININ, T., GRUBER, T., PATIL, R., SENATOR, T. & SWARTOUT, W. Enabling Technology for Knowledge Sharing. AI Magazine, 12, 36-56 (1991).
6. GRUBER, T. R. A translation approach to portable ontology specifications. Knowledge Acquisition, 5, 199-220 (1993).
7. MORBACH, J., WIESNER, A. & MARQUARDT, W. OntoCAPE: a large-scale ontology for chemical process engineering. Engineering Applications of Artificial Intelligence, 20, 147-161 (2009).
8. TURNER, J. G. & MCCLUSKEY, T. L. The Construction of Formal Specifications: An Introduction to the Model-Based and Algebraic Approaches. McGraw-Hill, England (1994).
9. KABILAN, V. Ontology for Information Systems (O4IS) Design Methodology: Conceptualizing, Designing and Representing Domain Ontologies. School of Information and Communication Technology, Department of Computer and Systems Sciences, The Royal Institute of Technology (2007).
10. GRUBER, T. R. Towards principles for the design of ontologies used for knowledge sharing. International Journal of Human-Computer Studies, 43, 907-928 (1995).
11. SAWSAA, A. & LU, J. Ontology of Information Science Based On OWL for the Semantic Web. In: International Arab Conference on Information Technology (ACIT'2010). University of Garyounis, Benghazi, Libya (2010).
12. GARTNER. Emerging technologies hype cycle highlights key technology themes (2006).
13. BHATT, M., RAHAYU, W., SONI, S. P. & WOUTERS, C. Ontology driven semantic profiling and retrieval in medical information systems. Web Semantics: Science, Services and Agents on the World Wide Web, 7, 317-331 (2009).
14. TENNANT, N. Parts, Classes and Parts of Classes: An Anti-Realist Reading of Lewisian Mereology. The SAC conference on David Lewis's contributions to formal philosophy, September 2007, Copenhagen (2007).
15. DU, T. C., LI, F. & KING, I. Managing knowledge on the Web: Extracting ontology from HTML Web. Decision Support Systems, 47, 319-331 (2009).
16. BATISTA, F., RIBEIRO, R., PARDAL, J. P., MAMEDE, N. J. & PINTO, H. S. Ontology construction: cooking domain. Artificial Intelligence: Methodology, Systems, and Applications, 41, 1-30 (2006).
17. BRUSA, G., CALIUSCO, M. L. & CHIOTTI, O. A Process for Building a Domain Ontology: an Experience in Developing a Government Budgetary Ontology. Conferences in Research and Practice in Information Technology, Hobart, Australia (2006).
18. ALANI, H., O'HARA, K. & SHADBOLT, N. ONTOCOPI: Methods and Tools for Identifying Communities of Practice. Intelligence, Agents, Multimedia Group (2005).
19. TROMBERT-PAVIOT, B., RODRIGUES, J., ROGERS, J. & BAUD, R. GALEN: A Third Generation Terminology Tool to support a multipurpose National Coding System for surgical procedures. International Journal of Medical Informatics, 58, 71-85 (2002).
20. GASEVIC, D., DEVEDZIC, V. & DJURIC, D. Model Driven Architecture and Ontology Development. Berlin, Springer (2006).
21. GENE ONTOLOGY. Welcome to the Gene Ontology website! http://www.geneontology.org/ (2009).
22. JEPSEN, T. C. Just what is an ontology, anyway? IEEE IT Professional, 11 (2009).
23. ENTERPRISE INTEGRATION LABORATORY. TOVE Ontology Project. University of Toronto. http://www.eil.utoronto.ca/enterprise-modelling/tove/ (2011).
24. VOGRINCIC, S. & BOSNIC, Z. Ontology-based multi-label classification of economic articles. ComSIS, 8, 101-119 (2011).
25. BELLO, A. Ontology and Phenomenology. In: POLI, R. & SEIBT, J. (Eds.) Theory and Applications of Ontology: Philosophical Perspectives. London, New York, Springer (2010).
26. HARTMANN, N. The New Ways of Ontology. Chicago (1952).
27. GÓMEZ-PÉREZ, A., CORCHO, O. & FERNÁNDEZ-LÓPEZ, M. Methodologies, tools and languages for building ontologies. Where is their meeting point? Data & Knowledge Engineering, 46, 41-64 (2003).
28. POLI, R. Ontological methodology. International Journal of Human-Computer Studies, 56, 639-664 (2002).
29. SABOU, M., WROE, C., GOBLE, C. & MISHNE, G. Learning domain ontologies for web service descriptions: an experiment in bioinformatics. WWW2005, Chiba, Japan (2005).
30. SAWSAA, A. & LU, J. Virtual community of practice Ontocop: Towards a New Model of Information Science Ontology (ISO). International Journal of Information Retrieval Research, 1, 55-78 (2011).
31. IEEE. IEEE standard for developing software life cycle processes. New York (USA), IEEE Computer Society (1996).
32. FERNÁNDEZ-LÓPEZ, M. Overview of methodologies for building ontologies. In: Workshop on Ontologies and Problem-Solving Methods: Lessons Learned and Future Trends, International Joint Conference on Artificial Intelligence (IJCAI-99), August 1999, Stockholm, Sweden (1999).
33. HERRE, H. The Ontology of Mereological Systems: A Logical Approach. In: POLI, R. & SEIBT, J. (Eds.) Theory and Applications of Ontology: Philosophical Perspectives. London, New York, Springer (2010).
