Conference Dates
September 18-20, 2017
Conference Venue
Lodz University of Technology, Lodz, Poland
ISBN: 978-1-941968-43-7
2017 SDIWC
Published by
The Society of Digital Information and Wireless
Communications (SDIWC)
Wilmington, New Castle, DE 19801, USA
www.sdiwc.net
Table of Contents
Case Studies
A Neural Network Approach for Attribute Significance Estimation .... 21
Computer Vision
Multi Feature Region Descriptor Based Active Contour Model for Person Tracking .... 50
Data Mining
Approaches for Optimization of WEB Pages Loading via Analysis of the Speed of Requests to the Database .... 58
Examining Stock Price Movements on Prague Stock Exchange Using Text Classification .... 64
This paper is a means by which a complex data structure was resolved into simple, comprehensible data-flow representations. The sample structure used is the clumsy school structure, which is divided into two super-classes: school units and persons. The school unit comprises the organization unit class, the educational unit class and the resources class, while the persons super-class comprises the students class, the academic staff class and the non-academic staff class. These classes also have sub-classes and schemas with relationships. However, within the time frame of the research and other limitations, a simpler structure of persons and modules super-classes was implemented.

... database system and programming language. According to the process flow mentioned in the introduction, the work involves creating an RDBMS for the Persons-Students school structure and its relationships as a sample using SQL, creating a conversion Java program using the JDBC API to map and transform the relational database metadata to a standard Ontology description with the help of DOM/XML, and importing the created Ontology description into an Ontology editor (Protégé) to form a standard Ontology structure.

[Figure: Relational Model -> Semantic Model -> Database Structure]
B. Semantic Model
Semantic model shows the meaning of its instances from
the ER model. It is a conceptual data model that includes
expressing information that enables parties to the
information exchange to interpret meaning (semantics) from
the instances, without the need to know the meta-model [12].
Students
  Column:  Nep_no   P_ID   Grade
  Key:     PK       FK
  Type:    Int      Int    Varchar2(20)

Professors
  Column:  Staff_ID   P_ID   Dept
  Key:     PK         FK
  Type:    Int        Int    Varchar2(20)

Persons
  Column:  P_ID   Name           Age
  Key:     PK     NN
  Type:    Int    Varchar2(20)   Int

Modules
  Column:  M_Code   Nep_no   Staff_ID   M_Title   M_Unit   Description
  Key:     PK       FK       FK
The ER model is represented as four semantic tables, which will be implemented in an Oracle database using SQL (Structured Query Language). It is easily identifiable that the entities are represented as table names, the attributes as column names, and the relationships between the tables are shown using foreign key constraints. For example, the modules table has columns for students (Nep_no) and professors (Staff_ID) to show the students that study a particular module as well as the professors that handle the module.

C. JDBC

Java Database Connectivity (JDBC) is an Application Programming Interface (API) for the Java programming language which defines how a client may access a database to create, insert into, update and query tables [11]. The JDBC program establishes a connection to the database, executes the query and converts the result to Java variables with some methods to aid the conversion [6].

The result of the query is accessed by the Document Object Model (DOM). The DOM defines a standard for accessing Extensible Markup Language (XML) documents. The W3C DOM is a platform- and language-neutral interface that allows programs and scripts to dynamically access and update the content, structure and style of a document. The extracted content in XML format is imported into an Ontology Web Tool (Protégé).

[Fig. 3. Architectural Diagram: Java Application -> JDBC API -> JDBC Driver Manager -> JDBC Driver -> Oracle, SQL Server, ODBC data source]

D. Mapping Database to Ontology

The goal is to create a mapping between an Ontology and a legacy database. Various levels of overlap between the database domain and the Ontology's are seen. The mapping is done using two steps, namely:
Mapping definition: the transformation of the data schema into the Ontology structure.
Data Migration: the migration of database instances into ontological instances. In this project, the process is query driven; that is, the database instances are transformed as a result of a response to a given query. This means that only the data needed to retrieve the user's query are called from the sources [8].

Output 1 below shows the sample mapping result between the database and the Ontology. This output shows the metadata tables converted to Classes, ObjectPropertyDomain and ObjectPropertyRange, which are various OWL parameters.

    <Declaration>
        <Class IRI="file:/C:/JDeveloper/mywork/Test1/Client/Out1.xml#STUDENTS"/>
    </Declaration>
    <ObjectPropertyDomain>
        <ObjectProperty IRI="file:/C:/JDeveloper/mywork/Test1/Client/Out1.xml#P_ID"/>
        <Class IRI="file:/C:/JDeveloper/mywork/Test1/Client/Out1.xml#STUDENTS"/>
    </ObjectPropertyDomain>
    <ObjectPropertyRange>
        <ObjectProperty IRI="file:/C:/JDeveloper/mywork/Test1/Client/Out1.xml#NEP_NO"/>
        <Class IRI="file:/C:/JDeveloper/mywork/Test1/Client/Out1.xml#STUDENTS"/>
    </ObjectPropertyRange>
    <DataPropertyDomain>
        <DataProperty IRI="file:/C:/JDeveloper/mywork/Test1/Client/Out1.xml#GRADE"/>
        <Class IRI="file:/C:/JDeveloper/mywork/Test1/Client/Out1.xml#STUDENTS"/>
    </DataPropertyDomain>
    <DataPropertyDomain>
        <DataProperty IRI="file:/C:/JDeveloper/mywork/Test1/Client/Out1.xml#NEP_NO"/>
        <Class IRI="file:/C:/JDeveloper/mywork/Test1/Client/Out1.xml#STUDENTS"/>
    </DataPropertyDomain>
    <DataPropertyRange>
        <DataProperty IRI="file:/C:/JDeveloper/mywork/Test1/Client/Out1.xml#GRADE"/>
        <Datatype abbreviatedIRI="xsd:VARCHAR2"/>
    </DataPropertyRange>

Output 1. Demonstration of the relational data mapped to Ontology
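The same metadata-to-OWL conversion can be illustrated with a short Python sketch (an illustration only, not the authors' Java/JDBC program; the table name, column list and base IRI simply mirror the STUDENTS example of Output 1):

    # Illustrative sketch: emit OWL-style Declaration / DataPropertyDomain /
    # DataPropertyRange elements from relational metadata using the Python
    # standard library. Table and column names mirror Output 1; the base IRI
    # is a placeholder.
    import xml.etree.ElementTree as ET

    BASE = "file:/C:/JDeveloper/mywork/Test1/Client/Out1.xml#"

    def map_table(ontology, table, columns):
        # One Class declaration per table.
        decl = ET.SubElement(ontology, "Declaration")
        ET.SubElement(decl, "Class", IRI=BASE + table)
        # One DataPropertyDomain and DataPropertyRange per column.
        for name, sql_type in columns:
            dom = ET.SubElement(ontology, "DataPropertyDomain")
            ET.SubElement(dom, "DataProperty", IRI=BASE + name)
            ET.SubElement(dom, "Class", IRI=BASE + table)
            rng = ET.SubElement(ontology, "DataPropertyRange")
            ET.SubElement(rng, "DataProperty", IRI=BASE + name)
            ET.SubElement(rng, "Datatype", abbreviatedIRI="xsd:" + sql_type)

    ontology = ET.Element("Ontology")
    map_table(ontology, "STUDENTS", [("NEP_NO", "INT"), ("GRADE", "VARCHAR2")])
    ET.ElementTree(ontology).write("Out1.owl", xml_declaration=True, encoding="utf-8")

The resulting file can then be opened in Protégé in the same way as the output of the Java program described in the text.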
The design steps illustrate diagrammatically the process flow, execution and conversion of variables, which eventually become OWL with the help of DOM.

[Design-flow figure: SQL rows -> Java/JDBC program -> execute query and convert to Java variables -> Ontology Web Language -> Protégé]

The code has a public class that contains the entire program. The class consists of many methods as well as initialized array lists and a database connection. There is also a main method used by the JVM to start execution.

The main method instantiates prog to independently invoke the get_class_name(), get_data_prog(), get_object_prog() and createOntology methods respectively:

    System.out.println("--------");
    prog.createOntology();
    }

Each of these methods has a separate task and query to execute, as well as an independent connection to the database. Below is a sample:

    ResultSet rs = null;
    String cmd = "";
    Class_desc obj;
    try {
        connection = testClass2.getConnection();

These sample codes execute the SQL statements and get the required data from the database, apart from the createOntology method, whose job is to collect the records produced by the other methods, save them and get them ready for transformation to the OWL format. The transformation itself is performed by:

    transformer.transform(source, result);

The document builder aids the formation of these data in Ontology format, and they are transformed with the help of the transformation code using a DOM source. The result, saved in a directory, is imported using Protégé.
REFERENCES
[1] Fensel D., "Ontology: a silver bullet for knowledge management and electronic commerce," Springer, 2001.
[2] Frye, M., "Applications of Ontology in heterogeneous multi-tier networks for network management," Theses and Dissertations, 2012, Paper 1118.
[3] Gruber T. R., "A translation approach to portable Ontology specifications," Knowledge Acquisition, 1993, 5:199-220.
[4] Irina A., "Storing OWL Ontology in SQL relational databases," 2007.
[12] Tutorialspoint.com, "Relational data model," 2014, Accessed March 5, 2017.
[13] "Web Ontology Language OWL," http://www.obitko.com/tutorials/Ontology-semantic-web/web-Ontology-language-owl.html, Accessed March 6, 2017.
[14] Wikipedia, "Protégé (Software)," 28 August 2015, Accessed March 6, 2017.
[15] "XML RDF," http://www.w3schools.com/xml/xml_rdf.asp, Accessed March 5, 2017.
Combined neural network model for real estate market range value
estimation
for 60% we get the 20th and 80th percentiles)
3) These values are the resulting price range.
It is obvious that this algorithm will always return correct results for the learning dataset, meaning that for a selected accuracy (e.g. 60%) the ratio of objects with a correct price range will be the same (the same 60%). But for the testing dataset that was used by the authors, the result is a bit worse, yet still close to the selected accuracy.
Unlike the k-means algorithm, the multilayer perceptron can use all characteristics present in the learning dataset, but this did not affect its results. For a dataset that contains prices of the Ukrainian real estate market, k-means and the multilayer perceptron showed similar accuracy.
Based on the results of the two described algorithms, the authors decided to combine these two approaches into one two-level neural network model.
For the learning of this model we use the following algorithm:
1) On the first level the data are clustered with a Kohonen network using only the objects' coordinates.
2) The average price is calculated for each cluster.
3) A separate perceptron is created for each cluster. It uses all characteristics except price as inputs and the difference between the real price and the average price in the cluster as the output.
4) On the second level the data are processed by the perceptron of the cluster detected on the first level.
5) Each object in a cluster is passed to its cluster's perceptron and the calculated value is subtracted from the object's price.
6) All these results are saved in the cluster.
7) All percentiles are calculated over these values.
Price range assessment is done as follows:
1) The object is clustered by the first-level network.
2) The second-level network calculates its value according to the object's characteristics. This value is saved.
3) We select the price range accuracy.
4) The corresponding percentiles are selected.
5) The percentile values are added to the value from step 2).
6) The result of step 5) is the object's price range.
Usual price assessment is done as follows:
1) The object is clustered by the first-level network.
2) The second-level network calculates its value according to the object's characteristics.
3) The value from step 2) is added to the mean value of the cluster selected in step 1).
4) The result is the object's price.

Testing results.

Testing was divided into two parts.
The first part is usual price assessment (when the algorithm returns only a single value). For this task, three algorithms were compared: k-means, a classical multilayer perceptron with one hidden layer, and the two-layer neural network model.
The details of the algorithms are as follows:
1) k-means algorithm: uses only the two object coordinates, k = 400. This algorithm is mentioned below as K400.
2) multilayer perceptron: has one hidden layer with 100 neurons, uses all object characteristics as input (one neuron in the input layer per characteristic), and has one neuron in the output layer that returns the price per square meter. This algorithm is mentioned below as N100.
3) two-layer neural network model: has 40 clusters on the first level and a multilayer perceptron with 10 neurons in the hidden layer for each cluster. This algorithm is mentioned below as G40/10 (an illustrative sketch of this model is given below).
The following metrics are used to compare the results of usual price assessment [11]:
1) Average error: E = (1/n) · Σ_{i=1}^{n} |p_i − p̂_i|, where p_i is the real price and p̂_i the estimated price of object i.
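For concreteness, the two-level model described above can be sketched as follows (an illustrative sketch, not the authors' implementation: k-means stands in for the Kohonen layer, scikit-learn provides the per-cluster perceptrons, and the 40-cluster / 10-neuron sizes follow the G40/10 configuration):

    # Illustrative sketch of the two-level model: cluster by coordinates,
    # learn per-cluster price deviations, keep residual percentiles for ranges.
    # coords, features, prices are assumed to be NumPy arrays.
    import numpy as np
    from sklearn.cluster import KMeans
    from sklearn.neural_network import MLPRegressor

    def fit_two_level(coords, features, prices, n_clusters=40):
        km = KMeans(n_clusters=n_clusters, n_init=10).fit(coords)   # level 1
        labels = km.labels_
        model = {"km": km, "mean": {}, "mlp": {}, "residuals": {}}
        for c in range(n_clusters):
            idx = labels == c
            mean_price = prices[idx].mean()                          # average price per cluster
            mlp = MLPRegressor(hidden_layer_sizes=(10,), max_iter=2000)
            mlp.fit(features[idx], prices[idx] - mean_price)         # level 2: learn the deviation
            resid = prices[idx] - (mean_price + mlp.predict(features[idx]))
            model["mean"][c], model["mlp"][c], model["residuals"][c] = mean_price, mlp, resid
        return model

    def price_range(model, coord, feats, accuracy=0.6):
        c = int(model["km"].predict(coord.reshape(1, -1))[0])
        point = model["mean"][c] + model["mlp"][c].predict(feats.reshape(1, -1))[0]
        lo, hi = np.percentile(model["residuals"][c],
                               [50 * (1 - accuracy), 50 * (1 + accuracy)])  # 20th/80th for 60%
        return point + lo, point + hi

For a target accuracy of 60%, price_range(model, coord, feats, accuracy=0.6) uses the 20th and 80th percentiles of the stored residuals, matching the procedure described above.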
The second part is price range assessment (when the algorithm returns a price range with the selected accuracy).

Diagram 2. Price range assessment comparison of average range for the learning dataset (two-layer NNM vs. k-means).
Diagram 3. Price range assessment comparison of average range for the test dataset (two-layer NNM vs. k-means).
As we can see from Table 2 and Diagram 2, the two-layer neural network model shows much better performance than the k-means algorithm with the same number of clusters. This performance difference is larger for smaller selected accuracy, and on average over all tested accuracy values it is 48.281%. Also, we should mention that the real accuracy is the same as the selected one, so the combined approach showed no accuracy loss for the learning dataset. The small accuracy improvement of the k-means algorithm for the learning dataset can be explained by the fact that some clusters may have fewer than 20 elements, so the percentile values are not accurate enough and they represent slightly incorrect accuracy values (e.g. the 61% percentile instead of 60%). See the results in Table 3 and Diagram 3.

Table 3. Price range assessment comparison for the test dataset
Two-layer NNM              k-means                    Difference in percent
Accuracy   Average range   Accuracy   Average range
0.89       646.95          0.902      859.576         32.866
0.783      450.593         0.803      638.929         41.797
0.685      339.624         0.701      495.63          45.935
0.579      262.351         0.607      392.859         49.746
0.478      201.813         0.503      314.739         55.956
0.376      153.383         0.406      243.603         58.819
0.278      111.469         0.302      178.865         60.462
0.171      71.344          0.198      116.831         63.756
0.091      36.203          0.108      57.776          59.59
Average difference                                    52.103
(The difference column is consistent with (k-means range − NNM range) / NNM range × 100; e.g. (859.576 − 646.95) / 646.95 ≈ 32.87%.)

For the test dataset the performance improvement remains the same. K-means still has slightly better accuracy than the selected values, but the combined approach shows slightly worse accuracy results; in the worst case it is 2.4% for 40% accuracy. This can be explained by the neural network performance: the networks do not perform perfectly on testing data, and for some objects this error is larger than the final price range. But despite this, the mean range size is on average 52.103% better. So we can slightly increase the target accuracy to keep the final one on the same level as the selected one, and the combined approach will still show better performance than the plain k-means algorithm. It also keeps showing a larger performance difference for smaller accuracy. As we see in Tables 2 and 3, the results of the two-layer neural network considerably exceed the corresponding results of the k-means algorithm. On the learning dataset the accuracy meets the given values and the price ranges are on average 48% smaller. On the test dataset the accuracy decreases slightly (by 1-3%), but the price ranges keep their size.

Conclusion

Artificial neural networks are an alternative solution to the multiple regression method used for real estate valuation.
A new two-layer neural network model for real estate valuation is developed in this article. The main idea of this model is to combine clusterization by geographical coordinates and estimation with a multilayer perceptron, to achieve the most effective usage of the existing objects' characteristics for predicting their price.
The test results of the described model show that, despite using far fewer clusters than k-means, it achieves similar results for usual price valuation, and with the same number of clusters it shows much better results for price range valuation.
The results of the price range valuation of the described model suggest prospects for its implementation in real-life systems for real estate market valuation.
The accuracy of determination of real estate values using the ANN method is higher than the accuracy obtained with the multiple regression method for real estate markets with a high number of transactions (above 100 transactions). In such a case utilisation of the ANN is highly recommended. On the other hand, for markets with a medium number of transactions (several dozens of transactions), utilisation of the ANN for real estate valuation results in only slightly higher accuracy of estimation of real estate values. Those methods may be interchangeably applied.

Acknowledgment

This work is partly supported by the project SIP-2017-09, /02/03/2017, "Research and analysis of ecosystem monitoring and management systems from the Internet of Things".

References
[1]. Bourassa, S., Cantoni, E., & Hoesli, M. (2010). Predicting house prices with spatial dependence: A comparison of alternative methods. Journal of Real Estate Research, 139-160.
[2]. González, M., & Formoso, C. (2006). Mass appraisal with genetic fuzzy rule-based systems. Property Management, 20-30.
[3]. Helbich, M., Brunauer, W., & Nijkamp, P. (2013). Spatial heterogeneity in hedonic house price models: the case of Austria. Urban Studies, 390-411.
[4]. Lenk, M. M., Worzala, E. M., & Silva, A. (1997). High-tech valuation: should artificial neural networks bypass the human valuer? Journal of Property Valuation and Investment, 8-26.
[5]. Limsombunchai, V., Gan, C., & Lee, M. (2004). House price prediction: hedonic price model vs. artificial neural network. American Journal of Applied Sciences, 193-201.
[6]. Mark, J., & Goldberg, M. (1988). Multiple regression analysis and mass assessment: A review of the issues. Appraisal Journal, 56(1), 89-109.
[7]. Peterson, S., & Flanagan, A. (2009). Neural network hedonic pricing models in mass real estate appraisal. Journal of Real Estate Research, 147-164.
[8]. Selim, S. (2011). Determinants of house prices in Turkey: a hedonic regression model. Doğuş Üniversitesi Dergisi, 65-86.
[9]. Zurada, J., Levitan, A., & Guan, J. (2011). A comparison of regression and artificial intelligence methods in a mass appraisal context. Journal of Real Estate Research, 349-387.
[10]. Panayotova, G., & Dimitrov, G. P. (2015). Design of Web-Based Information System for Optimization of Portfolio. The 13th International Symposium on Operations Research in Slovenia, 23-25 September 2015, Bled, Slovenia, pp. 193-198, ISBN 978-961-6165-45-7.
[11]. Panayotova, G. (2014). Mathematical modelling. Sofia, Bulgaria, ISBN 976-619-185-037-2.
stochastic modelling parameters, such as covariance matrices, in order to deal with model approximations and bias on the predicted pose. In order to compensate for such error sources, local iterations, adaptive models and covariance intersection filtering have been proposed [16-20]. An interesting approach was proposed in [17], where observation of the pose corrections is used for updating the covariance matrices. However, this approach seems to be vulnerable to significant geometric inconsistencies of the world models, since inconsistent information can influence the estimated covariance matrices.

The localization problem is often formulated by using a unique model, from both the state and observation process points of view. Such an approach inevitably introduces modelling errors, which degrade filtering performance, particularly when the signal-to-noise ratio is low and the noise variances have been poorly estimated. Moreover, to optimize the observation process, it is important to characterize each external sensor not only from the point of view of statistical parameter estimation but also from the point of view of the robustness of the observation process. It is then interesting to introduce an adequate model for each observation area in order to reject unreliable readings. In the same manner, a wrong observation leads to a wrong estimation of the state vector and consequently degrades the localization algorithm performance.

Particle Filter (PF) based methods are considered a sequential version of the Monte Carlo methods [21-23]. They represent the most effective methods for nonlinear localisation of mobile systems. These methods have the ability to manage a set of particles in order to determine positions and orientations. The principle of the PF is to make the particles evolve in the same way as the robot to determine new positions, and then to compare its perceptions to those of the particles. We retrieve the odometry model values (prediction) between two successive moments, which are then transmitted to the filter function for correction by the observation model. After a small number of iterations, this process converges to a position where the population of particles is very dense. The PF method is illustrated in Fig. 2.

Fig. 2. Particle filter diagram (PF).

However, this filter can be very costly to implement, as a very large number of particles is usually needed, especially in high-dimensional systems. In the case of low dynamical noise, we observe that, by multiplying the highly weighted particles, the prediction step will explore the state space poorly. The particle clouds will concentrate on a few points of the state space. This phenomenon is called particle degeneracy, and it causes the divergence of the filter.

Despite the research efforts to improve filter performance for data fusion, their behaviour remains unstable for some applications such as navigation and localization.

III. PROPOSED KPKF FILTER APPROACH

The Kalman-Particle Kernel Filter (KPKF) combines both an EKF and a PF for a robust localization system by adjusting the state of the mobile system and reducing the estimation error. This new filter is a kind of hybrid particle filter. It is based on the kernel representation of the conditional density and on a local linearization as a Gaussian mixture [24]. The KPKF filter method can be implemented according to three steps, as shown in Figure 3:

Fig. 3. Kalman-Particle Kernel Filter diagram (KPKF).
Correction step: divided into two sub-steps, a Kalman correction and a particle correction. The correction step keeps the filter density a mixture of Gaussians in order to increase the probability of the presence of the particles in the state space.
Prediction step: the predicted density is modeled in the same way as the corrected density, so this step also yields a mixture of Gaussians.
Resampling step: introduced to further reduce the divergence of the particle filter (Monte Carlo).
A generic sketch of this predict/correct/resample loop is given below.
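The following sketch shows a generic bootstrap particle filter step in the same predict/correct/resample structure (an illustration only: the KPKF additionally applies a Kalman correction to each Gaussian kernel of the mixture, which is not reproduced here, and the motion and measurement models are placeholders):

    # Generic bootstrap (SIR) particle filter step for 2-D pose tracking.
    # Illustrative only; not the authors' KPKF implementation.
    import numpy as np

    def particle_filter_step(particles, weights, control, measurement,
                             motion_noise=0.05, meas_noise=0.1):
        # Prediction: propagate every particle through the (assumed) motion model.
        particles = particles + control + np.random.normal(0, motion_noise, particles.shape)

        # Correction: re-weight particles by the likelihood of the observation.
        dist = np.linalg.norm(particles - measurement, axis=1)
        weights = weights * np.exp(-0.5 * (dist / meas_noise) ** 2)
        weights /= weights.sum()

        # Resampling: fight degeneracy when the effective sample size drops.
        n = len(weights)
        if 1.0 / np.sum(weights ** 2) < n / 2:
            idx = np.random.choice(n, size=n, p=weights)
            particles, weights = particles[idx], np.full(n, 1.0 / n)

        estimate = np.average(particles, axis=0, weights=weights)
        return particles, weights, estimate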
IV. IMPLEMENTATION

We present an application of the Kalman-Particle Kernel Filter (KPKF) approach for a robust localization adapted to disabled and elderly people. Our approach is implemented and
[4] C. Harris, A. Bailley, T. Dodd, "Multi-sensor data fusion in defense and aerospace," Journal of Royal Aerospace Society 162 (1015) (1998) 229-244.
[5] J.B. Gao, C.J. Harris, "Some remarks on Kalman filters for the multi-sensor fusion," Journal of Information Fusion 3 (2002) 191-201.
[6] C. Chui, G. Chen, "Kalman filtering with real time applications," Springer Series in Information Sciences, Springer-Verlag, New York 17 (1987) 23-24.
[7] K.O. Arras, N. Tomatis, B.T. Jensen, R. Siegwart, "Multisensor on-the-fly localization: precision and reliability for applications," Robotics and Autonomous Systems 34 (2001) 131-143.
[8] H. Wang, M. Kung, T. Lin, "Multi-model adaptive Kalman filters design for maneuvering target tracking," International Journal of Systems Sciences 25 (11) (1994) 2039-2046.
[9] S. Borthwick, M. Stevens, H. Durrant-Whyte, "Position estimation and tracking using optical range data," in: Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, 1993, pp. 2172-2177.
[10] J.A. Castellanos, J.D. Tardós, "Laser-based segmentation and localization for a mobile robot," in: F.P.M. Jamshidi, P. Dauchez (Eds.), Robotics and Manufacturing: Recent Trends in Research and Applications, vol. 6, ASME Press, New York, 1996, pp. 101-109.
[11] M. Jenkin, E. Milios, P. Jasiobedzki, "Global navigation for ARK," in: Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, 1993, pp. 2165-2171.
[12] P. Jensfelt, H.I. Christensen, "Pose tracking using laser scanning and minimalistic environment models," IEEE Transactions on Robotics and Automation 17 (2) (2001) 138-147.
[13] J.J. Leonard, H.F. Durrant-Whyte, "Mobile robot localization by tracking geometric beacons," IEEE Transactions on Robotics and Automation 7 (3) (1991) 376-382.
[14] J. Neira, J.D. Tardós, J. Horn, G. Schmidt, "Fusing range and intensity images for mobile robot localization," IEEE Transactions on Robotics and Automation 15 (1) (1999) 76-84.
[15] J.A. Pérez, J.A. Castellanos, J.M.M. Montiel, J. Neira, J.D. Tardós, "Continuous mobile robot localization: vision vs. laser," in: Proceedings of the IEEE International Conference on Robotics and Automation, 1999, pp. 2917-2923.
[16] G.A. Borges, M.J. Aldon, "Robustified estimation algorithms for mobile robot localization based on geometrical environment maps," Robotics and Autonomous Systems 45 (2003) 131-159.
[17] L. Kleeman, "Optimal estimation of position and heading for mobile robots using ultrasonic beacons and dead-reckoning," in: Proceedings of the IEEE International Conference on Robotics and Automation, 1992, pp. 2582-2587.
[18] L. Jetto, S. Longhi, G. Venturini, "Development and experimental validation of an adaptive Kalman filter for the localization of mobile robots," IEEE Transactions on Robotics and Automation 15 (2) (1999) 219-229.
[19] S.J. Julier, J.K. Uhlmann, "A non-divergent estimation algorithm in the presence of unknown correlations," in: Proceedings of the American Control Conference, 1997.
[20] X. Xu, S. Negahdaripour, "Application of extended covariance intersection principle for mosaic-based optical positioning and navigation of underwater vehicles," in: Proceedings of the IEEE International Conference on Robotics and Automation, 2001, pp. 2759-2766.
[21] H. A. P. Blom and Y. Bar-Shalom, "The interacting multiple model algorithm for systems with Markovian switching coefficients," IEEE Trans. Automat. Contr., vol. 33, pp. 780-783, Aug. 1988.
[22] X. R. Li, "Engineer's guide to variable-structure multiple-model estimation for tracking," in Multitarget-Multisensor Tracking: Applications and Advances, Y. Bar-Shalom and D.W. Blair, Eds. Boston, MA: Artech House, 2000, vol. III, ch. 10, pp. 499-567.
[23] X. R. Li, "Hybrid estimation techniques," in Control and Dynamic Systems: Advances in Theory and Applications, C. T. Leondes, Ed. New York: Academic, 1996, vol. 76, pp. 213-287.
[24] X. R. Li and Y. Bar-Shalom, "Design of an interacting multiple model algorithm for air traffic control tracking," IEEE Trans. Contr. Syst. Technol., vol. 1, pp. 186-194, Sept. 1993. (Special issue on Air Traffic Control.)
[25] Y. Touati, H. Aoudia, and A. Ali-Chérif, "Intelligent wheelchair localization in wireless sensor network environment: A fuzzy logic approach," 5th IEEE International Conference on Intelligent Systems, 2010, London, UK, pp. 408-413.
Abstract: Attribute selection methods explore the interrelationship between the data to avoid less relevant attributes. Some selector methods are also able to estimate the significance rate of input attributes. Removing unnecessary input attributes has several advantages, like a lower variance and complexity of the machine learning model. In this paper, we propose a four-layer feedforward neural network which estimates the input attribute relevance rate depending on the desired output. The neural network contains a pre-input layer, where every input attribute is connected by a salient weight to the next layer. Therefore, every attribute primarily depends on its salient weight. Two penalty terms are added related to the salient weights. Thus, the relevant and irrelevant attributes can be distinguished. The attribute significance estimation capability of the proposed neural network was evaluated on three artificial and one real regression problem in addition to a real classification problem.

Index Terms: Feature selection, attribute selection, dimensionality reduction, multilayer perceptron, artificial neural network, feature significance, attribute saliency, feature ranking.

I. INTRODUCTION

Attribute selection methods distinguish relevant data attributes from irrelevant ones. Thus, the variance and complexity of machine learning models can be decreased by using only the relevant input attributes [1]. One subset of attribute selectors are the filter methods, which rank the input attributes before the learning phase and remove the irrelevant attributes below a defined threshold. The Pearson Correlation Coefficient [2] and Mutual Information [3] belong to this subset [4].
Wrapper methods are other attribute selectors and they can be divided into two subsets. One group is the sequential selection algorithms like the Genetic Algorithm [5] and Swarm Optimization [6]; the other is the heuristic search methods like Branch and Bound [7], which become computationally complex with a growing number of attributes.
Unsupervised methods explore the relationships between the attributes without any predefined classes. Such techniques are the clustering algorithms, like k-means, where the clustering is based on a distance measurement [8] [9].
Embedded methods classify different subsets by applying greedy search algorithms to find the appropriate subset. Support Vector Machines [10] and artificial neural networks are part of this approach. The Neural-network Feature Selector applies two penalty terms to eliminate the unnecessary weights. The irrelevant attributes are removed one by one according to the smallest accuracy decrease of the model [11]. Another Multilayer Perceptron solution is based on the signal-to-noise ratio measurement between the first-layer weights and noise-injected input weights relating to the input attributes. The signal-to-noise ratio fluctuates around zero if the attribute is irrelevant [12]. Another neural network based approach measures the model sensitivity by removing one attribute at a time. The summed errors are calculated between the reduced models and the model with all attributes, where relevant attributes cause higher error [13].
Specifically for classification tasks, an attribute selector Multilayer Perceptron is introduced in [14]: output-gradient based constraining terms are added to the cross-entropy error cost function, and the less salient attributes are eliminated automatically by setting them to zero one by one. The attribute with the smallest drop is eliminated at every training period. After the elimination, the neural network is retrained, and this process is repeated until only one input feature remains.
The Constructive Approach for Feature Selection algorithm is a wrapper based neural network with a self-growing structure [15]: before the training, two groups of features are generated based on the correlation between them, and the weights of the neural network are initialized randomly. The first build-up of the neural network contains one or two neurons in the input and hidden layer. During the training, the neural network adds features and hidden units by predefined conditions until the neural network has fulfilled these conditions.
The solutions detailed above focus mainly on classification problems. This paper presents a four-layer feedforward neural network which is tested also for regression tasks. Every input attribute is connected to the input layer through a salient weight. Thus, the relevance of the attributes depends primarily on their salient weights, which are regularized during the training.
Section II describes the build-up of the Attribute Significance Estimator Multilayer Perceptron, how it is trained, and the attribute significance estimation process. Section III presents the results of three artificial regression tasks, one real classification and one real regression task. The paper finishes with a summary and conclusion.
Fig. 1. The Attribute Significance Estimator Multilayer Perceptron.

The evaluation steps of the neural network output are: an element-wise vector multiplication of the attributes and the salient weights delivers the output of the extra layer, which is fed into the input layer. Then the general forward pass calculation evaluates the neural network output. In vector form, the whole forward pass calculation is:

    x~ = x ⊙ w_a,                 (1)
    h = f_a(x~^T W_ih + b_h),     (2)
    y = f_a(h W_ho + b_o),        (3)

where x denotes the input attributes, w_a the salient weights between the input attributes and the input layer, ⊙ the Hadamard product, x~ the extra layer output, f_a the activation function, h the hidden layer activation, W_ih the weights between the input and hidden layer, b_h the hidden layer biases, y the output layer activation, W_ho the weights between the hidden and output layer, and b_o the output layer biases.

B. Training and weight regularization

The salient weights are part of the optimization map. Thus, the significance of an attribute depends mainly on its salient weight. By using penalty terms, the irrelevant attributes can be diminished during the training. Two penalty terms are added to the error function regarding the salient weights:

    R(w_a) = λ1 Σ_{j=1}^{Na} w_a,j^2 + λ2 Σ_{j=1}^{Na} w_a,j^2 / (1 + w_a,j^2).    (4)

The attribute significance ratio of attribute j is the relative magnitude of its salient weight:

    ASR_j(w_a) = |w_a,j| / ( Σ_{k=1}^{Na} |w_a,k| ) · 100%.    (5)

After the attribute significance ratio calculation, the attributes are ranked.

III. RESULTS

A. Basic regression cases

Six input attributes and three combinations of them are defined. These three combinations are the desired outputs of the neural network. The Attribute Significance Estimator Multilayer Perceptron has to determine the input attributes' saliency and, in this case, these three functions. Table 1 shows the six defined attributes.

TABLE 1. THE INPUT ATTRIBUTES
Attribute Name   Function      Range/Value
x1               sine          [0, 3]
x2               linear (x)    [0, 1]
x3               cosine        [0, 3]
x4               exponential   [0, 1]
x5               constant      2
x6               sine          [0, 3]

To test the redundancy elimination capability of the neural network, x1 and x6 are identical. Therefore, the x1 attribute always appears in the desired function. Another consideration is to prove the Multilayer Perceptron's robustness against a constant value. Thus, x5 is never added to the desired function. Table 2 contains the desired functions.
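Before turning to the individual test functions, the estimation procedure of Section II can be summarised in a short NumPy sketch (illustrative only; the layer sizes, activation choices, λ values and the omitted training loop are assumptions, not values taken from the paper):

    # Salient-weight forward pass (eqs. 1-3), regularizer (eq. 4) and
    # attribute significance ratio (eq. 5). Illustrative sketch only.
    import numpy as np

    rng = np.random.default_rng(0)
    Na, Nh, No = 6, 10, 1                      # attributes, hidden units, outputs (assumed)
    w_a  = rng.normal(size=Na)                 # salient weights of the pre-input layer
    W_ih = rng.normal(size=(Na, Nh)); b_h = np.zeros(Nh)
    W_ho = rng.normal(size=(Nh, No)); b_o = np.zeros(No)
    lam1, lam2 = 1e-3, 1e-3                    # penalty coefficients (assumed)

    def forward(x):
        x_tilde = x * w_a                      # eq. (1): Hadamard product with salient weights
        h = np.tanh(x_tilde @ W_ih + b_h)      # eq. (2): hidden layer
        return h @ W_ho + b_o                  # eq. (3): output layer

    def penalty():
        # eq. (4): the two regularisation terms on the salient weights
        return lam1 * np.sum(w_a ** 2) + lam2 * np.sum(w_a ** 2 / (1.0 + w_a ** 2))

    def attribute_significance_ratio():
        # eq. (5): relative magnitude of each salient weight, in percent
        return 100.0 * np.abs(w_a) / np.sum(np.abs(w_a))

    ranking = np.argsort(attribute_significance_ratio())[::-1]   # most relevant first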
Fig. 2. Mean squared errors relating to the number of attributes.

For the heating load (y1) estimation, the attribute ranking was correct, because the error increases only when all attributes are used. So overfitting appears when the x3 input is also used. In the case of the cooling load (y2), the error does not decrease monotonically. Despite this, an error increase happens only once with the first four most relevant attributes, and the MSE increased by 0.74.

IV. SUMMARY AND CONCLUSION

Attribute selection is a machine learning and data mining task. Attribute selectors explore the interrelationship between the data and reduce the size and variance of machine learning models.
This paper has presented a four-layer feed-forward neural network approach which can estimate the attribute significance rate related to a given output. Every input attribute is connected to the input layer through a salient weight.

REFERENCES
[1] G. Chandrashekar and F. Sahin, "A survey on feature selection methods," Computers and Electrical Engineering, vol. 40, no. 1, pp. 16-28, 2014.
[2] I. Guyon and A. Elisseeff, "An introduction to variable and feature selection," Journal of Machine Learning Research, vol. 3, no. Mar, pp. 1157-1182, 2003.
[3] J. R. Vergara and P. A. Estevez, "A review of feature selection methods based on mutual information," Neural Computing and Applications, vol. 24, no. 1, pp. 175-186, 2014.
[4] R. Battiti, "Using mutual information for selecting features in supervised neural net learning," IEEE Transactions on Neural Networks, vol. 5, no. 4, pp. 537-550, 1994.
[5] D. E. Goldberg and J. H. Holland, "Genetic algorithms and machine learning," Machine Learning, vol. 3, no. 2, pp. 95-99, 1988.
[6] J. Kennedy and R. Eberhart, "Particle swarm optimization," Proceedings of the International Conference on Neural Networks, Australia, IEEE, 1995.
[7] P. M. Narendra and K. Fukunaga, "A branch and bound algorithm for feature subset selection," IEEE Transactions on Computers, vol. 26, no. 9, pp. 917-922, 1977.
[8] C. M. Bishop, "Pattern recognition," Machine Learning, vol. 128, pp. 158, 2006.
[9] A. Likas, N. Vlassis, and J. J. Verbeek, "The global k-means clustering algorithm," Pattern Recognition, vol. 36, no. 2, pp. 451-461, 2003.
[10] J. Neumann, C. Schnorr, and G. Steidl, "Combined SVM-based feature selection and classification," Machine Learning, vol. 61, no. 1, pp. 129-150, 2005.
[11] R. Setiono and H. Liu, "Neural-network feature selector," IEEE Transactions on Neural Networks, vol. 8, no. 3, pp. 654-662, 1997.
[12] K. W. Bauer, S. G. Alsing, and K. A. Greene, "Feature screening using signal-to-noise ratios," Neurocomputing, vol. 31, no. 1, pp. 29-44, 2000.
[13] K. De Rajat, N. R. Pal, and S. K. Pal, "Feature analysis: Neural network and fuzzy set theoretic approaches," Pattern Recognition, vol. 30, no. 10, pp. 1579-1590, 1997.
[14] A. Verikas and M. Bacauskiene, "Feature selection with neural networks," Pattern Recognition Letters, vol. 23, no. 11, pp. 1323-1335, 2002.
2.3 SHOT Descriptor Extracting

To extract feature points, the proposed method uses SHOT (Signature of Histograms of Orientations). The surface features of the three-dimensional model can be described uniquely and repeatably by using SHOT. It expresses the relationship between the point of interest and its surroundings by histograms. Since the SHOT descriptor is expressed in 352 dimensions, SHOT is a method that can extract feature points with high accuracy. In this section, we explain how to extract a SHOT descriptor. To extract a SHOT descriptor, we use an isotropic spherical grid that encompasses partitions along the radial, azimuth and elevation axes, as sketched in Figure 1. Since each volume of the grid

Figure 6. Overviews of matched points: (a) in the model, (b) in the scene.
Table 2. The LIST in the model.
Table 3. The LIST in the scene.
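A much simplified sketch of this spherical-grid binning is given below (an illustration only, not a full SHOT implementation: it aligns the grid with the world axes instead of a local reference frame and omits the bin interpolation of the original descriptor; the assumed 2 x 8 x 2 volumes with 11 cosine bins give the familiar 352 values):

    # Simplified SHOT-like signature: bin neighbours by radius/azimuth/elevation
    # and, inside each volume, by the cosine of the angle between normals.
    import numpy as np

    def shot_like_signature(keypoint, kp_normal, neighbors, normals, radius,
                            n_radial=2, n_azimuth=8, n_elevation=2, n_cos=11):
        hist = np.zeros((n_radial, n_azimuth, n_elevation, n_cos))
        d = neighbors - keypoint
        r = np.linalg.norm(d, axis=1)
        inside = (r > 0) & (r <= radius)
        d, r, normals = d[inside], r[inside], normals[inside]

        azimuth = np.arctan2(d[:, 1], d[:, 0])                 # [-pi, pi]
        elevation = np.arcsin(np.clip(d[:, 2] / r, -1, 1))     # [-pi/2, pi/2]
        cos_theta = np.clip(normals @ kp_normal, -1, 1)        # angle between normals

        ir = np.minimum((r / radius * n_radial).astype(int), n_radial - 1)
        ia = np.minimum(((azimuth + np.pi) / (2 * np.pi) * n_azimuth).astype(int), n_azimuth - 1)
        ie = np.minimum(((elevation + np.pi / 2) / np.pi * n_elevation).astype(int), n_elevation - 1)
        ic = np.minimum(((cos_theta + 1) / 2 * n_cos).astype(int), n_cos - 1)

        np.add.at(hist, (ir, ia, ie, ic), 1.0)                 # accumulate the histogram
        sig = hist.ravel()
        norm = np.linalg.norm(sig)
        return sig / norm if norm > 0 else sig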
research and SHOT in the occlusion scene. In addition, as shown in Table 4, the processing time of the proposed method was nearly equal to that of SHOT and the previous research. From these results, we consider that the proposed method has robustness against occlusions, because the proposed method is able to match the feature points of the target object and the feature points of the unoccluded scene data by using SHOT. In addition, we consider that the proposed method is able to estimate the pose of a target object with high accuracy, because the proposed method uses not only the corresponding points but also the relationships between corresponding points.
In this paper, we only show the result of the spray; however, we obtained the same result for the other objects.

3.2 Recognizing objects in a noisy environment

In this section, we evaluate the effectiveness of the proposed method in recognizing objects in a noisy environment. We prepared the same objects as in the first experiment. To generate noisy scenes, we added Gaussian noise to the scenes. Figure 12 shows how the scene data change when noise is added.
To evaluate the pose estimation accuracy for a target object, we use the corresponding rate M, the same as in section 3.1.
Figure 13 and Table 5 show results for the accuracy and processing time of the proposed method, SHOT and the previous research. As shown by the results, we consider that the accuracy and processing time of the proposed method are equal to or better than those of the previous research and SHOT. Although we show only the result of the spray here, the pack and the Cup noodle obtained equivalent results.

Table 5. The result of average processing time
Method              Average processing time [sec.]
                    Spray   Pack   Noodle
Previous research   3.20    2.75   2.62
SHOT                1.24    1.09   1.12
Proposed method     1.82    0.91   1.24

Figure 13. The result of the spray of the pose estimation in a noisy environment.

From these results, we consider that the proposed method has robustness against noise, because the proposed method uses SHOT to generate feature points, and SHOT is hardly affected by noise due to the high dimensionality of the SHOT feature quantities.

3.3 The Experiment in Recognition of Objects Which Have the Same Feature but Which are not the Same Object

To qualitatively evaluate how the proposed method copes with erroneous correspondences, we compared the proposed method with SHOT. As target objects which have the same feature in a local region but which are not the same object, we prepared a 500 ml pack and a 1000 ml pack, as shown in Figure 14.
We generated the teaching data from the 1000 ml pack and applied the proposed method and SHOT to scene data of the 500 ml pack. Figure 15 and Figure 16 show the results of SHOT, and Figure 17 shows the result of the proposed method. As shown in these results, erroneous correspondence occurred in SHOT, and it misrecognized the 1000 ml pack as the 500 ml pack.
Figure 12. The result of the spray occluded from 3 directions: (a) 0.001, (b) 0.002, (c) 0.003, (d) 0.005.
Figure 14. Overviews of objects and 3-dimensional data of objects in the experiment: (a) 500-ml pack, (b) 1000-ml pack.
4 CONCLUSION

In this paper, we proposed a 3-dimensional object recognition method which has the following five properties, using relationships of

[5] S. Maehara, H. Imamura, K. Ikeshiro, "A 3-Dimensional Object Recognition Method Using SHOT and Relationship of Distances and Angles in Feature Points," DIPDMWC2015, 2015.
Object Detection Method Using Invariant Feature Based on Local Hue Histogram in
Divided Area of an Object Area
ABSTRACT
In recent years, human support robots have been receiving attention. The robots are then required to perform various tasks to support humans. In particular, an object detection task is important when people request the robot to transport and rearrange an object. However, the detection becomes difficult owing to differences of visual aspect when detecting a target object from a camera equipped on a robot. We consider that there are seven necessary properties for detection in a domestic environment, as follows: 1. Robustness against the rotation change. 2. Robustness against the scale change. 3. Robustness against the illumination change. 4. Robustness against the distortion by perspective projection. 5. Robustness against the occlusion. 6. Detecting an object which has few textures. 7. Detecting different objects which have the same features of the hue histogram. As conventional methods, there are the Scale Invariant Feature Transform (SIFT), Color Indexing and the method proposed by Tanaka et al. These conventional methods do not satisfy all seven properties needed for the robots. Therefore, to satisfy the seven properties of detection, we propose an object detection method using an invariant feature based on local hue histograms in the divided areas of an object area.

KEYWORDS
Cognitive robot, hue histogram, peak and trough, divided area.

1 INTRODUCTION

In recent years, human support robots have been receiving attention [1-3]. The robots are then required to perform various tasks to support humans. In particular, an object detection task is important when people request the robots to transport and rearrange an object.

Figure 1. A target object and an input image: (a) a target object, (b) an input image.

Object detection is the technology used to detect a target object (Fig. 1(a)) from an input image (Fig. 1(b)).
There is a problem that detection is difficult when detecting a target object from a camera equipped on a robot, because differences of visual appearance occur, such as the rotation change. Therefore, we consider that there are six necessary properties for detection in a domestic environment, as follows.

1. Robustness against the rotation change
2. Robustness against the scale change
3. Robustness against the illumination change
4. Robustness against the distortion by perspective projection
5. Robustness against the occlusion
6. Detecting an object which has few textures

Firstly, the robots need robust detection against the rotation change, because the rotation change occurs when an object falls down, as in Fig. 1(b). Secondly, the robots need robust detection against the scale change, because the scale change occurs due to the position between the robots and an object. Thirdly, the robots need robust detection against the illumination change because
Figure 12. Divided area and local hue histograms.
Figure 13. Smoothing processing: (a) a part of a hue histogram, (b) a part of a smoothed hue histogram.

The extracted positions of the peaks and troughs of the registered image are expressed as

    P_n^(R) = {p_1^(R), p_2^(R), ...},  T_n^(R) = {t_1^(R), t_2^(R), ...},    (6)

where P_n^(R) is the set of peak positions of a smoothed hue histogram in a divided area n, p_1^(R), p_2^(R), ... are the individual peak positions of the smoothed hue histogram, T_n^(R) is the set of trough positions of the smoothed hue histogram in the divided area n, and t_1^(R), t_2^(R), ... are the individual trough positions of the smoothed hue histogram. In addition, the extracted positions of the peaks and troughs of the input image are expressed as

    P_n^(I) = {p_1^(I), p_2^(I), ...},    (7)
    T_n^(I) = {t_1^(I), t_2^(I), ...},    (8)

where P_n^(I) is the set of peak positions of a smoothed hue histogram in a divided area n, p_1^(I), p_2^(I), ... are the individual peak positions of the smoothed hue histogram, T_n^(I) is the set of trough positions of the smoothed hue histogram in the divided area n, and t_1^(I), t_2^(I), ... are the individual trough positions of the smoothed hue histogram.

2.7 Invariant feature comparison

Firstly, the proposed method calculates difference values between the invariant feature of the registered image and the invariant feature of the input image by using

    d_1^p = min_{1≤j≤Np} |p_1^(R) − p_j^(I)|,    (9)
    d_2^p = min_{1≤j≤Np} |p_2^(R) − p_j^(I)|,    (10)
    d_1^t = min_{1≤j≤Nt} |t_1^(R) − t_j^(I)|,    (11)

Figure 14. Comparing the invariant feature of a registered image with the invariant feature of an input image.

where d_1, d_2, ... are the smallest difference values between the positions in the input image and t_1^(R), t_2^(R), ..., the individual trough positions of the smoothed hue histogram in a divided area n; Np is the number of peaks of the smoothed hue histogram in the divided area n; Nt is the number of troughs of the smoothed hue histogram in the divided area n; Dp is the total of the difference values of the peak positions; Dt is the total of the difference values of the trough positions; and D is the total difference value. As an example, Fig. 14 shows the comparison of the hue histogram of a registered image and the hue histogram of an input image. As shown in Fig. 14, the proposed method compares the positions of peaks and troughs of the smoothed hue histogram of a registered image with the positions of peaks and troughs of the smoothed hue histogram of an input image, and calculates difference values. In addition, the proposed method registers the peak and trough having the smallest difference value as the nearest peak and trough. Finally, the proposed method detects the object which has the smallest D.
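The peak/trough comparison of Sections 2.6-2.7 can be sketched as follows (illustrative only; the histogram size, the smoothing window and the way D combines the peak and trough totals are assumptions, and OpenCV-style hue values in [0, 180) are assumed):

    # Smoothed hue histogram, peak/trough extraction and total difference D.
    import numpy as np

    def smoothed_hue_histogram(hue_values, bins=180, window=5):
        hist, _ = np.histogram(hue_values, bins=bins, range=(0, bins))
        kernel = np.ones(window) / window
        return np.convolve(hist, kernel, mode="same")      # moving-average smoothing

    def peaks_and_troughs(h):
        peaks = [i for i in range(1, len(h) - 1) if h[i - 1] < h[i] > h[i + 1]]
        troughs = [i for i in range(1, len(h) - 1) if h[i - 1] > h[i] < h[i + 1]]
        return np.array(peaks), np.array(troughs)

    def total_difference(reg_hue, inp_hue):
        p_r, t_r = peaks_and_troughs(smoothed_hue_histogram(reg_hue))
        p_i, t_i = peaks_and_troughs(smoothed_hue_histogram(inp_hue))
        if len(p_r) == 0 or len(p_i) == 0 or len(t_r) == 0 or len(t_i) == 0:
            return np.inf
        Dp = sum(np.min(np.abs(p_i - p)) for p in p_r)      # nearest-peak differences
        Dt = sum(np.min(np.abs(t_i - t)) for t in t_r)      # nearest-trough differences
        return Dp + Dt                                      # candidate with the smallest D wins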
3 EXPERIMENT

ratio decreased a little in the case of 45 degrees of distortion by perspective projection, because the proposed method uses local feature amounts for candidate object extraction and rotation processing.
b) The conventional method: The correct answer ratio was less than 60% for all changes. However, we see that there are few differences when comparing the correct answer ratio for the object to which a change is applied and the correct answer ratio for the object to which no change is applied. Therefore, we focus on the features of each object. As shown in Fig. 22, the correct answer ratio was low for the object having the same hue value and for the normal object. The reason is that when the candidate objects have several pieces of color information, they have the same hue value with high probability; it is therefore thought that erroneous recognition occurred and the correct answer ratio decreased.
c) The SIFT: The correct answer ratio was less than 60% for all changes. In particular, the correct answer ratio decreased for the distortion by perspective projection. The reason is that the local feature amount changes under the distortion by perspective projection; it is therefore thought that the SIFT descriptor changes and the correct answer ratio decreases. In addition, we focus on the features of each object. As shown in Fig. 22, the correct answer ratio was low for the object which has few textures. The reason is that the SIFT descriptor uses the gradient information of the object; it is therefore thought that SIFT could not detect feature amounts on objects with few edges, such as the object which has few textures, and the correct answer ratio decreased.
d) The Color Indexing: The correct answer ratio decreased under the illumination change. The reason is that the RGB color system used by Color Indexing is easily affected by the illumination change; it is therefore thought that the values of the three-dimensional color histogram changed and the correct answer ratio decreased.

4 CONCLUSION

In this study, we proposed an object detection method using an invariant feature based on the local hue histogram in the divided areas of an object area for the human support robot. To show that the proposed method satisfies all seven properties, as follows,

1. Robustness against the rotation change
2. Robustness against the scale change
3. Robustness against the illumination change
4. Robustness against the distortion by perspective projection
5. Robustness against the occlusion
6. Detecting an object which has few textures
7. Detecting different objects which have the same features of the hue histogram

we carried out experiments. As a result, we could show that the proposed method satisfies all seven properties. However, the detection accuracy is limited, because the proposed method uses only two-dimensional information. Therefore, in the future, we aim to improve the detection ability by using three-dimensional information such as shape information.

REFERENCES
[1] S. Sugano, T. Sugaiwa, and H. Iwata, "Vision System for Life Support Human-Symbiotic-Robot," The Robotics Society of Japan, vol. 27, pp. 596-599, 2009.
[2] T. Odashima, M. Onishi, K. Thara, T. Mukai, S. Hirano, Z. W. Luo, and S. Hosoe, "Development and evaluation of a human-interactive robot platform RI-MAN," The Robotics Society of Japan, vol. 25, pp. 554-565, 2007.
[3] Y. Jia, H. Wang, P. Sturmer, and N. Xi, "Human/robot interaction for human support system by using a mobile manipulator," ROBIO, pp. 190-195, 2010.
[4] H. Fujiyoshi, "Gradient-Based Feature Extraction: SIFT and HOG," IPSJ SIG Technical Report CVIM160, pp. 221-224, 2007.
[5] M. J. Swain and D. H. Ballard, "Color Indexing," IJCV, vol. 7, pp. 11-32, 1991.
[6] K. Tanaka, Y. Hagiwara, and H. Imamura, "Object Detection in Image Using Feature of Invariant based on Histogram of Hue," IEICE, pp. 187-194, 2011.
[7] S. Leutenegger, M. Chli, and R. Y. Siegwart, "BRISK: Binary robust invariant scalable keypoints," in Computer Vision (ICCV), 2011 IEEE International Conference on, pp. 2548-2555, IEEE, 2011.
[8] http://aloi.science.uva.nl/
The batched mode shows the best accuracy because of the use of the document's context. The analysis of already completed documents should be performed with the highest possible precision. Taking into account the requirements for the text recognition engine, the requirements for DSA are stated as: precision by text class of 98%, and recall of 99%.
The proposed solution was validated using the benchmark IAMonDo dataset [1] and the Mobile HandWriting Document (MHWD) dataset.
The structure of the paper is as follows: after this introduction, section 2 presents the review of published research. Section 3 describes the proposed approach and feature design. The experimental validation is presented in section 4. The conclusion is given at the end of the paper.

2 PUBLISHED RESEARCH

The issues of automatic processing and recognition of handwritten input have been discussed in the scientific literature over the last decade, and the problem of classifying hand-drawn data into text/non-text objects is one of the main issues that, when solved, creates a basis for successful text recognition.
In particular, Delaye et al. [8] proposed a context based text/non-text classifier which establishes the current state-of-the-art performance for free-form handwriting documents. It is based on conditional random fields and separates text blocks from diagrams and tables. The recognition rates achieved on the IAMonDo dataset [8] are the following: 99.58% for text blocks; 98.95% for tables; 88.88% for diagrams.
Van Phan et al. [9] proposed a novel method for text/non-text classification in online handwritten documents based on the long short-term memory (LSTM) recurrent neural network. It shows a classification rate of 97.68% on the IAMonDo dataset. Approaches for grouping strokes into shapes and for text-line grouping are analyzed in [10] and [11] respectively. Nevertheless, the complete problem containing both classification and segmentation tasks is rarely analyzed in the published works.
In this work, two machine learning based solutions for real-time and batch processing modes are proposed.

3 PROPOSED SOLUTIONS

3.1 Model architecture

Due to the sequential nature of the stream of strokes, recurrent neural networks (RNN) provide high performance for online handwriting data analysis. Moreover, gated recurrent neural networks learn routing activations with a gating mechanism that controls update and reset actions in the recurrent unit, thus allowing the system to learn long-term dependencies. The gated recurrent unit (GRU) has been recently presented in [12]. Compared to LSTM [13], [14], GRU shows faster convergence during training in terms of wall clock time and number of iterations [15]. The GRU is also faster in terms of prediction time because it has fewer parameters. The GRU model is described by the following equations [12]:

    r_t = σ(W_r x_t + U_r h_{t-1} + b_r)
    u_t = σ(W_u x_t + U_u h_{t-1} + b_u)
    c_t = tanh(W_c x_t + U_c (r_t ⊙ h_{t-1}) + b_c)        (1)
    h_t = u_t ⊙ h_{t-1} + (1 − u_t) ⊙ c_t

where x_t is the input vector at time t; r_t and u_t are the reset and update gate vectors; c_t is the cell output vector; h_t is the hidden state vector; W and U denote weight matrices and b are bias vectors. The activation functions for r and u are sigmoids, and for c it is the hyperbolic tangent.
The proposed artificial neural network architectures for the real-time and batched approaches are presented in Fig. 3 and Fig. 4. We apply polygonal approximation [16] to the strokes' raw data points to robustly reduce the number of stroke points and improve system quality.
The input layer receives normalized feature vectors. Dropout layers (p = 0.5) are active only during training and are included to improve the network generalization performance. The class conditional probabilities are obtained at the output layer.
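Equation (1) can be written out as a small NumPy sketch (an illustration of the standard GRU cell only; the authors' layer sizes, initialisation and training framework are not reproduced here, and the update-gate convention below is one of the two common variants):

    # One GRU step: reset gate, update gate, candidate state, new hidden state.
    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    def gru_cell(x_t, h_prev, p):
        r_t = sigmoid(p["Wr"] @ x_t + p["Ur"] @ h_prev + p["br"])              # reset gate
        u_t = sigmoid(p["Wu"] @ x_t + p["Uu"] @ h_prev + p["bu"])              # update gate
        c_t = np.tanh(p["Wc"] @ x_t + p["Uc"] @ (r_t * h_prev) + p["bc"])      # candidate state
        return u_t * h_prev + (1.0 - u_t) * c_t                                # new hidden state

    def init_gru(n_in, n_hidden, seed=0):
        rng = np.random.default_rng(seed)
        p = {}
        for g in ("r", "u", "c"):
            p["W" + g] = rng.normal(scale=0.1, size=(n_hidden, n_in))
            p["U" + g] = rng.normal(scale=0.1, size=(n_hidden, n_hidden))
            p["b" + g] = np.zeros(n_hidden)
        return p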
Feature / Description (fragment of the feature table):
Difference between current and past points along the X coordinate
Difference between current and past points along the Y coordinate
Signature of the beginning of the stroke (1, otherwise 0)
Squared normalized sum of trace return
Combination
Combination
The MHWD dataset has been collected and labeled with specially developed tools:
sampling tool: an Android application used by respondents who created hand-drawn documents on smartphones and tablets with different characteristics (first of all with different screen sizes);
labeling tool: a desktop application that operates on documents created with the sampling tool and supports labeling and proofreading with subsequent correction of found labeling errors;
validation and conversion tool: a console application that allows samples to be converted from InkML to different formats, including graphics file formats (JPEG and PNG), as well as to check some aspects of labeling consistency automatically.
Manual labeling defined a consistent correspondence between each stroke of the document and the reference value. Thus, each item in the hierarchical document structure shown in Fig. 5 is associated with at least one stroke, and a stroke can belong to only one item.

4.2 Results evaluation

The results evaluation has been performed on the writer-independent test sets, ensuring that the learning system did not see samples from the test set during training.
The evaluation results of the real-time approach on the IAMonDo dataset are given in Table 4. The accuracy of the new-group detection is essential for the object grouping algorithm (97%). Taking into account that the classifier operates on only a limited amount of context, the text classification precision (91%) and recall (92%) are quite high.

Table 4. Evaluation results for the real-time approach of stroke grouping and group classification.
Object grouping
Class        Precision   Recall   Support
Same group   0.76        0.90     513378
New group    0.97        0.91     1676913
Avg./total   0.92        0.91     2190291
Group classification
Class        Precision   Recall   Support
Text         0.91        0.92     414165
Non-text     0.67        0.64     104028
Avg./total   0.86        0.87     518193

The batched approach has been tested on the IAMonDo and MHWDSA datasets (Table 5). The classification of text strokes has a precision of 98% and a recall of 99%.

Table 5. Evaluation results for the batched approach of stroke classification.
Dataset            Class        Precision   Recall   Support
IAMonDo dataset    Text         0.98        0.99     42205
                   Non-text     0.96        0.94     13312
                   Avg./total   0.97        0.97     55517
MHWD dataset       Text         0.981       0.991    17998
                   Non-text     0.948       0.898    3425
                   Avg./total   0.965       0.945    21423
Figure 7. Example of inaccuracies in the classification of formulae: a) real-time approach; b) batched approach.
[2] R. Plamondon and S. N. Srihari, "Online and off-line handwriting recognition: a comprehensive survey," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 22, no. 1, pp. 63-84, 2000.
[5] S-Pen SDK 2.3 Tutorial: Technical Docs [Online]. Available: http://developer.samsung.com/s-pen-sdk/technical-docs/S-Pen-SDK-2-3-Tutorial
[6] E. Indermühle, "Analysis of Digital Ink in Electronic Documents," Doctoral dissertation, University of Bern, 2012.
[7] T. A. Tran, I. S. Na and S. H. Kim, "Page segmentation using minimum homogeneity algorithm and adaptive mathematical morphology," International Journal on Document Analysis and Recognition (IJDAR), pp. 1-19, 2016.
[8] A. Delaye and C.-L. Liu, "Text/non-text classification in online handwritten documents with conditional random fields," Pattern Recognition, pp. 514-521, 2012.
[9] T. Van Phan and M. Nakagawa, "Text/non-text classification in online handwritten documents with recurrent neural networks," in Frontiers in Handwriting Recognition (ICFHR), 2014 14th International Conference, 2014.
[10] E. J. Peterson, T. F. Stahovich, E. Doi and C. Alvarado, "Grouping Strokes into Shapes in Hand-Drawn Diagrams," in AAAI Conference on Artificial Intelligence, 2010.
[11] X.-D. Zhou, D.-H. Wang and C.-L. Liu, "A robust approach to text line grouping in online handwritten Japanese documents," Pattern Recognition, vol. 42, no. 9, pp. 2077-2088.
[12] K. Cho, B. Van Merriënboer, D. Bahdanau and Y. Bengio, "On the Properties of Neural Machine Translation: Encoder-Decoder Approaches," in Proceedings of SSST-8, Eighth Workshop on Syntax, Semantics and Structure in Statistical Translation, 2014.
[13] E. Indermuhle, V. Frinken and H. Bunke, "Mode detection in online handwritten documents using BLSTM neural networks," in Frontiers in Handwriting Recognition (ICFHR), IEEE International Conference, 2012.
[14] S. Otte, D. Krechel, M. Liwicki and A. Dengel, "Local feature based online mode detection with recurrent neural networks," in Frontiers in Handwriting Recognition (ICFHR), IEEE International Conference, 2012.
[15] K. Cho, B. Van Merriënboer, C. Gulcehre, D. Bahdanau, F. Bougares, H. Schwenk and Y. Bengio, "Learning phrase representations using RNN encoder-decoder for statistical machine translation," arXiv preprint arXiv:1406.1078, 2014.
[16] U. Ramer, "An iterative procedure for the polygonal approximation of plane curves," Computer Graphics and Image Processing, vol. 1, no. 3, pp. 244-256.
[17] M. D. Zeiler, "ADADELTA: An adaptive learning rate method," arXiv preprint arXiv:1212.5701, 2012.
[18] M. Liwicki and H. Bunke, "Feature selection for on-line handwriting recognition of whiteboard notes," in Proceedings of the Conference of the Graphonomics Society, 2007.
[19] I. Siddiqi and N. Vincent, "A set of chain code based features for writer recognition," in Document Analysis and Recognition, 2009. ICDAR'09. 10th International Conference, 2009.
[20] Y.-M. Chee, K. Franke, M. Froumentin, S. Madhvanath, J.-A. Magaña, G. Pakosz, G. Russell, M. Selvaraj, G. Seni, C. Tremblay and L. Yaeger, "Ink Markup Language (InkML)," W3C Recommendation, 20 September 2011. [Online]. Available: http://www.w3.org/TR/2011/REC-InkML-20110920/
Multi Feature Region Descriptor based Active Contour Model for Person
Tracking
algorithms were proposed in [16-17]. These methods are based on different complementary features in order to get more robust tracking results, because the performance of a single cue may degrade due to the complex nature of human appearance and environment challenges [4]. Hu et al. [10] proposed to integrate many features into the level-set method. It shows its effectiveness in many challenging situations, but it consumes much time and memory in order to evolve each feature independently.

Recently, the region covariance descriptor has been proposed in order to fuse multiple features in a low-dimensional representation [5, 12]. It is able to capture not only each feature variation but also their correlation. It has been proven to be a very efficient feature for visual tracking [6, 13, 14, 15] in several tracking tasks, and it outperforms the histogram matching method. The original covariance tracker belongs to the appearance-based tracking approach. It estimates the new object position by finding the covariance descriptor that has the minimum covariance distance to the object model. It is a time-consuming algorithm. In addition, the tracking result depends on the initialization. Indeed, if the model incorporates information from the background, the resulting model is not only representative of the tracked object but is also influenced by the immediate environment of the target.

Motivated by the convincing results of active contours, the benefits of using complementary features, and the generic object representation of the covariance region descriptor, we propose a new active contour based on multiple region features enrolled in a covariance region descriptor.

The paper is organized as follows. Section 2 provides a brief mathematical formulation of the level-set method. Section 3 outlines the proposed approach. In Section 4, the target appearance model using multiple features and the covariance region descriptor is explained. Level-set based region evolution is detailed in Section 5. The quantitative and qualitative evaluations are presented in Section 6. Section 7 concludes our paper, and future perspectives are suggested.

2 LEVEL-SET BACKGROUND

The basic idea of the level-set method [13] is to evolve an initial contour until it reaches the boundaries of objects. The object contour is described by an implicit representation, where the contour is represented using a signed distance map. The deformed contour C is considered as the zero level set of a higher-order function φ applied to the image spatial domain, as illustrated in Figure 1 and expressed by Eq. (1), where t is a point in time and (x, y) is a point in space:

C(t) = {(x, y) | φ(x, y, t) = 0}   (1)

The level-set function can be evolved by solving the partial differential equation (PDE):

∂φ/∂t = F |∇φ|   (2)

where F is the speed function and ∇ is the gradient operator.

Figure 1. The level set function.

Thus, C deforms iteratively according to the speed function F until it reaches the border of the object, through the minimization of an energy computed based on different criteria. The minimization process moves the points of the curve until it reaches the border of the target object.

The tracking results depend on the accurate choice of the speed function F. We have proposed a new level-set method based on the covariance region descriptor that will be detailed in the next section.
4 TARGET REPRESENTATION

Let I be a three-dimensional color image. Let F be the W × H × d dimensional feature image extracted from I. The covariance of a region is computed as:

C_i = (1/(N−1)) Σ_{n=1}^{N} (f_n − μ)(f_n − μ)^T   (3)

where μ denotes the mean of the feature vectors f_n over the N pixels of the region. The local binary pattern is given by

LBP_{P,R} = Σ_{i=0}^{P−1} S(g_i − g_c) 2^i,  with S(x) = 1 if x ≥ 0 and S(x) = 0 if x < 0,   (4)

where P is the number of neighbours and R is the radius around the central pixel. g_c denotes the gray value of the central pixel and g_i the gray values of its neighbours. The gradient magnitude is computed from the image derivatives with respect to x and y and defined as:

mag(x, y) = sqrt( I_X²(x, y) + I_Y²(x, y) )   (8)

The covariance of the region outside the contour is computed as:

C_out = [ ∫ (1 − H(φ)) (f(x, y) − μ_out)(f(x, y) − μ_out)^T dx dy ] / [ ∫ (1 − H(φ)) dx dy ]   (12)

where H(·) denotes the Heaviside function.
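The covariance descriptor of Eq. (3) is simple to compute once per-pixel feature vectors are assembled. The sketch below is a minimal illustration with an assumed feature set (coordinates, intensity, and absolute derivatives); the exact features used by the proposed tracker are the ones described in this section.

import numpy as np

def region_covariance(image, x0, y0, x1, y1):
    """Covariance descriptor of the rectangular region [x0:x1, y0:y1]."""
    gy, gx = np.gradient(image.astype(float))
    ys, xs = np.mgrid[y0:y1, x0:x1]
    feats = np.stack([
        xs.ravel(), ys.ravel(),
        image[y0:y1, x0:x1].ravel(),
        np.abs(gx[y0:y1, x0:x1]).ravel(),
        np.abs(gy[y0:y1, x0:x1]).ravel(),
    ], axis=0)                         # d x N feature matrix
    return np.cov(feats)               # (1/(N-1)) sum (f - mu)(f - mu)^T

img = np.random.rand(120, 160)         # stand-in grayscale frame
C = region_covariance(img, 40, 30, 90, 80)   # 5 x 5 descriptor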
does not pick up the whole target accurately because it is based only on an HSV color histogram as the appearance model. The level-set based edge method (b) is based only on intensity edges, without any target information. It is not able to detect a multi-colored object correctly. The third algorithm (c) gets a more precise contour, but it depends on specific prior knowledge; that is why it is inapplicable in complex scenes with various shape deformations. Our algorithm also gives an accurate detection of the object, based only on region information and without any prior knowledge.

6.2 Quantitative Evaluation

In order to evaluate our algorithm quantitatively, we have used the Percentage of Correctly tracked Frames (PCF) over the total number of frames in the sequence. Tracking is considered to be correct if the overlap of the bounding box of the tracking result and that of the ground truth is greater than 25% of the area of the ground truth. The performances in terms of PCF are presented in Table 1.

Table 1. Information of the sequences and the tracking performances in terms of PCF.
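The PCF criterion just described can be stated compactly in code. The following is a minimal sketch assuming axis-aligned (x, y, w, h) boxes and the 25% overlap threshold from the text.

def overlap_area(a, b):
    """Intersection area of two (x, y, w, h) boxes."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    iw = max(0, min(ax + aw, bx + bw) - max(ax, bx))
    ih = max(0, min(ay + ah, by + bh) - max(ay, by))
    return iw * ih

def pcf(tracked_boxes, gt_boxes):
    """Percentage of Correctly tracked Frames over a sequence."""
    correct = sum(
        overlap_area(t, g) > 0.25 * (g[2] * g[3])
        for t, g in zip(tracked_boxes, gt_boxes)
    )
    return 100.0 * correct / len(gt_boxes)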
Figure 5. Tracking results for persons in the PETS dataset, view 01. The tracking performance is correct even in the presence of occlusion.
Figure 6. Tracking results for persons in the PETS dataset, view 07. The tracking is robust even in the presence of scale changes.
Figure 7. Tracking results for elderly people in an indoor dataset. The tracking is robust despite the pose and shape deformations.
Figure 8. Tracking results of the Walking woman sequence. From right to left are the results of the particle filter [23], level-set based edge and particle filter [24], supervised level-set based shape [25], and our method, respectively.
Figure 9. Tracking results of the Walking woman sequence. From top to bottom are the results of covariance tracking [12], level-set based histogram [9], and our method, respectively. Row 1: Frame #47. Row 2: Frame #114. Row 3: Frame #217. Row 4: Frame #373. Row 5: Frame #447.
Approaches for optimization of WEB pages loading via analysis of the speed of
requests to the database
Georgi Petrov Dimitrov, PhD Galina Panayotova, PhD
University of Library Studies and University of Library Studies and
Information Technologies Information Technologies
Sofia, Bulgaria Sofia, Bulgaria
geo.p.dimitrov@gmail.com panayotovag@gmail.com
Iva Kostadinova
University of Library Studies and Information Technologies
Sofia, Bulgaria
kostova.iva@gmail.com
ABSTRACT

In the current article, we provide recommendations for decreasing web page loading time. The biggest reason for slow loading is very frequently not system overload or poorly written code, but the slow execution of SQL queries. The primary approach to optimization considered in this paper is the optimization of the SQL queries themselves. The creation of fast and efficient queries, retrieving the minimal required quantity of data, is necessary for achieving good results. There are many approaches to optimization, but in this article we have attempted to analyze one of the most frequently made mistakes: using a VIEW with a large number of columns. We have made recommendations for avoiding similar problems from the web development stage, in order to achieve a faster result.

KEYWORDS

information system, database optimization, SQL, VIEW, query, Web Application, Business application

I. INTRODUCTION

Why do I think query optimization is important, first of all for you as application developers? The fact is that when your users or your boss monitor the performance of the application you have built, they only see the page load speed. It has to be known that the productivity of applications is always important. How would you feel if you heard the following: "We no longer need this system, because when we try to execute a query we have to wait 2-3 minutes, and we want that to happen right away."

And you want to make sure that your application works faster. You may find that in just 1-2 days you can optimize the performance of your application so that it begins to satisfy the user. Slow pages often appear because we did not pay attention to the small details when we developed the app.

When web applications work slowly, the reasons are usually sought in code optimization, caching, or using better hardware [7, 9, 12]. When it comes to business applications, which exchange data with an RDBMS, very frequently the reason for the slow work of the application is hidden in the SQL queries. In this case, there is a single piece of advice: try to bring the execution time of each query down to a minimum [1, 3]. One of the most important skills for every web application developer and database administrator is the ability to create optimal SQL queries. In the first place, it is necessary to improve the efficiency of SQL queries [5, 6]. Therefore, developers and database administrators must be able to understand the mechanism of the query execution plan and the techniques they can apply for tuning the queries [2].

MySQL has a powerful command that you can use to find out why your queries work slowly. This command is EXPLAIN. EXPLAIN can show you in detail what is actually going on when you run a query. This way, you can find the reasons for the slow execution of queries. But this article is not about EXPLAIN.

The best way to learn to optimize for fast execution is to attempt to write your queries in different ways and compare their execution times.

In the current research, we have analyzed the influence of the selected columns on the speed of query execution in one of the most popular databases, MySQL. The research was made with the help of dbForge Studio for MySQL v. 7.2.53. The analyzed tables have different numbers of rows and columns. Our research is based on queries included on 48 pages of an application developed for our university. The application was developed using Microsoft Visual Studio 2015. The measurement of the page loading speeds was made in the Visual Studio 2015 Enterprise environment.

II. MAIN PART

Following the development of an application, very frequently in the process of work, after a certain amount of time, we note that a certain page loads slowly, which brings discomfort to the user's work, i.e., the user experience becomes worse. Conducting an analysis of the reasons for the increase of the loading time of individual pages is required in order to normalize the application's work. Very frequently the reason for this is slow queries [7].

The sample algorithm for finding the inefficient SQL queries and their optimization is shown in Chart 1.

Chart 1: Algorithm for finding the inefficient SQL queries and their optimization.

Based on the conducted analysis, we have found queries that slow down the page loading. The reasons for creating slow queries are different, but in our research we have focused on the following: the use of queries that work with a VIEW [8, 11]. The conducted analysis shows that one of the reasons for the slow execution in the MySQL environment is the mechanism for SELECT execution based on a VIEW. The reason for this is that MySQL does not support materialized views. The purpose of views in MySQL is to extend functionality by helping developers keep their queries simple, not to improve server performance. That is why, very often, when developers create a new project, they use them. Views in MySQL are handled using one of two different algorithms: MERGE or TEMPTABLE. MERGE is simply a query expansion with appropriate aliases. TEMPTABLE is just what it sounds like: the view puts the results into a temporary table before running the WHERE clause, and there are no indexes on it. And when users start to work with the new system (the new software product) and the data increases, performance slows dramatically. Additionally, developers are keen to make a view universal, that is, with the maximum number of columns they might "need". This is the wrong approach.
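The measurement procedure used in this kind of research can be approximated with a short script: run each query several times, average the wall-clock time, and optionally inspect the plan with EXPLAIN first. The sketch below is illustrative only; the connection settings, view name, and column names are placeholders, not the actual project schema.

import time
import mysql.connector   # pip install mysql-connector-python

conn = mysql.connector.connect(host="localhost", user="app",
                               password="secret", database="university")
cur = conn.cursor()

def avg_time(query, repeats=10):
    """Average execution time of a query in milliseconds."""
    total = 0.0
    for _ in range(repeats):
        start = time.perf_counter()
        cur.execute(query)
        cur.fetchall()                 # force the full result transfer
        total += time.perf_counter() - start
    return 1000.0 * total / repeats

cur.execute("EXPLAIN SELECT * FROM products_view")
print(cur.fetchall())                  # the plan reveals how the view is resolved
print(avg_time("SELECT * FROM products_view"))
print(avg_time("SELECT Product_ID, ProductName, Price FROM products_view"))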
Even worse than that: looking at a short query which just gets a single row from a table by its key, we think it is a simple query, while it can instead be a real monster, with its complexity hidden away in the VIEW definition.

You may even find yourself using cascading views, not a single view. That is, you use a view that consists of several other views. And it may be that when you run a view with a command of the "SELECT *" type, you get only 2-3 columns, but in fact you have received 23, 30 or even more than 100 columns with results that are invisible to you.

The sample algorithm for the execution of a request that includes a VIEW is shown in Chart 2.

Chart 2: Algorithm for execution of a single query with and without a VIEW.

The reason for the slower execution of queries which include a VIEW is most frequently the developer's effort to create a universal VIEW which can be used in almost all situations. We should not forget that in this case we have to work with all of its columns. Of course, sometimes we have to use a VIEW, but even then it is good to carefully plan its structure.

In order to show the impact of using queries with only the minimum number of columns, we have carried out the research whose results are shown in the current paper. The measurements are made, as with the initial queries, on tables with different numbers of columns and different numbers of records. The queries are executed multiple times and the time of execution is shown as an average. The queries being executed are the following:

SELECT * FROM TABLE
SELECT col_1 ... col_N FROM TABLE
SELECT col_1 ... col_30 FROM TABLE
SELECT col_1 ... col_20 FROM TABLE
SELECT col_1 ... col_10 FROM TABLE
SELECT col_1 ... col_5 FROM TABLE

We have shown an example below of the first and last request.

Example:
Code before optimization:
SELECT * FROM PRODUCTS;
Code after optimization:
SELECT
  p.Product_ID,
  p.ProductNumber,
  p.ProductCode,
  p.ProductName,
  p.Price
FROM Products p

In Table 1 we have shown the results of the execution of each query.

Table 1: Time of query execution in milliseconds with different numbers of columns.

Count of columns  SELECT *  Select all columns  Select 30 col.  Select 20 col.  Select 10 col.  Select 5 col.
97                14        14                  5.5             4.5             3               3
72                9         9                   5.5             4.5             3               3
52                8.7       8.7                 6               5               4               3
41                8         8                   5               4               3               3
13                4.2       4.2                 4.1             4.1             3               3
Chart 3: Comparison of the time of execution for the minimum and maximum number of queries. [Bar chart; vertical axis "Time for execution", 0-15 ms.]

Chart 4: Time for execution of the queries depending on the number of columns. [Line chart; values between 3 and 6.1 ms.]

It is obvious that the main reason for the slower execution is not a command from the

Meanwhile, we have conducted an analysis of the influence of the number of selected records on the speed of the queries. The number of selected records is in the 10000 to 400000 range. The queries being executed are the same as the ones shown in the example above. In Table 2, we have shown the results according to the number of queries.

Chart 5: Time for query execution depending on the number of rows.

From the results shown, it is obvious that the selected records have minimal impact on the time for query execution.

In Table 3, we have shown the results of the measurements of the page loading time before and after optimization of the queries' code on part of the project's pages.
satisfactory results [6]. The studies used different types of classifiers. The strength of the connection between texts and stock prices was evaluated by classification metrics (e.g., by accuracy), which are based on how many times the classifier assigns the correct class to the given text.

3 DATA AND METHODOLOGY

The goal of the work was to examine the connection between the content of text documents published on the Internet and the direction of stock price movements, by using classification. A suitable approach had to be taken for working with every aspect of this task: handling prices and texts, and processing the data via classification algorithms.

3.1 Stock prices

For the main part of the work, we used the PX index, which reflects all companies traded on the Prague Stock Exchange (BCPP). The data were downloaded from the stock exchange's website (https://www.pse.cz). For every trading day, we used the closing value of the index. We also decided to examine discussion posts for one company (CEZ). Because BCPP contains data only since 2012, we downloaded it from www.akcie.cz.

3.2 Text data

The examined text data (documents) were downloaded from two sources (see Table 1). All documents were written in the Czech language.

Table 1. Examined text data.

Source    | Documents type                              | Number | Period                          | Average per day
Patria.cz | News articles about the Czech stock market. | 1 244  | 9. 2. 16 to 27. 5. 17 (15 mon.) | 2.63
Akcie.cz  | Discussion posts about 17 Czech companies.  | 20 605 | 14. 3. 08 to 27. 5. 17 (9 y.)   | 6.13

Table 2 shows the information available for every text document. In the subsequent analysis, all these fields apart from author were used.

Table 2. Available characteristics of a document, with a concrete example of a discussion post regarding the company CETV.

Field name | Original in Czech                                                       | Translated to English
datetime   | 2017-05-18 11:49:00                                                     | 2017-05-18 11:49:00
author     | mmmm                                                                    | mmmm
title      | Za vodou koncila na 94.                                                 | Offshore price ended on 94.
text       | ja si myslim ze se dostane nekam k 85 ale nemam kouli samozrejme. :))  | I think that it will get to 85, but I dont have crystal ball of course. :))

For every discussion post (Akcie.cz), it was known to which company on the stock exchange it belongs. However, for news articles (Patria.cz) this information was unknown. Moreover, it was found out that a news article usually comments on multiple companies.

3.3 Classification methodology

We used classification to predict whether a stock price will move up, down or stay constant on the basis of a document's text. Each price movement represented a class. To obtain more diverse and possibly better results, we used both two (only up and down) and three classes for classification. It was expected that the ternary classification would perform worse, as mentioned in [7]. We extracted documents' features from the text by using the bag-of-words model. Every document was represented by a vector with values corresponding to the assigned weights of the words present in the document. For the experiments with all discussion posts (Akcie.cz) and news articles (Patria), values of the PX index were used. For one experiment (referred to as the CEZ experiment), stock prices and discussion posts related to only one company (CEZ) were used.
Document class. Assigning a class to a document was based on the relative price change between two moments and on the threshold value (v) of minimal percentage price change. Formally, the percentage price return R in time t is:

R_t = (p_t − p_{t−1}) / p_{t−1},   (1)

where price p_{t−1} is the closing price of the day when the document was published (or the last working day) and price p_t is the closing price of the next working day. If the price return was in the constant interval (−v, +v), the document was either discarded from further processing (for binary classification) or assigned the constant class label (for ternary classification). If the price return was equal to or larger than +v, the document was labeled as plus, otherwise as minus. We used 0.25, 0.5 and 1.0% as the threshold values.

Text pre-processing and conversion. The text was processed as follows:
1. Join the document title and text into one string.
2. Strip diacritics from the text (convert special Czech letters to their ASCII equivalents).
3. Strip all HTML tags.
4. Lowercase and remove punctuation.
5. Tokenize, i.e., get words (using TreebankWordTokenizer).
6. Filter words: minimal length of 3 letters, exclude numbers.

The edited text had to be converted into a structured format. For this, a Python library called scikit-learn and its vectorizer classes were used. Only words which occurred at least 5 times (for discussion posts) or 10 times (for news articles) in the whole document collection were considered. Those words were converted to a bag-of-words representation using three different weighting schemes [8, pp. 21-26]:
- Term Presence (TP): 1 if a term is present in a document, 0 if not.
- Term Frequency (TF): how many times a term is present in a document.
- TF-IDF: TF (local weight) multiplied by IDF (global weight).

Classification. The converted data were processed again by scikit-learn. The data were split into training (60%) and testing (40%) datasets. Class balancing was not performed. Each of the generated vector representations was processed by 20 different classifiers (with default settings; we did not optimize the parameters of the classifiers). The performance of a classifier was rated by the achieved accuracy (the proportion of correctly classified instances out of all examined instances [9, p. 268]) on the test set.

4 RESULTS AND DISCUSSION

Three different sets of text data (all discussion posts (Akcie.cz), posts related to the CEZ company, and news articles (Patria)), together with information about stock prices, were used to prepare data for classification. Based on the combination of the variable experimental parameters (the number of classes (2 or 3), the minimal percentage change (0.25, 0.5 and 1%) and the weighting scheme for the term-document matrix (TP, TF, TF-IDF)), 54 different sets were created and subsequently processed by 20 classification algorithms.

In total, 1080 classification results were obtained. We evaluated the results for each data set separately, and for each classification set the highest accuracy achieved by any combination of vector type and classification algorithm was found. Our findings are presented in Table 3, Table 4 and Table 5. Class 1 means minus, class 2 constant and class 3 plus.
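The labeling rule of Eq. (1) and the three weighting schemes can be sketched with scikit-learn as follows. Binary counting approximates Term Presence, plain counting gives Term Frequency, and TfidfVectorizer gives TF-IDF. The two-document corpus and the min_df value are illustrative only; in the experiments min_df would be 5 (posts) or 10 (articles), as stated above.

from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer

def label(p_prev, p_next, v=0.005):
    """Class from the price return R_t of Eq. (1) and threshold v (here 0.5%)."""
    r = (p_next - p_prev) / p_prev
    if r >= v:
        return "plus"
    if r <= -v:
        return "minus"
    return "constant"        # discarded in the binary setting

docs = ["offshore price ended on 94",
        "i think that it will get to 85"]

tp = CountVectorizer(min_df=1, binary=True)   # Term Presence (0/1)
tf = CountVectorizer(min_df=1)                # Term Frequency
tfidf = TfidfVectorizer(min_df=1)             # TF * IDF

X_tp = tp.fit_transform(docs)
X_tf = tf.fit_transform(docs)
X_tfidf = tfidf.fit_transform(docs)
print(label(94.0, 95.2), X_tp.shape)          # 'plus' (1.28% > 0.5%)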
If we look at how balanced the datasets are, we can say that for 2 classes they are in all cases relatively well balanced. For 3 classes, there is a clear imbalance. This can be seen especially in Table 6, where 82% of the samples belong to the constant class.

If we compare the classification accuracy for different datasets, we see that for discussion posts it is far higher (+10%) than for news articles. It is interesting that the accuracy obtained by training a classifier on all discussion posts and the PX index (Table 3) is higher than when using only the posts and prices for one company (Table 5).

The highest accuracy was always achieved for 3 classes and a 1% change. If we consider only the 0.25 and 0.50% changes, the accuracy for 2 classes is always better than for 3 classes. Generally, it can be said that the higher the percentage change, the higher the accuracy. However, this does not hold for Patria news articles with 2 classes (Table 4), where the highest accuracy is achieved for the 0.25% change.

Table 6. Comparison of avg. accuracies for vector type.

Vector type | Akcie.cz | CEZ  | Patria
TP          | 0.67     | 0.63 | 0.51
TF          | 0.66     | 0.63 | 0.51
TF-IDF      | 0.67     | 0.63 | 0.53

Table 6 tells us that for discussion posts the used vector type was not very important; however, for news articles the highest accuracy was achieved by TF-IDF.

Table 7. Comparison of avg. accuracies for classification algorithms.

Akcie.cz discussion posts      |           | Patria news articles |
Algorithm          | Avg. acc. | Algorithm            | Avg. acc.
ExtraTrees         | 0.72      | LogisticRegression   | 0.56
MLP                | 0.72      | CalibratedClassifier | 0.56
RandomForest       | 0.71      | SVC                  | 0.56
LogisticRegression | 0.71      | LogisticRegression   | 0.53
LinearSVC          | 0.71      | RidgeClassifier      | 0.53

Table 7 shows the classifiers with the best average performance across all experiments. For discussion posts, the Extremely randomized trees ensemble method was the best, closely followed by the Multi-layer Perceptron neural network. However, out of the best five algorithms for news articles, only one (LogisticRegression) was successful also for the posts. This indicates that for each type of document different algorithms are suitable.

5 CONCLUSION

The goal of the article was to examine the relationship between the content of text documents published on the Internet and the direction of movement of stock prices on the Prague Stock Exchange. For this, text classification was used.

The connection was found, as demonstrated by the achieved classification accuracy. When using binary classification (documents with the constant class were discarded), we achieved an accuracy of 75-78% for discussion posts and about 60% for news articles. For ternary classification, the accuracy was lower (about 65% and 40-50%). However, for all datasets the accuracy when using the highest 1% threshold for minimal price change was about 80%.

During the work, we encountered several problems. The most notable one was a rather small amount of available data, especially for the news articles.

It must be noted that the goal was to examine whether there is a connection between texts and stock prices, not to achieve the highest possible accuracy for each classification algorithm. Because of this, we used only the default settings (parameter values) for the algorithms. An optimization of these parameters might bring a few percent higher accuracy.

There are many options for further research in this area: use clustering/topic models (e.g., LDA) to find document classes based on their content; use bigrams or trigrams as features; take into account the importance (popularity) of the document; use more values for the minimal price change and also other time intervals (more or less than 1 day).
REFERENCES

[1] R. J. Shiller, "From efficient markets theory to behavioral finance," The Journal of Economic Perspectives, vol. 17, no. 1, 2003, pp. 83-104.
Abstract: The paper addresses the problem of efficient goods distribution in logistic networks having a mesh structure. The transfer of goods takes place among the interconnected nodes with non-negligible delay. The stock gathered at the nodes is replenished from external sources as well as from other nodes in the controlled network. External demand is imposed on any node without prior knowledge about the requested quantity. The inventory control is realized through the application of the order-up-to policy implemented in a distributed way. The aim is to provide high customer satisfaction while minimizing the total holding costs. In order to determine the optimal reference stock level for the policy operation at the controlled nodes, a continuous genetic algorithm (GA) is applied and adjusted for the analyzed class of application-centered problems.

Keywords: logistic networks, order-up-to policy, optimization, continuous genetic algorithm.

I. INTRODUCTION

The optimization of logistic network operation is a computationally challenging task. The complex mathematical dependencies and the delayed interaction of system components (e.g., in a practical system the goods cannot be transferred immediately among the nodes) make the numerical analysis of multi-node networks resource-prohibitive. In particular, the determination of the cost (or fitness) function is time-consuming. Moreover, the presence of nonlinearities may lead to many local minima. In the scientific literature, the optimization of logistic systems is examined mainly in the case of basic structures, e.g., when each internal node has only one goods supplier [1]. The most common types of such structures are:

- single-echelon [2, 3]: a single provider connected to the controlled node;
- serial interconnection [4, 5]: all the nodes connected one-by-one in a line;
- tree-like organization [6]-[8]: a particular node replenishes the stock of a few children.

These studies are not sufficient for current logistic systems, where the actually deployed architectures are much more complex. One may argue that, nowadays, the general availability of powerful computing machines creates new opportunities for solving realistic optimization problems which until recently did not exist. However, performing extensive numerical treatment becomes possible only when an efficient method is selected, e.g., within the evolutionary computation domain [9].

The purpose of this paper is to evaluate the usefulness of genetic algorithms (GAs) in the optimization of logistic network performance when subjected to the control of the classical order-up-to (OUT) [10] inventory policy. The research is focused on a sophisticated, yet realistic case of a system with mesh-type topology. In the analyzed structure type, a particular node connected to multiple nodes may play the role of supplier and goods provider to effectuate the stock replenishment decisions. The decisions are taken according to the indications of the OUT policy, deployed in a distributed way. The optimization objective is to determine the reference stock level for each individual node so that the holding costs in the entire system are minimized while at the same time maintaining a given service level.

Since the considered problem has a continuous search domain, applying a basic GA would require translating the system variables (and associated operations) into binary form. Therefore, unlike the typical binary-valued GA implementation, one that resides in the continuous search space is used. Moreover, as opposed to the standard GA tuning procedures, proposed for artificial optimization problems where multiple cost function evaluations are permissible [11], the long time of obtaining the fitness function value in the considered class of networked systems shifts the GA tuning effort towards a constrained number of iterations. The effectiveness of the GA in reaching the optimal network state is evaluated in numerous simulations.

II. SYSTEM DESCRIPTION

A. Actors in Logistic Processes

The paper analyzes the process of goods distribution among the nodes (warehouses, stores, etc.) of a logistic network. Each node has limited capacity to store the goods.
The nodes are connected in a direct manner and a mesh topology is permitted. Each connection is characterized by two attributes:

- delivery delay time (DDT): the time from issuing an order for goods acquisition until their delivery to the ordering node;
- supplier fraction (SF): the percentage of the ordered quantity to be retrieved from a particular supply source selected by the ordering node.

Apart from the initial stock at the nodes, the main source of goods in the network are the external suppliers. There are no isolated nodes that would not be linked to any other controlled node or external supplier, nor nodes that would supply the stock for themselves. In addition, there is a finite path from each controlled node to at least one external source, which means that the network is connected. The system driving factor is the external customer demand imposed on the controlled nodes. The demand can be placed at any node and, as in the majority of practical cases [10, 12], its future value is not known at the moment of issuing an order. The business objective is to ensure high customer satisfaction through fulfilling the external demand, while at the same time avoiding an unnecessary increase of the operational costs. Thus, the optimization purpose is to obtain a high service level at the lowest possible cost of goods storage at the nodes, i.e., minimizing the total network holding cost (HC).

B. Actor Interaction

The considered logistic network consists of N nodes n_i, where index i ∈ N = {1, 2, …, N}, and M external sources m_j, where j ∈ M = {1, 2, …, M}. The set containing all the indices is Ω = {1, 2, …, N + M}. Let l_i(t) denote the on-hand stock level (the quantity of goods currently stored) and d_i(t) the external demand imposed on node i in period t, t = 0, 1, 2, …, T, with T being the optimization time span. The connection between two nodes i and j is unidirectional, characterized by two attributes (σ_ij, γ_ij):

- σ_ij: the SF between nodes i and j, σ_ij ∈ [0, 1];
- γ_ij: the DDT between nodes i and j, γ_ij ∈ [1, δ], where δ denotes the maximum DDT between any two interconnected nodes.

Fig. 1 illustrates the operation sequence at a network node occurring in each period. Let us introduce:

- S_i(t): the quantity of goods sent by node i in period t,
- R_i(t): the quantity of goods received by node i in period t.

The stock level at node i evolves according to

l_i(t+1) = [ l_i(t) + R_i(t) − d_i(t) ]⁺ − S_i(t),   (1)

where (f)⁺ denotes the saturation function (f)⁺ = max{f, 0}. The satisfied external demand s_i(t) at node i in period t (the goods actually sold to the customers) may be expressed as

s_i(t) = min{ l_i(t) + R_i(t), d_i(t) }.   (2)

Consequently, (1) may be rewritten as

l_i(t+1) = l_i(t) + R_i(t) − s_i(t) − S_i(t).   (3)

Let o_i(t) denote the total quantity of goods to be ordered by node i in period t. o_i(t) covers the orders to be realized both by the other controlled nodes and by the external sources. Then, the quantity sent by node i in period t in response to the orders from its neighbors is

S_i(t) = Σ_{j ∈ N} σ_ij(t) o_j(t).   (4)

On the other hand, the quantity of goods received by node i in period t from all its suppliers is

R_i(t) = Σ_{j ∈ Ω} σ_ji(t − γ_ji) o_i(t − γ_ji).   (5)

The nodes try to answer both the external and internal demand. In case of insufficient stock to fulfill all the requests, the ordered quantity is reduced accordingly, yet

∀i:  0 ≤ Σ_j σ_ji(t) ≤ 1.   (6)

When a node receives a request from another controlled node in the network and is able to fulfill it, then σ_ij(t) = σ_ij. Otherwise, σ_ij(t) < σ_ij. It is assumed that the external sources are able to satisfy every order originating from the network (uncapacitated external sources).
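The per-node stock balance of Eqs. (1)-(3) reduces to a few lines of code. The following scalar sketch is illustrative only; the numerical values are arbitrary.

def step_stock(l, received, demand, sent):
    """One period of the node stock balance, Eqs. (2)-(3)."""
    satisfied = min(l + received, demand)       # s_i(t), Eq. (2)
    return l + received - satisfied - sent, satisfied

stock = 50.0
stock, sold = step_stock(stock, received=20.0, demand=35.0, sent=10.0)
print(stock, sold)                              # 25.0 35.0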
C. State-Space Description

For the convenience of further study, a network state-space model will be introduced. The dynamic dependencies can be grouped into

l(t+1) = l(t) + Σ_{γ=1}^{δ} M_γ(t − γ) o(t − γ) − M_0(t) o(t) − s(t),   (7)

where:
- l(t) = [l_1(t), l_2(t), …, l_N(t)]^T is the vector of on-hand stock levels,
- o(t) = [o_1(t), o_2(t), …, o_N(t)]^T is the vector of placed orders,
- s(t) = [s_1(t), s_2(t), …, s_N(t)]^T is the vector of satisfied (external) demands,
- M_γ(t) are the matrices specifying the node interconnections, for each γ ∈ [1, δ]:

M_γ(t) = diag( Σ_{i: γ_i1 = γ} σ_i1(t),  Σ_{i: γ_i2 = γ} σ_i2(t),  …,  Σ_{i: γ_iN = γ} σ_iN(t) ),

- M_0(t) is the matrix describing the stock depletion due to internal shipments:

M_0(t) = [ 0        σ_12(t)  σ_13(t)  …  σ_1N(t)
           σ_21(t)  0        σ_23(t)  …  σ_2N(t)
           σ_31(t)  σ_32(t)  0        …  σ_3N(t)
           …        …        …        …  …
           σ_N1(t)  σ_N2(t)  σ_N3(t)  …  0       ].

Fig. 1. Node operational sequence.

A detailed mathematical description of the node interaction is given in [12]. Below, only the fundamental issues required for the algorithm implementation are covered.

D. Order-Up-To Inventory Policy

One of the popular stock replenishment strategies applied in logistic systems is the OUT inventory policy. This policy attempts to elevate the current stock level to a predefined reference one. A replenishment order is issued if the sum of the on-hand stock level and the goods quantity from pending orders at a node is below the reference level. The reference level should be set so that a high percentage of the external demand is satisfied. The network optimization procedures discussed in this paper provide guidelines for the reference stock level selection under uncertain demand (the future demand is not known precisely while issuing the stock replenishment orders). The operational sequence of the OUT policy is presented in Fig. 2.

Fig. 2. OUT policy operational sequence.

According to [10], the quantity in the replenishment order placed by node i in period t may be calculated as

o_i(t) = l_i^r − l_i(t) − Λ_i(t),   (13)

where:
- l_i^r is the reference stock level set at node i, i ∈ [1, N],
- Λ_i(t) is the quantity of goods from pending orders issued by node i (the orders already placed but not yet realized due to the delay).

In order to allow the application of the OUT policy in a distributed environment, which is considered explicitly in this work, formula (13) needs to be converted into a vector form

o(t) = l^r − l(t) − Σ_{k=1}^{δ} Σ_{τ=t−k}^{t−1} M_k(τ) o(τ),   (14)

where l^r denotes the vector of reference stock levels.

A logistic network should retain a high service level despite imprecise knowledge about the future demand evolution. The system performance is quantified through the fill rate, i.e., the percentage of actually realized customer demand imposed on all the nodes. The optimization objective is to indicate a reference stock level for each node so as to preserve the lowest possible holding costs while keeping the fill rate close to a predefined one, ideally 100%. As a first approximation, using only the knowledge about the highest expected demand in the system d_max, the 100% fill rate is obtained if the reference stock level is selected according to the following formula

l^r = (I_N − M̄ − M̄_0)^{−1} 1 d_max.   (15)

III. GENETIC ALGORITHM

In order to optimize the performance of the considered class of logistic networks according to the objectives stated in Section II, a continuous-domain GA has been implemented. Let the vector containing the reference stock levels of all the controlled nodes be a candidate solution (an individual) in the population used by the GA. The genotypes of each individual correspond to the phenotypes of reference stock levels. Since the optimization must balance the holding cost reduction against the fill rate, the fitness of an individual is computed as

Fitness = α (1 − HC / HC_initial) + β FR,

where α and β are the fitness function shaping coefficients discussed with Table 1 below.

Fig. 3. Genetic algorithm flowchart.

A. Initialization

The initialization stage includes calculating the reference stock levels according to formula (15), i.e., under the assumption that the system is faced with a fixed external demand equal to its largest value throughout the entire optimization time span. This setting allows one to determine the maximum holding cost HC_max as a boundary point for further calculations. Although full customer satisfaction is then obtained, the holding cost is high and needs to be reduced.

D. Crossover

The crossover operation is performed in the typical way for GAs. First, a uniformly distributed random number is generated. Its value does not exceed the gene size of an individual. Then, each pair from the previous selection is used to form two new candidate solutions. More precisely, each individual from a particular pair is divided into two sub-vectors, and two child individuals are formed through swapping these sub-vectors. For two individuals A = [l_A1, l_A2, …, l_AN] and B = [l_B1, l_B2, …, l_BN], the crossover at a point determined by the random number λ, λ ∈ [0, N], results in

C1 = [l_A1, l_A2, …, l_Aλ, l_B(λ+1), …, l_BN],
C2 = [l_B1, l_B2, …, l_Bλ, l_A(λ+1), …, l_AN].
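The crossover just described can be sketched as follows; population handling, selection, and mutation are omitted, and the sample individuals are arbitrary.

import random

def crossover(parent_a, parent_b):
    """Swap tails of two individuals at a random point lam in [0, N]."""
    n = len(parent_a)
    lam = random.randint(0, n)                # crossover point
    child1 = parent_a[:lam] + parent_b[lam:]
    child2 = parent_b[:lam] + parent_a[lam:]
    return child1, child2

A = [120.0, 80.0, 95.0, 60.0]                 # illustrative reference stock levels
B = [100.0, 90.0, 70.0, 85.0]
C1, C2 = crossover(A, B)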
Table 1 groups the data regarding the fill rate and the obtained holding costs for different sets of fitness function shaping coefficients. It allows one to assess the impact of cost reduction vs. ensuring high customer satisfaction.

Fig. 6. Fitness adjustment progress.

It follows from the analysis of the obtained data that even a small change of the fitness function coefficients may have a significant impact on the cost structure and the process of determining the optimal solution. Depending on the circumstances in a given scenario, the relative importance of those factors can be balanced to achieve a desirable solution. Increasing α raises the importance of holding cost reduction in the goods distribution process, while increasing β guarantees a higher fill rate (improved customer satisfaction). A simultaneous increase of both coefficients leads to a state of near full customer satisfaction attained with minimum holding costs.

Figs. 7 and 8 display the stock level evolution at the nodes for the initial and final simulation. As can be seen from the graphs, the GA successfully eliminates superfluous resources (and reduces the holding costs) while keeping the stock positive most of the time, which implies a high fill rate.
Index                  | Category         | Feature Extraction | Classification   | Data Set (#G, #NG)                  | Reported Accuracy (%)
Acharya et al. [5]     | Appearance-based | HOS, GLCM, RLM     | SVM, SMO, NB, RF | In-house (30, 30)                   | 91.00
Krishnan and Faust [6] | Appearance-based | HOS, TT, DWT       | SVM              | In-house (30, 30)                   | 91.67
Ali et al. [7]         | Appearance-based | LBP                | NN               | HRF + In-house (13, 28)             | 95.10
Fondon et al. [8]      | Geometric        | CDR                | Thr              | In-house (-, -)                     | 78.10
Guerre et al. [9]      | Geometric        | CDR, NRIM          | SVM              | In-house (15, 14); In-house (18, 8) | 89.00; 71.00
Dutta et al. [10]      | Geometric        | CDR                | Thr              | HRF (15, 15)                        | 90.00
This may result in obtaining a high computational cost. In addition, even though the geometric and appearance-based features can be easily combined, the fusion of the geometric and appearance-based features has not been investigated thoroughly yet.

In this paper, we propose to fuse geometric and appearance-based features at the feature level. For the geometric feature extraction, a non-iterative coarse-to-fine localization scheme is proposed. Particularly, a matrix multiplication is designed to perform two-dimensional (2-D) mean filtering at the coarse search stage. The principal components analysis (PCA) [11] is adopted for the appearance-based feature extraction. Finally, the total error rate minimization (TER) with a random projection (RP) is adopted for classification. The main contribution of our paper includes i) the proposal of a feature-level fusion scheme based on the geometric and appearance-based features, and ii) the proposal of a matrix projection for two-dimensional mean filtering.

This paper is organized as follows: the proposed feature-level fusion scheme is presented in Section 2. Section 3 provides some experimental results and analysis. Finally, some concluding remarks are presented in Section 4.

2 THE PROPOSED METHOD

In this paper, we propose to fuse geometric and appearance-based features for glaucoma diagnosis. For the geometric features, the cup-to-disc ratio (CDR) and the inferior-superior rim length to nasal-temporal rim length ratio (ISNTR) are extracted from a fundus image. The principal components analysis (PCA) [11] is adopted for the appearance-based feature extraction. The geometric and PCA features are fused at the feature level by feature concatenation. Subsequently, the feature vector is normalized based on the min-max normalization [12], and expanded by the random projection (RP) [13, 14]. Finally, classification is performed based on the total error rate minimization (TER) classifier [15]. Figure 1 shows an overall flow of the proposed method.

Figure 1. An overview of the proposed method. [Flow: Cup-to-Disc Ratio and IS-to-NT Ratio (geometric) plus PCA Feature Vector (appearance); then Feature Concatenation; Min-max Normalization; Feature Expansion by Random Projection; Total Error Rate Minimization; output Glaucomatous/Normal.]

2.1 Image Preprocessing

At the preprocessing stage, image resizing, mask generation, and image cropping are sequentially performed for further feature extraction and classification. 1424 × 2144 RGB images are resized to 650 × 800 based on the bi-cubic interpolation [16]. Then, a weighted image is generated based on the images in the red, green, and blue channels as follows:

W = w_R R + w_G G + w_B B,   (1)

where R ∈ R^{650×800}, G ∈ R^{650×800}, and B ∈ R^{650×800} are the image matrices in the red, green, and blue channels, respectively. Here, W ∈ R^{650×800} denotes the weighted image matrix, and w_R, w_G, and w_B denote the weight values for the red, green, and blue channels, respectively. The purpose of generating a weighted image is to make the cup region distinguishable for the further localization process.

2.2 Geometric Feature Extraction

The CDR and the ISNTR are adopted as the geometric features for glaucoma diagnosis. They are estimated by localizing the disc and cup regions from a fundus image. The CDR is estimated based on three different measures, while the ISNTR is estimated based on a single measure. Thus, the geometric features extracted from a fundus image become a four-dimensional vector.

2.2.1 Coarse Detection of Disc Region

In order to localize the disc and cup regions from an image, we search for the coordinates of the brightest pixel of the 2-D mean filtered weighted image. The mean filtering is performed through matrix multiplications with the banded 0/1 matrices

[F_1^L]_{ij} = 1 if i ≤ j ≤ i + L − 1, and 0 otherwise,   (2)

[F_2^L]_{ij} = 1 if j ≤ i ≤ j + L − 1, and 0 otherwise,   (3)

where F_1^L ∈ I^{R×P} and F_2^L ∈ I^{Q×S} are the matrices for pre- and post-multiplication with a weighted image matrix, and L = 2K + 1. Here, R = P − 2K and S = Q − 2K are set to exclude the boundary pixels of the weighted image matrix for the filtering operation. We set w_R = 0.2, w_G = 0.3, and w_B = 0.5 to generate the weighted image to obtain the brightest pixel coordinates.

From the matrices defined in (2) and (3), the output matrix from the 2-D mean filtering is obtained as follows:

M = (1 / L²) F_1^L W F_2^L,   (4)

where W ∈ R^{P×Q} is a weighted image matrix (P = 650 and Q = 800), and M ∈ R^{R×S} is the output matrix from the 2-D mean filtering. The row and column coordinates of the pixel with the brightest intensity are searched from the matrix M. Here, we denote the row and
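The matrix-multiplication form of the 2-D mean filter in Eqs. (2)-(4) can be checked with a short numpy sketch; the image is a random stand-in and the value of K is an arbitrary choice.

import numpy as np

def mean_filter_matrices(P, Q, K):
    """Build the banded matrices of Eqs. (2)-(3)."""
    L = 2 * K + 1
    R, S = P - 2 * K, Q - 2 * K
    F1 = np.zeros((R, P))
    F2 = np.zeros((Q, S))
    for i in range(R):
        F1[i, i:i + L] = 1.0          # row band of L ones
    for j in range(S):
        F2[j:j + L, j] = 1.0          # column band of L ones
    return F1, F2, L

P, Q, K = 650, 800, 2
W = np.random.rand(P, Q)              # stand-in weighted image
F1, F2, L = mean_filter_matrices(P, Q, K)
M = (F1 @ W @ F2) / (L * L)           # (P-2K) x (Q-2K) mean-filtered image, Eq. (4)
r, c = np.unravel_index(np.argmax(M), M.shape)   # brightest filtered pixel

Each entry of M is the mean of an L × L block of W, since the pre-multiplication sums L consecutive rows and the post-multiplication sums L consecutive columns.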
For the disc localization, the procedure continues as follows:

2. Histogram equalization [16] is performed on the image without vessels for contrast enhancement. The image resulting from the vessel removal and the histogram equalization is defined as H_c ∈ R^{251×251}.

3. A threshold operation is performed on the histogram-equalized image using two threshold values τ_l and τ_r. From an observation regarding the uneven spread of intensity values over the left and right regions of H_c, we apply different threshold values on the left and right halves of H_c. We set τ_l = 0.9 V and τ_r = 0.8 V for right-eye images, and τ_l = 0.8 V and τ_r = 0.9 V for left-eye images. Here, V denotes the intensity value of the brightest pixel of H_c. After the threshold operation, a binary matrix B_d ∈ I^{251×251}, wherein the pixels with 1 values construct a candidate disc region, is obtained.

4. From the binary matrix B_d, the chunk of 1 values whose center of mass is the closest to the coordinates (126, 126) is extracted, and the rest of the chunks of 1 values

For the cup localization, the following steps are performed:

1. Vessels are removed from R_c, G_c, and B_c using the morphological dilation operations [16]. The resulting matrices from the vessel removal are defined as R_r ∈ R^{251×251}, G_r ∈ R^{251×251} and B_r ∈ R^{251×251}, respectively.

2. A weighted image W_r ∈ R^{251×251} is generated from R_r, G_r, and B_r by setting the weight values as w_R = 0.3, w_G = 0.5, and w_B = 0.2.

3. Element-wise matrix multiplication operations are performed on W_r, G_r, and G_c using B_disc to exclude the non-disc regions for the cup localization process. The resulting matrices are defined as W_h = W_r ∘ B_disc, G_h1 = G_r ∘ B_disc, and G_h2 = G_c ∘ B_disc, where ∘ denotes the element-wise multiplication operator.

4. A threshold operation is performed on W_h ∈ R^{251×251} using a threshold value τ_c = 0.9 W̄, where W̄ is the highest intensity value of W_h. We denote the binary matrix resulting from the threshold operation as B_h ∈ I^{251×251}.
Figure 2. Examples of intermediate results: (a) coarse detection of disc region (weighted image, 2-D mean filtered image, coarse detection of disc region, cropped image), (b) disc localization (binary image (single region), disc region - binary, disc region - RGB), and (c) cup localization (binary image 1, binary image 2, element-wise OR operation, cup region - binary, cup region - RGB).
5. Vessels with low intensity values are localized using a matrix difference operation, G_h1 − G_h2, followed by a threshold operation. The threshold value τ_v is set as τ_v = 0.8 J, where J is the intensity value of the brightest pixel in G_h1 − G_h2. We denote the resulting binary matrix as B_v ∈ I^{251×251}.

6. An element-wise OR operation is performed using B_h and B_v. The resulting binary matrix is defined as B_c = B_h ⊕ B_v, where ⊕ stands for the element-wise OR operator. Subsequently, the morphological dilation operation [16] is also applied to B_c.

7. An ellipse fitting based on least squares is applied to the boundaries of the candidate cup region to obtain a fine cup region. The output matrix from the cup localization is defined as B_cup ∈ I^{251×251}. The elements with 1 values belong to the localized cup region, and those with 0 values belong to the non-cup region.

Figure 2 (c) shows examples of the intermediate results which are acquired from the cup localization process.

2.2.4 CDR Estimation

For estimating the value of the cup-to-disc ratio (CDR), three different measures, namely i) the vertical CDR (V_CDR), ii) the horizontal CDR (H_CDR), and iii) the area-based CDR (A_CDR), are defined as follows:

V_CDR = V_cup / V_disc,   (5)
H_CDR = H_cup / H_disc,   (6)
A_CDR = A_cup / A_disc,   (7)

where V_disc, H_disc, and A_disc respectively denote the maximum value of the vertical disc length, that of the horizontal disc length, and the number of pixels in a disc region. Similarly, V_cup, H_cup, and A_cup stand for the corresponding values in a cup region.

2.2.5 ISNTR Estimation

In order to assess the neuro-retinal rim thickness variations, the thicknesses of the inferior, superior, nasal, and temporal rims are calculated from the disc and cup localization results. Subsequently, a ratio of the total length of the nasal and temporal rims over that of the inferior and superior rims (R_ISNT, ISNTR) is obtained as follows:

R_ISNT = (L_N + L_T) / (L_I + L_S).   (8)
TER [15] as follows:

α = ( b I + (1/N⁻) G_tr⁻ᵀ G_tr⁻ + (1/N⁺) G_tr⁺ᵀ G_tr⁺ )⁻¹ ( (1/N⁻) G_tr⁻ᵀ 1⁻ + (1/N⁺) G_tr⁺ᵀ 1⁺ ),   (15)

where 1⁻ = [1, …, 1]ᵀ ∈ N^{N⁻×1} and 1⁺ = [1, …, 1]ᵀ ∈ N^{N⁺×1}. We note that b is a

The images of the NMC data set were captured from left and right eyes. It contains 47 non-glaucomatous and 24 glaucomatous images. Figure 3 shows five non-glaucomatous and five glaucomatous images of the NMC data set.
+
0.9 CDR-Expert
CDR-Vertical
0.8
CDR-Horizontal
0.7 CDR-Area
CDR Value
0.6
0.5
0.4
0.3
0.2
0.1
10 20 30 40 50 60 70
Image Index (1~47: Non-glaucomatous, 48~71: Glaucomatous)
Figure 6. CDR values which are estimated by the proposed method and measured by an expert (ophthalmologist).
Figure 7. Average test classification accuracy performances of the geometric, PCA (before fusion), and combined features (after fusion) with respect to feature dimension variations. [Plot of average test classification accuracy (%) against feature dimension D_rp (10 to 500).]
mining is the classification process [7], [8], [9], [10].

2.1 Classification

Classification is the assignment of certain objects to the appropriate classes based on certain features of these objects. While dividing objects characterized by a variable (qualitative or quantitative), it is necessary to designate certain values of these variables as class limit values, creating a classification scheme.

The simplest classification scheme is a dichotomy, which is a simple division of objects into two classes: a class of objects having a given feature and a class that does not have this feature. An example of such a division is a partition of society into adults (here understood as people over the age of 18) and minors. Another example is a division into women and men.

The classification is based on finding a way of mapping a data set to predefined classes. Based on the content of the database, a model (such as a decision tree or logical rules) is built. It is then used either to classify new objects in the database or to deepen the understanding of existing classes. For example, in medical information systems, classification rules describing individual diseases can be extracted from the knowledge base, and then they can be applied automatically in diagnosing subsequent patients [4], [11].

2.2 Decision Trees

Decision tree models are the most common form of representation of knowledge discovered in the data mining process by today's commercially available software. A decision tree can be treated as a form of description of classification knowledge [12].

Compared to other tree classification methods, decisions can be made very quickly. The primary advantage of using decision trees is a clear and fast representation of knowledge, the ability to use multidimensional data, and the use of large data sets. In addition, the accuracy of this method is comparable with the accuracy of other classification methods. On the other hand, the disadvantage of this method is its high sensitivity to missing values of attributes, as there are no open assumptions about the full availability of the information gathered in the database [2]. That is why it is extremely important to prepare the data properly before proceeding to the analysis. We can use the ERID algorithm, which helps to extract knowledge from incomplete information systems [7], [8], [12].

Classification trees are used to determine the affiliation of objects to the qualitative classes of the dependent variable. This can be done by measuring one or more predictive variables. The classification tree represents the process of dividing a set of objects into several homogeneous classes. The division is based on the values of the features of the objects; the leaves correspond to the classes to which the objects belong, while the edges of the tree represent the values of the attributes on which the division was made [13], [14].

Tree nodes are described by the attributes of the explored relationship. The tree branches specify all possible values of the selected attribute. Tree leaves are values of a class attribute. Classification is done by traversing the tree from the root to the last leaf through the edges described by attribute values [2], [11], [13], [14].

The algorithm for creating a decision tree can be written as follows [2]:
Step 1: For a given set of objects, check whether they belong to the same class (if they do, finish the procedure; if they do not, consider all possible divisions of the given set into the most uniform subsets).
Step 2: Evaluate the quality of each of these subsets according to the previously accepted criterion and select the best one.
Step 3: Divide the set of objects according to the selected division.
Step 4: Repeat the steps for each of the subsets.

For the purpose of this paper, we use a modified C4.5 algorithm. The C4.5 algorithm is one of the two most popular algorithms used in practice. This algorithm is actually an extension of the ID3 algorithm. In this method, we work on an incomplete information system, where, using the containment relation, we build a new dataset which is more complete than the
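As a stand-in for the modified C4.5/J48 tree used in this work, the following scikit-learn sketch trains a CART tree with the entropy criterion (information gain, as in C4.5); scikit-learn does not implement C4.5 itself, and the data are random placeholders for the medical attributes.

import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X = np.random.rand(150, 8)                         # 8 symptom attributes (placeholder)
y = np.random.randint(0, 2, size=150)              # 0 = healthy, 1 = ill

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
tree = DecisionTreeClassifier(criterion="entropy") # information gain, as in C4.5
tree.fit(X_tr, y_tr)
print(tree.score(X_te, y_te))                      # test accuracy
print(tree.feature_importances_)                   # attributes with the greatest impact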
under the curve (AUC). A larger value of the AUC indicates a better model: AUC = 1 (ideal classifier), AUC = 0.5 (random classifier), AUC < 0.5 (invalid classifier, worse than random) [18], [19].

Table 4. Confusion matrix for the J48 algorithm

  a    b   <-- classified as
 77    9   a = 0
 14   52   b = 1

In our experiment we obtained a ROC curve which indicates a very good classifier. The AUC for the whole model is equal to 0.92.
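For readers who want to reproduce the summary statistics, the following minimal Python sketch derives accuracy, sensitivity and specificity from the counts in Table 4 and indicates how an AUC is typically computed from classifier scores with scikit-learn. Treating class 1 as the positive (ill) class is our assumption, not something stated by the authors, and the arrays y_true and y_score are hypothetical.

# Counts taken directly from Table 4 (rows = true class, columns = predicted class).
tn, fp = 77, 9    # true class 0 predicted as 0 / predicted as 1
fn, tp = 14, 52   # true class 1 predicted as 0 / predicted as 1

accuracy = (tp + tn) / (tp + tn + fp + fn)   # about 0.849
sensitivity = tp / (tp + fn)                 # recall for the positive class, about 0.788
specificity = tn / (tn + fp)                 # about 0.895
print(accuracy, sensitivity, specificity)

# AUC is computed from the classifier's scores, not from the confusion matrix alone.
# With scikit-learn (hypothetical arrays y_true and y_score):
#   from sklearn.metrics import roc_auc_score
#   auc = roc_auc_score(y_true, y_score)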
5 CONCLUSIONS

Classification methods are very useful in modern medicine. They are very helpful in finding new symptoms and new methods of treating patients. In this paper, we built a classification model for the dependent variable. It becomes important to find the symptoms that affect whether the patient is ill or not. In this work, we used the J48 method for the classification task. The decision tree algorithm shows which attributes have the greatest impact on ulcerative colitis or are the most strongly linked to it.

Research was performed as a part of project MB/WM/8/2016 and financed with the use of funds for science of MNiSW.

REFERENCES

[1] I. Yoo, P. Alafaireet and M. Marinov, "Data mining in healthcare and biomedicine: a survey of the literature," Journal of Medical Systems, 36(4), pp. 2431-2448, 2012.
[2] L. Breiman, J. H. Friedman, R. A. Olshen and C. J. Stone, Classification and Regression Trees, Wadsworth International Group, Belmont, 1984.
[3] A. Dardzinska, Action Rules Mining, Springer, 2013, p. 90.
[4] W. Frawley, G. Piatetsky-Shapiro and C. Matheus, "Knowledge discovery in databases: an overview," Knowledge Discovery in Databases, pp. 1-27, 1991.
[5] P. S. Levy and K. Stolte, "Statistical methods in public health and epidemiology: a look at the recent past and projections for the next decade," Stat Methods Med Res, 9, pp. 41-55, 2000.
[6] J. Han and M. Kamber, Data Mining: Concepts and Techniques, Morgan Kaufmann Publishers, Second Edition, 2006, pp. 21-27.
[7] Z. Ras and A. Dardzinska, "On rule discovery from incomplete information systems," Proceedings of the ICDM'03 Workshop on Foundations and New Directions of Data Mining (Eds: T. Y. Lin, X. Hu, S. Ohsuga, C. Liau), Melbourne, Florida, IEEE Computer Society, pp. 31-35, 2003.
[8] Z. Ras and A. Dardzinska, "Rule-based Chase algorithm for partially incomplete information systems," Proceedings of the Second International Workshop on Active Mining (AM'2003), Maebashi City, Japan, pp. 42-51, October 2003.
[9] Z. Ras and A. Dardzinska, "On rule discovery from incomplete information systems," Proceedings of the ICDM'03 Workshop on Foundations and New Directions of Data Mining (Eds: T. Y. Lin, X. Hu, S. Ohsuga, C. Liau), Melbourne, Florida, IEEE Computer Society, pp. 31-35, 2003.
[10] Z. Ras and A. Dardzinska, "Rule-based Chase algorithm for partially incomplete information systems," Proceedings of the Second International Workshop on Active Mining (AM'2003), Maebashi City, Japan, pp. 42-51, October 2003.
[11] J. Deogun, V. Raghavan and H. Sever, "Rough set based classification methods and extended decision tables," International Workshop on Rough Sets and Soft Computing, pp. 302-309, 1994.
[12] J. A. Swets, R. M. Dawes and J. Monahan, "Better decisions through science," Scientific American, pp. 82-87, October 2000.
[13] Y. Freund and L. Mason, "The alternating decision tree algorithm," Proceedings of the 16th International Conference on Machine Learning, pp. 124-133, 1999.
[14] W. Frawley, G. Piatetsky-Shapiro and C. Matheus, "Knowledge discovery in databases: an overview," Knowledge Discovery in Databases, pp. 1-27, 1991.
[15] W. Wei, E. P. Xing, C. Myers, I. S. Mian and M. J. Bissel, "Evaluation of normalization methods for cDNA microarray data by k-NN classification," BMC Bioinformatics, 2004.
[16] T. Liu, A. Moore and A. Gray, "Efficient exact k-NN and nonparametric classification in high dimensions," Advances in Neural Information Processing Systems 16 (NIPS), pp. 8-13, December 2003.
[17] A. Kasperczuk and A. Dardzinska, "Comparative evaluation of the different data mining techniques used for the medical database," Acta Mechanica et Automatica, Vol. 10, No. 3, pp. 233-238, 2016.
[18] J. A. Hanley, "Receiver operating characteristic (ROC) methodology: the state of the art," Crit Rev Diagn Imaging, 29(3), pp. 307-335, 1989.
[19] J. A. Hanley and B. J. McNeil, "The meaning and use of the area under a receiver operating characteristic (ROC) curve," Radiology, 143, pp. 29-36, 1982.
performed according to the growing state of the crop.
In the remote control mode, Augmented Reality (AR), as shown in Fig 4, is used in consideration of user operability. AR is a technique that adds information created by digital synthesis or the like to the real information perceived by a person. In addition, as shown in Fig 5, a head mounted display (HMD) is used as the display device. By using AR and HMD, the user can look into the farm displayed with AR in the real space and monitor it while moving the robot, and it becomes possible to operate the robot remotely and intuitively by gripping the displayed objects. In the proposed system, it is expected that the robot can manage a large farm alone, because the robot performs many tasks automatically. In addition, since AR can display information corresponding to markers, it is possible to monitor a plurality of farms at the same time, and an improvement of production efficiency is expected.
In a previous research, Ide [1] proposed an estimation method for the self-position of the robot based on the surrounding RGB information and developed a 3D map generation function using the self-position estimated with SLAM (Simultaneous Localization and Mapping) to realize the route of the robot in the farm in the remote control mode. However, with this function misalignment occurs at the time of alignment when similar RGB information is contained in the image taken by the RGB camera, making it difficult to create a 3D map. Therefore, in this research, we generate maps using ArUco markers [5] to enable the generation of maps even in places with similar RGB information.
The appearance of the system developed in this research is shown in Fig 6. Multiple markers are placed on the farm where the map is to be created. The robot consecutively recognizes the markers, compares each newly recognized scene with the previously recognized scene, and, when the same marker is recognized, acquires the information of the recognized marker, the RGB information, and the depth information. Point cloud data are created from the acquired RGB and depth information, and a map is created using the marker information.

2.1 Hardware configuration

The configuration of the hardware of the robot used in this research is shown in Fig 7. The size of the robot is 640 mm in length, 900 mm in width and 900 mm in height. In addition, a Kinect sensor (RGB + D) and a PC are installed on the robot. As shown in Fig 8, the Kinect sensor is equipped with an RGB camera and a depth camera, and by combining these cameras it is possible to acquire three-dimensional information.

2.2 Processing procedure

As shown in Fig 9, Kinect should always be able to recognize two or more of the markers installed on the farm. We will explain the whole process according to Fig 10.

2.2.1 Obtaining image

We use Kinect's RGB camera and depth camera to create point cloud data. When two or more markers are recognized, RGB information and depth information are acquired from Kinect.

2.2.2 Marker recognition

The robot detects and recognizes markers based on the information acquired from Kinect's RGB camera. Since the map is generated considering the angle and position of each marker, the robot acquires the rotation matrix, the translation vector, the marker ID, and the two-dimensional coordinates of the four corners and the center of the marker.

2.2.3 Creating point cloud data

We create 3D point cloud data using the RGB information and depth information obtained from Kinect.
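To make the marker-recognition step concrete, here is a minimal Python sketch using OpenCV's aruco module, which is one common way to obtain the rotation matrix, translation vector, marker ID and corner coordinates described above. The marker dictionary, marker size, camera calibration values and the input image are placeholders, and this is an illustration under our assumptions, not the authors' code.

import cv2
import numpy as np

# Placeholder calibration data; in practice these come from calibrating the Kinect RGB camera.
camera_matrix = np.array([[525.0, 0.0, 319.5],
                          [0.0, 525.0, 239.5],
                          [0.0, 0.0, 1.0]])
dist_coeffs = np.zeros(5)
marker_length = 0.15  # marker side length in metres (assumed)

# Legacy aruco API (OpenCV with contrib modules); newer releases expose cv2.aruco.ArucoDetector instead.
dictionary = cv2.aruco.getPredefinedDictionary(cv2.aruco.DICT_4X4_50)

frame = cv2.imread("scene.png")  # hypothetical RGB frame captured from the Kinect
corners, ids, _ = cv2.aruco.detectMarkers(frame, dictionary)

if ids is not None:
    # One rotation/translation per detected marker, expressed in the camera frame.
    rvecs, tvecs, _ = cv2.aruco.estimatePoseSingleMarkers(
        corners, marker_length, camera_matrix, dist_coeffs)
    for marker_id, corner, rvec, tvec in zip(ids.flatten(), corners, rvecs, tvecs):
        R, _ = cv2.Rodrigues(rvec)                      # 3x3 rotation matrix of the marker
        center_2d = corner.reshape(4, 2).mean(axis=0)   # centre of the four corners in the image
        print(marker_id, R, tvec.ravel(), center_2d)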
2.2.4 Map generation

As an example, suppose that there is a scene A created at point A and a scene B created at point B.

2.2.4.1 Comparison of marker IDs for each scene

When generating the map, it is necessary to match the same data in the point cloud data of each scene. Therefore, the robot searches for the same marker ID on consecutive scenes; furthermore, if the same markers are found, the robot calculates the rotation matrix and translation vector between the scenes.

2.2.4.2 Calculation of rotation matrix and translation vector between scenes

First we find the rotation matrix R. Fig 11 shows how to determine the rotation matrix. We assume n_m to be the normal vector of the marker. We assume n_A to be the normal vector of the camera of scene A and n_B to be the normal vector of the camera of scene B. We denote by R_A the rotation matrix converting the normal vector n_m to the normal vector n_A. Similarly, we denote by R_B the rotation matrix converting the normal vector n_m to the normal vector n_B. Since the rotation matrix R is the rotation matrix from scene A to scene B, it is expressed by the following expression:

R = R_B R_A^(-1)    (1)

Next, we calculate the translation vector. Since the translation vector is the same as the movement amount of the Kinect, it can be obtained from the difference between the center coordinates of the markers in each scene. It is necessary to equalize the angle of the markers between the scenes; this can be done using the rotation matrix R obtained earlier. We assume C_A to be the center coordinate of the marker of scene A and C_A' to be the center coordinate of the marker of scene A whose rotation is matched with the marker of scene B. The center coordinate C_A' can be obtained from the following expression:

C_A' = R C_A    (2)

We assume that the center coordinate of the marker of scene B is C_B. Using the result of equation (2), the translation vector T can be calculated by the following equation:

T = C_B - C_A'    (3)

2.2.4.3 Map construction

Using the results of equation (1) and equation (3), the robot aligns the positions of the point cloud of scene A with the position of the point cloud of scene B. Assuming that the coordinates of the point cloud of scene A are P and the coordinates of the point cloud of scene A matched to the position of scene B are P', P' is expressed by the following equation:

P' = R P + T    (4)

Every time this map generation process is performed, the point cloud data is acquired to create a map. The created map is shown in Fig 12.
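The alignment in equations (1)-(4) is an ordinary rigid transformation, and a minimal NumPy sketch of it is shown below. The variable names follow our notation above, the marker poses are assumed to have already been obtained as in Section 2.2.2, and the example values are made up, so this is an illustration rather than the authors' implementation.

import numpy as np

def scene_to_scene_transform(R_A, c_A, R_B, c_B):
    # Equation (1): rotation from scene A to scene B via the shared marker.
    R = R_B @ np.linalg.inv(R_A)
    # Equations (2) and (3): rotate scene A's marker centre, then take the difference.
    c_A_rot = R @ c_A
    T = c_B - c_A_rot
    return R, T

def align_point_cloud(points_A, R, T):
    # Equation (4): P' = R P + T applied to every point of scene A.
    return (R @ points_A.T).T + T

# Hypothetical example: the same marker seen from two camera poses.
R_A = np.eye(3)
c_A = np.array([0.2, 0.0, 1.5])
theta = np.radians(30.0)
R_B = np.array([[np.cos(theta), 0.0, np.sin(theta)],
                [0.0, 1.0, 0.0],
                [-np.sin(theta), 0.0, np.cos(theta)]])
c_B = np.array([0.1, 0.0, 1.2])

R, T = scene_to_scene_transform(R_A, c_A, R_B, c_B)
points_A = np.array([[0.2, 0.0, 1.5], [0.3, 0.1, 1.4]])
print(align_point_cloud(points_A, R, T))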
3 Evaluation experiment

In this experiment, we evaluate the accuracy of the map by comparing the map generation function created by Ide, which is a previous study, with the map generation function created in this research. The map of a room is created in the same place while moving the robot manually. In the map generation function of the previous study, as shown in Fig 13, a poster carrying RGB information is pasted on a place where RGB information is scarce and a map is created. In this study, ArUco markers were pasted as shown in Fig 14 and the map was created. We created a map of the same room and asked 10 subjects to evaluate it on a 5-point scale. The evaluation items are shown below. Figures 15 and 16 show the resulting maps.
[1] Is the map more reproducible than the map created with the map generation function of the previous research?
[2] Does the created map reproduce the corners of the room compared to the actual room?
[3] Are the walls reproduced in the created map compared with the actual room?
The evaluation results are shown in Table 1.

Table 1: Result of evaluation experiment

Evaluation item   Average   Standard deviation
[1]               4.9       0.30
[2]               4.6       0.49
[3]               4.4       0.49

Because the value of [1] is high, we can see that the map created with the proposed method has better room reproducibility than the map generation function of the previous research. In addition, since the average values of items [2] and [3] are also high, the corners of the room and the walls are also well reproduced in the created map.
REFERENCES
ABSTRACT

Speech recognition for academic and standard languages has achieved great development, but in the face of the multiple dialects with their differences there is still a noticeable lack of recognition rates, especially if these dialects are those of non-native speakers of the target language. The challenges facing the recognition or the identification of non-native dialects are numerous, among them the lack of sound databases, whether approved or not.
This article presents part of our work on creating a sound database of non-native English dialects for Arab speakers. This database will be used later by our Automatic System to Help Learning English as a Foreign Language (ASHLEFL). It initially contains three non-native English dialects, of Qatari, Egyptian, and Pakistani speakers respectively.

KEYWORDS

Non-native English Dialect; Sound Database; Arab speakers.

1. INTRODUCTION

This article is part of a project to develop an automatic system to help to learn English as a foreign language. The role of this system is to detect, and then to correct, the common pronunciation errors of the target populations, where we begin with the identification of their dialects. The objective of this paper is the presentation of the construction of an audio database of non-native English dialects of these populations.

2. ENGLISH IN THE ARAB WORLD

The Arab world is the Middle East and North Africa (MENA), divided into 22 countries, 10 of which are African. The Arab region has an area of about 14 million km2, equivalent to 10.2% of the world's area, and contains about 6% of the world population; Arabic is the official language of these countries.
The status of the English language in the Arab world differs between English as a Second Language (ESL) and English as a Foreign Language (EFL).
In its report of 2016, the organization EF "Education First", a language training, educational travel, academic certification and cultural exchange program [1], revealed that MENA countries have the lowest level of English proficiency, with an average EF EPI (English Proficiency Index) of 44.92 in the world (72 countries were involved in the tests, including 10 Arab countries). The Arab countries were ranked in the "very low" proficiency group, with the exception of the United Arab Emirates and Morocco, which were classified as "low", as shown below in Figure 1.
This weakness is attributed to several reasons: historical, religious, cultural, economic, educational, and sometimes political, without forgetting the great difference between the Arabic and English
language systems, which greatly affects the pace of learning, whether English is learned as ESL or EFL [2], [3]. Improving the proficiency of Arabs in mastering the English language requires the development of the educational system and attention to language learning from the first levels of education.

The problem of recognizing academic languages as well as native dialects remains persistent. As for non-native dialects, this is another challenge.
English has taken the lion's share of interest and research as the most widely spoken language (more than 500 million speakers). [4] presents some of the native English dialect databases that have been created, such as ANDOSL, SEC, WSJCAM0 and TIMIT. TIMIT is the most famous and the most widely used speech database. It contains 630 native speakers of American English (70% male and 30% female). Each speaker reads 10 sentences, taking approximately 30 seconds. [5] provides an overview of existing databases of non-native speakers and notes the total absence of Arab speakers of English (Table 1). We note that free or paid databases are still very limited, especially for non-native English dialects in the Arab world, which sometimes obliges laboratories to create their own databases internally. So, to approach this project, we need to create our own audio database of non-native English dialects by Arab speakers.
The voice recordings were performed over several sessions in a real acoustic environment using a Sony Dictaphone. We asked each speaker to read, five times in one session, the first ten numbers, twelve isolated words divided into five groups, five short sentences, and a paragraph of 69 words. It was not a spontaneous reading but a reading prepared by each speaker.
The length of the recordings ranged from 2 to 6 minutes (some speakers did not respect the number of repetitions of the reading). After we received all the recordings, we processed them: we cut them into recordings ranging in length from one second to 29 seconds using the free demo program Power MP3 Cutter Professional Version 6.2 (Figure 2). Each record contains one utterance: the ten numbers, one group of isolated words, one sentence, or the paragraph.

TABLE 2. DIVISION OF SPEAKERS

            FEMALE   MALE   TOTAL
EGYPT         04      03      07
PAKISTANI     02      13      15
QATARI        03      09      12
TOTAL         09      25      34

B. Codification

After cutting the recordings into audio files, we moved to the coding process, where we gave each audio file a nine-digit code with information about the country, city, speaker, and number of the utterance.

Figure 3. Code of recording sound
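Since Figure 3 is not reproduced here, the exact layout of the nine-digit code is not shown. Purely as an illustration of the idea, the following Python sketch builds and parses one possible layout (two digits for the country, two for the city, three for the speaker, two for the utterance number); the field widths and example values are our assumptions, not the authors' scheme.

# Hypothetical layout: CC KK SSS UU (country, city, speaker, utterance), nine digits in total.
def encode(country, city, speaker, utterance):
    return f"{country:02d}{city:02d}{speaker:03d}{utterance:02d}"

def decode(code):
    assert len(code) == 9 and code.isdigit()
    return {
        "country": int(code[0:2]),
        "city": int(code[2:4]),
        "speaker": int(code[4:7]),
        "utterance": int(code[7:9]),
    }

# Example: speaker 12 from country 01, city 03, utterance 07 of the session.
code = encode(1, 3, 12, 7)
print(code)           # "010301207"
print(decode(code))   # {'country': 1, 'city': 3, 'speaker': 12, 'utterance': 7}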