
Foetal Weight Estimation by Support Vector Regression

Fernando Sereno*, J.P. Marques de Sá*, Ana Matos†, João Bernardes††

* FEUP – Faculdade de Engenharia da Universidade do Porto, Portugal
  E-mail: fsereno@fe.up.pt
† HSJ – Hospital de S. João, Dep. Ginecologia e Obstetrícia, Porto, Portugal
†† FMUP – Faculdade de Medicina da Universidade do Porto, Portugal
INEB – Instituto de Engenharia Biomédica, Porto, Portugal

Abstract

Foetal weight estimation based on echographic measurements is of paramount importance. This paper reports results obtained on data taken from a dataset of four Portuguese hospitals that participated in the collection of clinical and echographic data (414 cases) during 1998-99. It first reviews some theoretical concepts from Statistical Learning Theory, then reports results obtained with Support Vector Regression for the prediction of foetal weight in the lower and higher bands, and finally concludes that the errors given by the SVR method are better than those of the traditional formulas and can be improved further.

1 Introduction

Foetal weight estimation based on echographic measurements is of paramount importance in delivery risk assessment [1,2].

The research objective is to determine to what extent SV machines can improve on the 15% error of FW estimation obtained with the prediction formulas in current clinical use [3].

Four Portuguese hospitals participated in the collection of clinical and echographic data (414 cases) during 1998-99, according to a protocol. Each case consists of the foetal weight (FW) at birth and five echographic measurements taken one week before birth: biparietal diameter, cephalic circumference, abdominal circumference, femur length and umbilical artery resistance index.

2 Statistical inference

Parametric statistics aims to create simple statistical methods of inference that can be used to solve real-life problems and is based on the assumption that the investigator knows the problem and the function to be found up to a finite number of parameters. Using information about the statistical law and the maximum likelihood method applied to the data, one finds the target function and estimates its parameters, which is the essence of classical Fisherian inference.

When one does not have reliable a priori information about the statistical law underlying the problem, about the class of functions, or about the conditions under which one can get better approximations with an increasing number of examples, one is in the general inference approach, a development started by Glivenko, Cantelli and Kolmogorov. In the last 40 years of research this approach has culminated in inductive methods, a different type of inference which is more general and more powerful than parametric inference [4,5].

3 Minimizing the risk functional from empirical data

The basic problem is to formulate a constructive criterion for choosing, from parametric sets of functions, the one function that minimizes the mathematical expectation

$$ R(\alpha) = \int Q(z, \alpha) \, dF(z), \qquad \alpha \in \Lambda, \qquad (1) $$

where Q(z,α) is the loss function, z is a variable that represents random independent observations z_1, …, z_l obtained according to an unknown distribution F(z), α is a parameter from an arbitrary set Λ (which can be a set of scalar quantities, a set of vectors or a set of abstract elements), and the integral is a Lebesgue-Stieltjes integral of a bounded nonnegative function [4].
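To make the abstract loss Q(z, α) in (1) concrete, two standard specializations (taken from [4]; they are added here only as an illustration and are not part of the original text) are the ones used in pattern recognition and in regression estimation:

$$ \text{pattern recognition: } z = (x, y), \; y \in \{0, 1\}, \qquad Q(z, \alpha) = | y - f(x, \alpha) |, $$

$$ \text{regression estimation: } z = (x, y), \; y \in \mathbb{R}, \qquad Q(z, \alpha) = ( y - f(x, \alpha) )^2 . $$

The second choice turns (1) into the functional (3) of Section 4.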

4 The problem of regression estimation

Estimating a stochastic dependence from empirical data pairs (y_1, x_1), …, (y_l, x_l), drawn randomly and independently from a joint distribution function F(x,y), means estimating the conditional distribution function F(y|x). This is often an ill-posed problem [4,5,6]; what can, however, be determined is the mathematical expectation

$$ r(x) = \int y \, dF(y \mid x) \qquad (2) $$

called the regression function. It can be shown that this function can be estimated, for sets of functions f(x,α) in the metric L2(P), by minimizing the functional

$$ R(\alpha) = \int ( y - f(x, \alpha) )^2 \, dF(x, y) \qquad (3) $$
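The claim that minimizing (3) recovers the regression function follows from a standard decomposition (a sketch of the usual argument, cf. [4], added here for completeness). Writing y − f(x,α) = (y − r(x)) + (r(x) − f(x,α)) and noting that the cross term vanishes, because r(x) is the conditional expectation of y given x, one obtains

$$ R(\alpha) = \int ( f(x, \alpha) - r(x) )^2 \, dF(x) + \int ( y - r(x) )^2 \, dF(x, y). $$

The second term does not depend on α, so minimizing (3) is equivalent to minimizing the L2(P) distance between f(x,α) and r(x); if the set of functions is rich enough to contain r(x), the minimizer is the regression function itself.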
5 Principle of Empirical Risk Minimization

One cannot minimize the functional (3) directly, since one ignores the probability distribution function F(x,y) that defines the risk. Instead, one can use the classical induction principle, which consists in minimizing, on the basis of the empirical data pairs (y_1, x_1), …, (y_l, x_l), the empirical risk functional

$$ R_{\mathrm{emp}}(\alpha) = \frac{1}{l} \sum_{i=1}^{l} ( y_i - f(x_i, \alpha) )^2, \qquad \alpha \in \Lambda \qquad (4) $$

It can be shown that, under particular conditions, an empirical estimator such as the functional (4) converges uniformly to the mathematical expectation (3), that is,

$$ \sup_{\alpha \in \Lambda} \bigl( R(\alpha) - R_{\mathrm{emp}}(\alpha) \bigr) \stackrel{P}{\longrightarrow} 0, \quad \text{as } l \to \infty, \qquad (5) $$

which means that the principle of minimizing the empirical risk provides a sequence of functions that converges in probability to the best solution. The distance between the empirical and the expected risk, involving the number of examples l and the capacity h of the function space (a quantity measuring the "complexity" of the space), can be bounded by a probabilistic measure [4,5].
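As a toy illustration of the ERM principle (an added sketch; the one-parameter family and the synthetic data below are invented for the example and are not part of the paper), the empirical risk (4) can be minimized directly over a grid of candidate parameters:

```python
import numpy as np

def f(x, alpha):
    # A simple parametric family: f(x, alpha) = alpha * x.
    return alpha * x

def empirical_risk(alpha, x, y):
    # R_emp(alpha) = (1/l) * sum_i (y_i - f(x_i, alpha))^2, as in (4).
    return np.mean((y - f(x, alpha)) ** 2)

rng = np.random.default_rng(0)
l = 200
x = rng.uniform(0.0, 1.0, l)            # observations drawn from an (unknown) F(x, y)
y = 2.0 * x + rng.normal(0.0, 0.1, l)

alphas = np.linspace(0.0, 4.0, 401)     # candidate parameters alpha in Lambda
risks = [empirical_risk(a, x, y) for a in alphas]
alpha_hat = alphas[int(np.argmin(risks))]
print(f"empirical risk minimizer: alpha = {alpha_hat:.2f}")
```

As l grows, the minimizer of R_emp approaches the minimizer of the expected risk, in line with (5).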

6 SVR Method

In the Support Vector approach the basic problem is, again, to formulate a constructive criterion for choosing from parametric sets of functions the one function that minimizes the expected risk (1). One cannot minimize this functional directly, since one ignores the probability distribution function F(z). Instead, one can use the classical induction principle based on the empirical data pairs (y_1, x_1), …, (y_l, x_l), which consists in minimizing an empirical risk functional, for example (4). There are probabilistic bounds on the distance between the empirical and the expected risk involving the number of examples l and the capacity h of the function space, a quantity measuring the "complexity" of the space.

The solution of the learning problem, f(x,α), is found by solving, for each constant A_m related to a hypothesis space, the optimization problem

$$ \min_{f} \; \frac{1}{l} \sum_{i=1}^{l} Q( y_i, f(x_i) ) + \lambda \, \| f \|_{K}^{2} \qquad (6) $$

subject to

$$ \| f \|_{K} \le A_m, \qquad (7) $$

and then choosing, among the solutions found for each A_m, the one with the best trade-off between empirical risk and capacity [4,5,7]. The regularization parameter λ penalizes functions with high capacity. In Support Vector Regression (SVR) we used the loss function

$$ Q( y_i, f(x_i) ) = | y_i - f(x_i) |_{\varepsilon} \qquad (8) $$

where |·|_ε is the ε-insensitive loss, |u|_ε = max(0, |u| − ε), so that deviations smaller than ε are not penalized. The resulting function has the general form

$$ f(x) = \sum_{i=1}^{l} c_i K(x, x_i) \qquad (9) $$

The data points x_i associated with nonzero c_i are called support vectors; they are the most informative data points and compress the information contained in the training set.
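The experiments reported in the next section used the SVR training algorithm of [8], a Matlab toolbox (cf. the Acknowledgement). Purely as an illustration of the formulation (6)-(9), the sketch below uses the SVR implementation of scikit-learn instead, which expresses the regularization through a parameter C (roughly the inverse of λ) and adds a bias term b to the expansion (9); the data, feature ranges and parameter values are invented for the example.

```python
import numpy as np
from sklearn.svm import SVR

rng = np.random.default_rng(0)

# Invented stand-ins for two echographic features and a foetal-weight-like target.
X = rng.uniform(low=[20.0, 4.0], high=[40.0, 8.0], size=(60, 2))
y = 100.0 * X[:, 0] + 50.0 * X[:, 1] + rng.normal(0.0, 100.0, size=60)

# epsilon-insensitive SVR with a polynomial kernel, cf. (6)-(9).
svr = SVR(kernel="poly", degree=3, C=1000.0, epsilon=0.1)
svr.fit(X, y)

# The fitted function has the kernel-expansion form f(x) = sum_i c_i K(x, x_i) + b;
# the training points x_i with nonzero c_i are the support vectors.
print("number of support vectors:", svr.support_vectors_.shape[0])
print("first coefficients c_i:", svr.dual_coef_.ravel()[:3])
print("bias b:", svr.intercept_)
print("sample predictions:", svr.predict(X[:3]))
```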

7 Experimental results

The SVR training algorithm [8] was tested on two subsets of our foetal weight (FW) data set, corresponding to the inferior and superior tails of the FW distribution function, as shown in figures 1 and 2. The central and most frequent cases do not belong to these sub-sets. The experiment consists in determining the performance on a separate test set.

The predicted FW (FWpred) was computed from two echographic features, abdominal circumference (AC) and femur length (FL), in two different portions of the distribution function with almost the same number of examples. The polynomial kernels used in this experiment were of order ≤ 7, and the ε-insensitive loss function used values of ε > 0.05. The number of support vectors returned by our algorithm was SVinf = 90.5% and SVsup = 96.5%, respectively in the inferior and superior tails of the FW distribution function.

Finally, the error rates obtained were Einf = 11.2% and Esup = 10.0%, in the inferior and superior tails of the FW distribution function, respectively.
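The original experiments used the hospital dataset described in the Introduction and the Matlab toolbox of [8], so they are not reproduced here. Only to make the protocol above explicit, the sketch below outlines one plausible implementation; the error metric (mean absolute error as a percentage of the true weight), the use of scikit-learn and all variable names are assumptions, not the authors' code.

```python
import numpy as np
from sklearn.svm import SVR

def percentage_error(y_true, y_pred):
    # Mean absolute error relative to the true weight, in percent
    # (assumed to correspond to the error figures reported above).
    return 100.0 * np.mean(np.abs(y_true - y_pred) / y_true)

def evaluate_tail(X_train, y_train, X_test, y_test,
                  degree=7, epsilon=0.05, C=1000.0):
    """Train an epsilon-insensitive SVR with a polynomial kernel on one tail of the
    FW distribution and report the test error and the support-vector fraction.
    epsilon is expressed on the scale of the target values."""
    svr = SVR(kernel="poly", degree=degree, epsilon=epsilon, C=C)
    svr.fit(X_train, y_train)
    sv_fraction = 100.0 * len(svr.support_) / len(X_train)
    return percentage_error(y_test, svr.predict(X_test)), sv_fraction

# Hypothetical usage, one call per tail of the FW distribution:
# err_inf, sv_inf = evaluate_tail(X_inf_train, y_inf_train, X_inf_test, y_inf_test)
# err_sup, sv_sup = evaluate_tail(X_sup_train, y_sup_train, X_sup_test, y_sup_test)
```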

[Figure 1 plot: "SVR prediction LOW values of FW"; y-axis: FW (blue) / Estimated FW (red), 1000-4500; x-axis: #case.]

[Figure 2 plot: "SVR prediction HIGH values of FW"; y-axis: FW (blue) / Estimated FW (red), 1000-4500; x-axis: #case.]

Figure 1 – Support Vector Regression (SVR) predicted low foetal weights (FW) (inferior tail of the distribution function). Graphical representation of a sample of 30 real and estimated FW, using an SVR with a polynomial kernel of degree 7, a 10% ε-insensitive loss function and regularization parameter λ = 1000, trained with a separate sub-set of 60 cases. The real foetal weights are ordered increasingly and represented by dots; the corresponding estimated values are represented by circles (the lines connecting these circles are for visualization purposes only).

Figure 2 – Support Vector Regression (SVR) predicted high foetal weights (FW) (superior tail of the distribution function). Graphical representation of a sample of 38 real and estimated FW, using an SVR with a polynomial kernel of degree 7, a 10% ε-insensitive loss function and regularization parameter λ = 1000, trained with a separate sub-set of 66 cases. The real foetal weights are ordered increasingly and represented by dots; the corresponding estimated values are represented by circles (the lines connecting these circles are for visualization purposes only).

8 Conclusions

SVR is equivalent to maximizing the margin between the training examples and the regression function. It is an alternative to other neural networks whose training methods optimize cost functions such as the mean square error, and it can therefore be applied to FW estimation.

SVR is motivated by statistical learning theory, which characterizes the performance of SVR learning using bounds on its ability to predict future data.

The training consists in solving a constrained quadratic optimization problem [4,5,9]. Among other things, this implies that there is a unique optimal solution for each choice of the SVR parameters, unlike other learning machines such as standard Neural Networks trained with backpropagation.
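The constrained quadratic programme mentioned above can be written explicitly. In the C-parameterized form of [4,5] (given here only for completeness; the paper itself follows the λ/A_m formulation of Section 6), training an ε-insensitive SVR amounts to solving

$$ \max_{\alpha, \alpha^*} \; -\frac{1}{2} \sum_{i,j=1}^{l} (\alpha_i - \alpha_i^*)(\alpha_j - \alpha_j^*) K(x_i, x_j) \; - \; \varepsilon \sum_{i=1}^{l} (\alpha_i + \alpha_i^*) \; + \; \sum_{i=1}^{l} y_i (\alpha_i - \alpha_i^*) $$

$$ \text{subject to} \quad \sum_{i=1}^{l} (\alpha_i - \alpha_i^*) = 0, \qquad 0 \le \alpha_i, \alpha_i^* \le C, $$

with the solution f(x) = Σ_i (α_i − α_i^*) K(x, x_i) + b, so that the coefficients of (9) are c_i = α_i − α_i^*. The objective is a concave quadratic form (the kernel matrix is positive semidefinite) under linear constraints, so the problem has no suboptimal local maxima, which is the uniqueness property referred to above.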

Acknowledgement

The authors would like to thank Steve Gunn and the Image Speech & Intelligent Systems Group, University of Southampton, United Kingdom, for letting us experiment with the Matlab software developed for Support Vector Machines for Classification and Regression.

References

[1] Farmer R.M., Medearis A.L., Hirata G.I., Platt L.D., "The Use of a Neural Network for the Ultrasonographic Estimation of Foetal Weight in the Macrosomic Fetus", Am J Obstet Gynecol, May 1992.

[2] Chauhan S.P. et al., "Ultrasonographic estimate of birth weight at 24 to 34 weeks: A multicenter study", Am J Obstet Gynecol, October 1998.

[3] Sereno F., Marques de Sá J.P., Matos A., Bernardes J., "The Application of Radial Basis Functions and Support Vector Machines to the Foetal Weight Prediction", in Dagli C.H. et al. (eds.), Proceedings of ANNIE'2000, Smart Engineering System Design Conference, St. Louis, 2000.

[4] Vapnik V.N., Statistical Learning Theory, New York, Springer, 1998.

[5] Cristianini N., Shawe-Taylor J., An Introduction to Support Vector Machines and Other Kernel-Based Learning Methods, Cambridge, Cambridge University Press, 2000.

[6] Haykin S., Neural Networks – A Comprehensive Foundation (2nd Edition), New York, Prentice Hall, 1999.

[7] Cherkassky V., Mulier F., Learning From Data – Concepts, Theory, and Methods, New York, John Wiley & Sons, Inc., 1998.

[8] Gunn S., Support Vector Machines for Classification and Regression, Image Speech & Intelligent Systems Group, University of Southampton, United Kingdom, 1998.

[9] Evgeniou T., Pontil M., Workshop on Support Vector Machines: Theory and Applications, Center for Biological and Computational Learning, and Artificial Intelligence Laboratory, MIT, Cambridge, 2000.
