Models Lineals

1-The L1 models

Pere Puig
Departament de Matemàtiques
Universitat Autònoma de Barcelona
Advanced Statistical Modelling Research Group

The shape of the Earth


It was Isaac Newton who first claimed that the Earth is not
spherical but oblate, that is, flattened at the poles
(Principia, 1687).

Newton imagined two wells going down to the center of the Earth:
one from the North Pole and the other from the equator, both
filled with water. The water in the equatorial well is subject to
the centrifugal force, and the water in the polar well is not.
For the two columns of water to be in equilibrium, the equatorial
well must be longer.

Newton estimated a flattening of 1/230. However, the measurements
of the Paris meridian performed by Cassini wrongly suggested that
the Earth was elongated at the poles.
In order to resolve the disagreement, the French Academy of
Sciences organized two expeditions to perform measurements along
a meridian arc at places as distant as possible: one to the
equator (Peru), the other to the far North (Lapland).

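The equatorial mission was led by the French scientists Charles
Marie de La Condamine, Pierre Bouguer and Louis Godin and the
Spanish navy officers Jorge Juan and Antonio de Ulloa. Jorge Juan
wrote:
"...la hipótesis de Cassini está sustentada por experiencias
innegables... Mientras que los contrarios se apoyan en teorías
sutiles que por ingeniosas que fuesen podían estar muy lejos de
la verdad."
("...Cassini's hypothesis is supported by undeniable
experiments... whereas its opponents rely on subtle theories
that, however ingenious, could be very far from the truth.")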

Maire and Boscovich reported in 1755 all the known one-degree
arc lengths, measured in toises:
[Table: the five measured arcs, with latitude and arc length in toises]

The relationship between arc length and latitude is given by
an elliptic integral. For short arcs, a good approximation is
$$a = c + d \sin^2\theta,$$
where $a$ is the length of 1° of latitude centered at latitude
$\theta$, $c$ is the length at the equator and $d$ is the excess
of 1° at the North Pole over one at the equator. The flattening
or ellipticity $f$ is found, to first order, as
$$f \approx \frac{d}{3c}.$$
In 1757 Josip Boscovich used the 5 observations of the table to
solve this overdetermined system of equations (5 equations,
2 unknowns), proposing the following method:
- Consider the residuals $e_i = a_i - c - d \sin^2\theta_i$, $i = 1, \dots, 5$.
- The values of $c$ and $d$ should be chosen in such a way that
  $\sum_i e_i = 0$ and $\sum_i |e_i|$ is minimized.

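The first condition leads to the relationship
$$\bar a = c + d\,\bar s,$$
where $\bar a = \frac{1}{5}\sum_i a_i$
and $\bar s = \frac{1}{5}\sum_i \sin^2\theta_i$.
Therefore, the second condition leads to finding the minimum over $d$ of
$$\sum_i |y_i - d\,x_i|,$$
where $y_i = a_i - \bar a$ and $x_i = \sin^2\theta_i - \bar s$.

How to solve this problem? Given a set of points $(x_i, y_i)$,
$i = 1, \dots, n$, calculate the value of $d$ which minimizes
$$\sum_{i=1}^{n} |y_i - d\,x_i|.$$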

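Exercise: Given the sample $x_1, \dots, x_n$, find the value of $c$ which
minimizes
- $\sum_i |x_i - c|$ (L1 norm)
- $\sum_i (x_i - c)^2$ (L2 norm)
- $\max_i |x_i - c|$ (L∞ norm)
- $\sum_i \omega_i (x_i - c)^2$ (weighted L2 norm, $\omega_i > 0$)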

The solutions are descriptive statistics used in practice.
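
A quick numerical check of the exercise can be sketched in Python. The sample `x`, the weights `w` and the helper `argmin_on_grid` are hypothetical, introduced only for this illustration; the grid search is a crude stand-in for the analytical argument, not a recommended method.

```python
from statistics import mean, median

def argmin_on_grid(objective, lo, hi, steps=20000):
    """Return the grid point in [lo, hi] with the smallest objective value."""
    grid = [lo + (hi - lo) * k / steps for k in range(steps + 1)]
    return min(grid, key=objective)

# Hypothetical sample and weights, chosen only for illustration.
x = [1.0, 2.0, 4.0, 7.0, 11.0]
w = [1.0, 1.0, 2.0, 1.0, 1.0]
lo, hi = min(x), max(x)

# Minimize each of the four criteria of the exercise over a fine grid.
c_l1 = argmin_on_grid(lambda c: sum(abs(xi - c) for xi in x), lo, hi)
c_l2 = argmin_on_grid(lambda c: sum((xi - c) ** 2 for xi in x), lo, hi)
c_linf = argmin_on_grid(lambda c: max(abs(xi - c) for xi in x), lo, hi)
c_wl2 = argmin_on_grid(
    lambda c: sum(wi * (xi - c) ** 2 for wi, xi in zip(w, x)), lo, hi
)

# Candidate descriptive statistics to compare against.
midrange = (min(x) + max(x)) / 2
weighted_mean = sum(wi * xi for wi, xi in zip(w, x)) / sum(w)

print("L1  :", c_l1, "median       :", median(x))
print("L2  :", c_l2, "mean         :", mean(x))
print("Linf:", c_linf, "midrange     :", midrange)
print("wL2 :", c_wl2, "weighted mean:", weighted_mean)
```

Each grid minimizer lands next to the corresponding statistic, which suggests (but does not prove) the answers to the exercise.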

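Proposition: Let $x_1, x_2, \dots, x_n$ be an ordered sample ($x_i < x_{i+1}$) and
$\omega_1, \dots, \omega_n$ a set of weights, $\omega_i > 0$. Consider the strictly increasing
sequence defined as
$$S_k = \sum_{i=1}^{k} \omega_i - \sum_{i=k+1}^{n} \omega_i, \quad k = 1, \dots, n.$$
The minimum of the function
$$g(c) = \sum_{i=1}^{n} \omega_i\,|x_i - c|$$
is attained at $c = c^*$, where
1- $c^* = x_s$ if $S_s$ is the first positive value of the sequence, and
2- $c^*$ is any value of the interval $[x_s, x_{s+1}]$ if $S_s = 0$.
Proof (sketch): $g$ is continuous and piecewise linear, with slope
$-\sum_i \omega_i$ for $c < x_1$ and slope $S_k$ on $(x_k, x_{k+1})$. It therefore
decreases while the slope is negative and increases once the
slope becomes positive.

Remark: according to Edgeworth, $c^*$ is known as the weighted median.

Remark: the statement is not restrictive when some $x_i$ are repeated.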
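
The proposition translates almost line by line into code. The following Python sketch is one possible reading of it; the function name `weighted_median` and the convention of returning an interval `(lo, hi)` of minimizers are choices made here, not part of the slides. The exact test `s_k == 0` is only reliable for weights whose partial sums are represented exactly (for example integers).

```python
def weighted_median(xs, ws):
    """Minimize sum(w_i * |x_i - c|) over c using the partial sums S_k.

    Returns (lo, hi): every c in [lo, hi] is a minimizer, and lo == hi
    when the minimizer is unique.
    """
    if len(xs) != len(ws) or not xs:
        raise ValueError("xs and ws must be non-empty and of equal length")
    if any(w <= 0 for w in ws):
        raise ValueError("weights must be strictly positive")

    pairs = sorted(zip(xs, ws))          # order the sample, carrying the weights
    total = sum(w for _, w in pairs)
    cum = 0.0
    for k, (x, _) in enumerate(pairs):
        cum += pairs[k][1]
        s_k = cum - (total - cum)        # S_k = sum_{i<=k} w_i - sum_{i>k} w_i
        if s_k > 0:                      # case 1: first positive value
            return (x, x)
        if s_k == 0:                     # case 2: flat segment [x_s, x_{s+1}]
            return (x, pairs[k + 1][0])
    raise AssertionError("unreachable, since S_n equals the total weight > 0")


print(weighted_median([1, 2, 4, 7, 11], [1, 1, 1, 1, 1]))  # ordinary median
print(weighted_median([1, 2, 3, 4], [1, 1, 1, 1]))         # whole interval
print(weighted_median([1, 2, 3], [1, 1, 5]))               # heavy last point
```

With equal weights the function reduces to the ordinary median, including the interval of solutions for an even sample size.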

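Example: Boscovich's calculations

First note that
$$\sum_i |y_i - d\,x_i| = \sum_{x_i \neq 0} |x_i| \left| \frac{y_i}{x_i} - d \right| + \sum_{x_i = 0} |y_i|.$$
The values $x_i$ equal to zero have no influence on the minimization
problem.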

Therefore $d$ is the weighted median of the ratios $y_i/x_i$ with
weights $|x_i|$ and, using the relationship given by the first
condition, we obtain $c$.

The resulting estimated ellipticity agrees with Newton's estimate.
Boscovich was very happy because Newton's theory was verified.
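
The calculation can be retraced with a short Python sketch. Two things here are assumptions and not taken from the slides: the five arcs in `arcs` are the values commonly reproduced in the history-of-statistics literature for the Maire and Boscovich data, and the flattening is computed with the first-order approximation $f \approx d/(3c)$. The helper `weighted_median` is a hypothetical name and returns the smallest minimizer only.

```python
import math

# ASSUMPTION: (place, latitude degrees, latitude minutes, length of 1 degree
# in toises), as commonly cited for the 1755 data; not recovered from the slide.
arcs = [
    ("Quito", 0, 0, 56751),
    ("Cape of Good Hope", 33, 18, 57037),
    ("Rome", 42, 59, 56979),
    ("Paris", 49, 23, 57074),
    ("Lapland", 66, 19, 57422),
]

def weighted_median(values, weights):
    """Smallest value at which the cumulative weight reaches half the total."""
    half = sum(weights) / 2
    cum = 0.0
    for v, w in sorted(zip(values, weights)):
        cum += w
        if cum >= half:
            return v
    raise ValueError("empty input")

# s_i = sin^2(latitude), a_i = measured arc length.
s = [math.sin(math.radians(deg + mins / 60)) ** 2 for _, deg, mins, _ in arcs]
a = [length for *_, length in arcs]
s_bar = sum(s) / len(s)
a_bar = sum(a) / len(a)

# Centering enforces the first condition (residuals sum to zero).
x = [si - s_bar for si in s]
y = [ai - a_bar for ai in a]

# Second condition: d is the weighted median of y_i/x_i with weights |x_i|.
ratios = [yi / xi for xi, yi in zip(x, y) if xi != 0]
weights = [abs(xi) for xi in x if xi != 0]
d = weighted_median(ratios, weights)
c = a_bar - d * s_bar

inverse_flattening = 3 * c / d   # first-order approximation, an assumption
print(f"d = {d:.1f} toises, c = {c:.1f} toises, 1/f = {inverse_flattening:.0f}")
```

Because the minimizing slope is one of the ratios $y_i/x_i$, the fitted line passes exactly through one of the observations.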

The L1 solutions
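A general problem: Consider an overdetermined system of
equations $Ax = b$, where $A$ is an $n \times m$ matrix, $n > m$, and
$b \in \mathbb{R}^n$.
A solution of the system could be a vector $x \in \mathbb{R}^m$ which
minimizes the difference between $b$ and $Ax$, $\|b - Ax\|$, for a
certain norm:
L1 solution: $\min_x \sum_{i=1}^{n} |b_i - a_i x|$ (LAD method)
L2 solution: $\min_x \sum_{i=1}^{n} (b_i - a_i x)^2$ (OLS method)
where $a_i$ is the $i$-th row of the matrix $A$.
Note that the condition used by Boscovich, that the residuals sum
to zero, $\sum_i (b_i - a_i x) = 0$, is not imposed here.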

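The L1 solution is equivalent to a linear programming problem:
$$\min_{x,\,e} \sum_{i=1}^{n} e_i \quad \text{subject to} \quad e_i = |b_i - a_i x|, \quad i = 1, \dots, n.$$
These constraints are non-linear, but each can be replaced by two
equivalent linear constraints,
$$e_i \ge b_i - a_i x, \qquad e_i \ge -(b_i - a_i x).$$
Note that this means that $e_i \ge |b_i - a_i x|$, but since our model is
trying to minimize the $e_i$'s, in the optimal solution the value of each
$e_i$ will be taken as $|b_i - a_i x|$.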

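Example: a two-variable prediction model

A company wants to study the sales staff at existing stores to
determine how intelligence and extroversion predict the sales
performance of current employees.
We want to solve an overdetermined system of equations:
$$y_i = \beta_0 + \beta_1 u_i + \beta_2 v_i, \quad i = 1, \dots, n,$$
where $y_i$ is the sales performance, $u_i$ the intelligence score and
$v_i$ the extroversion score of employee $i$.
We look for the L1 solution, that is,
$$\min_{\beta_0, \beta_1, \beta_2} \sum_{i=1}^{n} |y_i - \beta_0 - \beta_1 u_i - \beta_2 v_i|.$$
To do this, we use the function lp of the package lpSolve in R.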

Set up problem:
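
One way the set-up could look is sketched below in Python with `scipy.optimize.linprog`, as a stand-in for the R function `lp`; it is not the slides' own code. The function name `lad_fit` and all data values are hypothetical and serve only to make the example runnable. Note that `linprog` assumes non-negative variables by default, so the coefficients are explicitly declared free.

```python
import numpy as np
from scipy.optimize import linprog

def lad_fit(A, b):
    """L1 (LAD) solution of A x ~ b via the LP: min sum(e), -e <= b - A x <= e."""
    A = np.asarray(A, dtype=float)
    b = np.asarray(b, dtype=float)
    n, m = A.shape
    # Decision vector z = (x_1..x_m, e_1..e_n); only the e_i enter the objective.
    cost = np.concatenate([np.zeros(m), np.ones(n)])
    eye = np.eye(n)
    #   b - A x  <= e   ->  -A x - e <= -b
    # -(b - A x) <= e   ->   A x - e <=  b
    A_ub = np.vstack([np.hstack([-A, -eye]), np.hstack([A, -eye])])
    b_ub = np.concatenate([-b, b])
    bounds = [(None, None)] * m + [(0, None)] * n   # x free, e >= 0
    res = linprog(cost, A_ub=A_ub, b_ub=b_ub, bounds=bounds)
    if not res.success:
        raise RuntimeError(res.message)
    return res.x[:m], res.fun

# HYPOTHETICAL data: one row per employee.
intelligence = [89, 93, 91, 122, 115, 100, 98, 105]
extroversion = [21, 24, 21, 23, 27, 18, 19, 16]
sales = [2625, 2700, 3100, 3150, 3175, 3100, 2700, 2475]

# Columns: constant, intelligence, extroversion.
A = np.column_stack([np.ones(len(sales)), intelligence, extroversion])
b = np.array(sales, dtype=float)

coef, total_abs_error = lad_fit(A, b)
print("coefficients (b0, b1, b2):", coef)
print("sum of absolute errors   :", total_abs_error)
```

The objective value returned by the solver equals the sum of absolute residuals at the fitted coefficients, which is the point of the two-inequality reformulation.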

Properties of the L1 solution:


- For a specific data set it is possible to have multiple
solutions (think of the median!).
- For any problem with k parameters to determine (including
the constant), there is an optimal solution at which at least k
of the values e_i are zero.
An alternative method: check every surface that passes exactly
through k of the data points and keep the one with the minimum
sum of absolute errors.
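
The alternative method can be sketched in plain Python as a brute-force search. The names `solve_square` and `lad_by_enumeration` and the small data set are hypothetical; the approach is only practical for small n and k, since the number of combinations grows quickly.

```python
from itertools import combinations

def solve_square(M, rhs):
    """Gauss-Jordan elimination with partial pivoting; None if M is singular."""
    k = len(M)
    aug = [list(map(float, row)) + [float(r)] for row, r in zip(M, rhs)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            return None
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(k):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [u - factor * v for u, v in zip(aug[r], aug[col])]
    return [aug[i][k] / aug[i][i] for i in range(k)]

def lad_by_enumeration(A, b):
    """Try every exact fit through k of the n points; keep the smallest sum of |errors|."""
    n, k = len(A), len(A[0])
    best_x, best_loss = None, float("inf")
    for rows in combinations(range(n), k):
        x = solve_square([A[i] for i in rows], [b[i] for i in rows])
        if x is None:          # the chosen points do not determine a unique fit
            continue
        loss = sum(
            abs(b[i] - sum(aij * xj for aij, xj in zip(A[i], x)))
            for i in range(n)
        )
        if loss < best_loss:
            best_x, best_loss = x, loss
    return best_x, best_loss

# HYPOTHETICAL data: four points on y = 1 + 2 t and one gross outlier.
A = [[1, t] for t in range(5)]       # columns: constant, t
b = [1, 3, 5, 7, 100]
coef, loss = lad_by_enumeration(A, b)
print("coefficients:", coef, "sum of absolute errors:", loss)
```

In this toy example the search returns the line through the four aligned points, so the whole error is carried by the outlier.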