
Detecting Signal from Data with Noise

Xianyao Chen

Meng Wang, Yuanling Zhang, Ying Feng
Zhaohua Wu, Norden E. Huang


Laboratory of Data Analysis and Applications, SOA, China
The First Institute of Oceanography, State Oceanic Administration, China
Adaptive Data Analysis and Sparsity
California, 2013
Motivation
Identify the meaning of each IMF: whether it is noise or signal,
and when it is noise or signal.
[Figure: five time-latitude panels (Time: Year, 2003 to 2008; Latitude, -60 to 60)]
NOISE or SIGNAL?

Characteristics of white noise
Two views of white noise: EMD and Fourier
Flandrin et al. 2004, IEEE.
Wu et al. 2004, Proc. Roy. Soc. Lon.
Detecting signal with white noise
Wu et al. 2004, Proc. Roy. Soc. Lon.
[Figure: period axis marked at 1 mon, 1 yr, 10 yr, and 100 yr]
The null hypothesis: the underlying noise is white.
Problem: How to detect signal from colored noise?
[Figure: noise colors (white, pink, red, blue, purple, gray); source: Wikipedia]
Taking red noise as an example
$$\frac{dx}{dt} = f(x, t)$$
General characteristics of noise
$$x(t) = \sum_{k=1}^{p} a_k\, x(t - k\Delta t) + e(t)$$

First study the Auto-Regressive processes


$$S(f) = \frac{\sigma_e^2}{\left| 1 - \sum_{k=1}^{p} a_k\, e^{-2\pi i k f} \right|^2}$$
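The AR(p) recursion and its power spectrum can be tried out numerically. The following is a minimal sketch, assuming Gaussian innovations e(t), a unit sampling interval, and frequencies measured in cycles per sample; the function names `simulate_ar` and `ar_spectrum` are illustrative and do not come from the slides.

```python
import numpy as np

def simulate_ar(coeffs, n, sigma_e=1.0, burn_in=500, seed=0):
    """Simulate x(t) = sum_k a_k x(t - k*dt) + e(t) with Gaussian e(t) (assumed)."""
    rng = np.random.default_rng(seed)
    a = np.asarray(coeffs, dtype=float)
    p = len(a)
    e = rng.normal(0.0, sigma_e, size=n + burn_in)
    x = np.zeros(n + burn_in)
    for t in range(p, n + burn_in):
        # x[t-p:t][::-1] is [x(t-1), x(t-2), ..., x(t-p)]
        x[t] = a @ x[t - p:t][::-1] + e[t]
    return x[burn_in:]  # drop the start-up transient

def ar_spectrum(coeffs, freqs, sigma_e=1.0):
    """Theoretical AR(p) spectrum; freqs in cycles per sample (0 to 0.5)."""
    a = np.asarray(coeffs, dtype=float)
    k = np.arange(1, len(a) + 1)
    # transfer = 1 - sum_k a_k exp(-2*pi*i*k*f), one value per frequency
    transfer = 1.0 - np.exp(-2j * np.pi * np.outer(freqs, k)) @ a
    return sigma_e**2 / np.abs(transfer) ** 2

x = simulate_ar([0.8], n=4096)                       # AR(1) with a_1 = 0.8
freqs = np.fft.rfftfreq(x.size)                      # 0 ... 0.5 cycles per sample
periodogram = np.abs(np.fft.rfft(x)) ** 2 / x.size   # noisy estimate of S(f)
theory = ar_spectrum([0.8], freqs)                   # smooth theoretical curve
```

For a positive a_1 the theoretical curve is largest at low frequency, which is the "red" character discussed on the following slides; the periodogram scatters around that curve.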

[Figure: time-frequency panels for IMF-1 to IMF-5 (axes: Time, Frequency) with a Spectrum panel; legend: IMF-1 to IMF-5 and Data]
[Figure: spectrum versus lnT for IMF 1 to IMF 5]
[Figure: log2(mean normalized energy) versus log2(mean period) for IMFs 1 to 6, with the 99% percentile line]
Colored noise will pass the
significance test based on the
white-noise null hypothesis.
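One way to see why this happens is a Fourier-domain comparison, used here as a stand-in for the EMD-based test on the slides. The sketch below compares the theoretical AR(1) spectrum with the flat spectrum of white noise that has the same total variance; the function name `ar1_to_white_ratio` and the coefficient value 0.9 are assumptions chosen for illustration.

```python
import numpy as np

def ar1_to_white_ratio(a1, freqs):
    """Ratio of the AR(1) spectrum to a white spectrum of equal variance."""
    # AR(1) spectrum for unit-variance e(t); freqs in cycles per sample
    ar1 = 1.0 / np.abs(1.0 - a1 * np.exp(-2j * np.pi * freqs)) ** 2
    # White noise with the same variance as the AR(1) process has this flat level
    white_level = 1.0 / (1.0 - a1**2)
    return ar1 / white_level

freqs = np.array([0.0, 0.01, 0.1, 0.25, 0.5])
ratio = ar1_to_white_ratio(0.9, freqs)
# ratio > 1 at low frequency: red noise carries more energy there than the
# white-noise null expects, so those components can look "significant".
```

At high frequency the ratio drops below one, so the mismatch with the white-noise reference depends on the time scale being tested.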
$$x(t) = a_1\, x(t - \Delta t) + e(t)$$
[Figure: AR1 spectrum versus lnT for IMF 1 to IMF 5, raw (0 to 120) and normalized (0 to 1)]
AR1 - normalized spectrum
[Figure: normalized spectrum versus lnT for IMF 1 to IMF 5]
AR1 - normalized spectrum [1.0 1.2] Δt
Changing sampling rate
$$x(t) = a_1\, x(t - \Delta t) + e(t)$$
AR1 - normalized spectrum [1.0 1.2 1.4] Δt
[Figure: normalized spectrum versus lnT for IMF 1 to IMF 5]
AR1 - normalized spectrum [1.0 1.2 1.4 1.6] Δt
[Figure: normalized spectrum versus lnT for IMF 1 to IMF 5]
AR1 - spectrum [1.0 1.2 1.4 1.6] Δt
[Figure: spectrum (0 to 200) versus lnT for IMF 1 to IMF 5]
Noise is a time series whose characteristics are determined by the
sampling rate.
[Figure: time series panels, horizontal axis 0 to 100]
The true signal will not be destroyed, eliminated, or
distorted by re-sampling, unless the re-sampling interval is
too long to resolve a whole period.

[Figure: time series, horizontal axis 0 to 100, vertical axis -1 to 1]
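The claim that a true signal survives re-sampling can be checked on a pure sinusoid. This is a minimal sketch, assuming re-sampling is done by linear interpolation onto a grid whose spacing is a fractional multiple of the original interval; the slides do not specify the interpolation, and the helper names `resample` and `peak_frequency` are hypothetical.

```python
import numpy as np

def resample(t, x, rate):
    """Linearly interpolate x(t) onto a grid whose spacing is rate * dt."""
    dt = t[1] - t[0]
    t_new = np.arange(t[0], t[-1], rate * dt)
    return t_new, np.interp(t_new, t, x)

def peak_frequency(t, x):
    """Frequency (cycles per unit time) of the largest periodogram peak."""
    freqs = np.fft.rfftfreq(x.size, d=t[1] - t[0])
    power = np.abs(np.fft.rfft(x - x.mean())) ** 2
    return freqs[np.argmax(power)]

t = np.arange(1000.0)                 # original sampling interval dt = 1
x = np.sin(2 * np.pi * t / 20.0)      # period 20, i.e. 0.05 cycles per unit time
peaks = {r: peak_frequency(*resample(t, x, r)) for r in (1.0, 1.2, 1.5, 2.0)}
# Every entry of `peaks` stays close to 0.05: the physical period is unchanged.
```

The peak stays put because each re-sampled grid still places many points inside one period; once the grid spacing approaches half the period, the oscillation can no longer be resolved.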
Noise is a continuous process whose characteristics are
fixed only once it is observed at a specific sampling rate.
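A simple illustration of this dependence, assuming the noise is first-order red noise with a decorrelation time `tau` (a hypothetical parameter, not defined on the slides): sampled every `dt`, such a process behaves as an AR(1) series with lag-one coefficient exp(-dt / tau), so a coarser sampling interval yields a smaller coefficient. The sketch below states that relation and checks it by keeping every second sample of a simulated AR(1) series.

```python
import numpy as np

def ar1_coefficient(dt, tau):
    """Lag-one coefficient of red noise with decorrelation time tau sampled every dt."""
    return np.exp(-dt / tau)

def lag1_autocorr(x):
    """Sample lag-one autocorrelation."""
    x = x - x.mean()
    return float(x[:-1] @ x[1:] / (x @ x))

# AR(1) series with a_1 = 0.9 at the original sampling interval
rng = np.random.default_rng(1)
e = rng.normal(size=200_000)
x = np.zeros_like(e)
for t in range(1, x.size):
    x[t] = 0.9 * x[t - 1] + e[t]

r_full = lag1_autocorr(x)        # close to 0.9
r_half = lag1_autocorr(x[::2])   # close to 0.9**2: same process, coarser sampling
```

The two estimates describe the same underlying process, yet they differ only because of how often it was observed.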
AR1 - normalized spectrum [1.0 1.2 1.4 1.6]
[Figure: normalized spectrum versus lnT for IMF 1 to IMF 5]
Can this feature be identified by Fourier analysis? - NO
[Figure: log-log spectra; legend: 1.0 to 1.9]
Quantify the difference using HHT
Hilbert Marginal Spectrum: $S_{M,k}(\omega)$
Frequency: $\omega$
$$\mathrm{SWMF}_k = \frac{\int_0^1 \omega\, S_{M,k}(\omega)\, d\omega}{\int_0^1 S_{M,k}(\omega)\, d\omega}$$
[Figure: normalized spectrum versus lnT for IMF 1 to IMF 5]
SWMF: Spectrum-Weighted-Mean Frequency
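The spectrum-weighted mean frequency is a ratio of two integrals over the frequency axis, so it can be evaluated directly from a sampled spectrum. The following is a minimal sketch using the trapezoid rule; it assumes the marginal spectrum is already available as an array on a normalized frequency grid from 0 to 1, and the function name `swmf` and the test spectra are illustrative only.

```python
import numpy as np

def swmf(omega, spectrum):
    """Spectrum-weighted mean frequency: int(w * S dw) / int(S dw), trapezoid rule."""
    omega = np.asarray(omega, dtype=float)
    spectrum = np.asarray(spectrum, dtype=float)
    widths = np.diff(omega)

    def integrate(y):
        return np.sum(0.5 * (y[1:] + y[:-1]) * widths)

    return integrate(omega * spectrum) / integrate(spectrum)

omega = np.linspace(0.0, 1.0, 1001)        # normalized frequency axis
flat = swmf(omega, np.ones_like(omega))    # flat spectrum: mean sits mid-band
narrow = swmf(omega, np.exp(-0.5 * ((omega - 0.3) / 0.02) ** 2))  # narrow peak at 0.3
```

A narrow-band spectrum returns a value near its peak, while a broad spectrum returns a value pulled toward the middle of the band, which is why the quantity is useful for tracking how each IMF shifts under re-sampling.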
[Figure: Normalized Frequency versus Re-sampling Rate (1 to 2) for IMF-1 to IMF-5]
Adaptive Null Hypothesis
$H_0$: The time series under investigation contains nothing but random noise.
$H_1$: Real signals are present in the data.
Testing method:
Characteristics of the method
Valid for many different kinds of noise (not all tested)
[Figure: Normalized Frequency versus Re-sampling Rate (1 to 2) for IMF-1 to IMF-5]
Tested:
White
Red (AR, fGn)
Ultraviolet (fGn)
Characteristics of the method
Valid for nonstationary time series

[Figure: three panels against Sample Index (100 to 400), with Year ticks 2007 to 2010]
[Figure: panels for IMF-1 to IMF-4 at re-sampling rates 1.0 to 1.9]
Examples - I
[Figure: time series over 0 to 500; legend: Noise, Noise+0.1*Signal, 0.3 Signal, 0.2 Signal, 0.1 Signal]
[Figure: Normalized Frequency versus Re-sampling Rate for IMF-1 to IMF-4; panels (a) Amplitude = 0, (b) Amplitude = 0.1, (c) Amplitude = 0.2, (d) Amplitude = 0.3]
Examples - II
[Figure: Dow Jones Indices versus Year, 1996 to 2012 (vertical axis x 10^4)]
[Figure: Normalized Frequency versus Re-sampling rate (1 to 2)]
Examples - III Sea Surface Temperature (SST)
[Figure: Daily SST from AMSR-E, 2003 to 2011]
[Figure: Period versus Year, 2003 to 2011]
[Figure: four time-latitude panels (Time: Year, 2003 to 2008; Latitude, -60 to 60)]
[Figure: two Temperature panels (12 to 24 and 26 to 32)]
[Figure: Power Spectrum versus Frequency (cycle per year); legend: 1.0 to 1.9]
[Figure: Normalized Frequency versus Re-Sampling Rate; legend: Annual, IMF-4, IMF-3, IMF-2]
[Figure: Temperature panel (26 to 32) and Power Spectrum versus Frequency (cycle per year); legend: 1.0 to 1.9]
[Figure: Normalized Frequency versus Re-Sampling Rate; legend: Annual, Semi Annual, IMF-3, IMF-2]
Conclusion
An adaptive null hypothesis is proposed for testing the characteristics of the background noise and,
on that basis, detecting signal in data with unknown noise.

The proposed adaptive null hypothesis and fractional re-sampling technique (FRT) have several
advantages for detecting signals in noisy data:
The method is based on one of the general characteristics of noise processes, without a pre-defined
functional form or a priori knowledge of the background noise. This makes it effective
in many real applications, in which neither the signal nor the noise is known before
analysis.
The method is based on EMD, which was developed mainly for analyzing nonlinear and
nonstationary time series. Neither the null hypothesis nor the testing method
involves linear or stationary assumptions. Therefore, the method is valid for nonlinear and
nonstationary processes, which are very often the case in real applications.
