McKinney, Perktold, Seabold (statsmodels) Python Time Series Analysis SciPy Conference 2011 1 / 29
What is statsmodels?
What is Time Series Analysis?
Talk Overview
Statsmodels development update
http://github.com/statsmodels/statsmodels
http://statsmodels.sourceforge.net
Aside: statistical data structures and user interface
Example data: EEG trace data
[Figure: EEG trace data]
Example data: Macroeconomic data
[Figure: time series panels for cpi, m1, and realgdp; x-axis years 1960 to 2008]
Example data: Stock data
[Figure: stock data for AAPL, GOOG, MSFT, and YHOO; x-axis years 2001 to 2009; y-axis 0 to 800]
Descriptive statistics
Autocorrelation, partial autocorrelation plots
Commonly used for identification in ARMA(p,q) and ARIMA(p,d,q) models
acf = tsa.acf(eeg, 50)
pacf = tsa.pacf(eeg, 50)
[Figure: autocorrelation and partial autocorrelation plots of the EEG data, lags 0 to 50]
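As a rough illustration of what the autocorrelation function measures, here is a minimal NumPy sketch of the sample ACF. It is not the statsmodels implementation: the helper name `sample_acf` and the simulated AR(1) series standing in for the EEG data are assumptions made for this example, and the partial autocorrelation is not covered.

```python
import numpy as np

def sample_acf(x, nlags):
    """Sample autocorrelation for lags 0..nlags (biased estimator)."""
    x = np.asarray(x, dtype=float)
    xc = x - x.mean()
    denom = np.dot(xc, xc)
    return np.array([np.dot(xc[: len(xc) - k], xc[k:]) / denom
                     for k in range(nlags + 1)])

# Stand-in data (assumption): an AR(1) series with coefficient 0.8.
rng = np.random.default_rng(0)
eps = rng.standard_normal(2000)
series = np.zeros_like(eps)
for t in range(1, len(eps)):
    series[t] = 0.8 * series[t - 1] + eps[t]

acf_values = sample_acf(series, 50)
```

For an AR process the ACF decays gradually with the lag, which is the kind of pattern used when choosing p and q.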
Statistical tests
Autoregressive moving average (ARMA) models
One of the most common univariate time series models
The exact log-likelihood can be evaluated via the Kalman filter, but the “conditional” likelihood is easier to compute and is commonly used
statsmodels has tools for simulating ARMA processes with known coefficients a_i, b_i, and for estimation given specified lag orders
import scikits.statsmodels.tsa.arima_process as ap
# lag-polynomial coefficients, including the leading 1 for lag zero
ar_coef = [1, .75, -.25]
ma_coef = [1, -.5]
nobs = 100
y = ap.arma_generate_sample(ar_coef, ma_coef, nobs)
y += 4 # add in constant
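To make the simulation step concrete, the following is a minimal NumPy sketch of the recursion behind an ARMA process. It assumes the textbook form y_t = a_1 y_{t-1} + ... + a_p y_{t-p} + e_t + b_1 e_{t-1} + ... + b_q e_{t-q}, so the coefficients are passed without the leading 1 and without the sign flip of the lag-polynomial convention. The function `simulate_arma`, its burn-in length, and the coefficient values are hypothetical and are not part of statsmodels.

```python
import numpy as np

def simulate_arma(ar_params, ma_params, nobs, burnin=200, seed=0):
    """Simulate y_t = sum_i a_i*y_{t-i} + e_t + sum_j b_j*e_{t-j}.

    Returns the series and its innovations, both of length nobs.
    """
    rng = np.random.default_rng(seed)
    p, q = len(ar_params), len(ma_params)
    n = nobs + burnin
    e = rng.standard_normal(n)
    y = np.zeros(n)
    for t in range(n):
        ar_part = sum(ar_params[i] * y[t - 1 - i]
                      for i in range(p) if t - 1 - i >= 0)
        ma_part = sum(ma_params[j] * e[t - 1 - j]
                      for j in range(q) if t - 1 - j >= 0)
        y[t] = ar_part + e[t] + ma_part
    # Drop the burn-in so the zero start-up values have little influence.
    return y[burnin:], e[burnin:]

# Hypothetical coefficients giving a stationary, invertible process.
a = [0.75, -0.25]
b = [0.5]
y, e = simulate_arma(a, b, nobs=100)
y_with_const = y + 4  # shift by a constant, as on the slide
```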
ARMA Estimation
Vector Autoregression (VAR) models
Matrices A_i are K × K.
Y_t must be a stationary process (sometimes achieved by differencing). A related class of models (VECM) handles nonstationary (including cointegrated) processes.
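One way to check the stationarity requirement for given coefficient matrices is to look at the eigenvalues of the companion matrix. The sketch below assumes the usual VAR(p) form Y_t = A_1 Y_{t-1} + ... + A_p Y_{t-p} + u_t, which does not survive in the text above; the helper names and coefficient values are hypothetical.

```python
import numpy as np

def companion_matrix(coefs):
    """Stack K x K lag matrices A_1..A_p into the (K*p) x (K*p) companion form."""
    coefs = [np.asarray(a, dtype=float) for a in coefs]
    p = len(coefs)
    k = coefs[0].shape[0]
    top = np.hstack(coefs)
    if p == 1:
        return top
    bottom = np.hstack([np.eye(k * (p - 1)), np.zeros((k * (p - 1), k))])
    return np.vstack([top, bottom])

def is_stable(coefs):
    """True when every companion eigenvalue lies strictly inside the unit circle."""
    eigvals = np.linalg.eigvals(companion_matrix(coefs))
    return bool(np.max(np.abs(eigvals)) < 1.0)

# Hypothetical 2-variable VAR(2) coefficients.
A1 = [[0.5, 0.1], [0.0, 0.3]]
A2 = [[0.2, 0.0], [0.1, 0.1]]
stable = is_stable([A1, A2])

# A random walk in each variable (A_1 = I) fails the check.
random_walk_stable = is_stable([np.eye(2)])
```

The random-walk case is the kind of process that differencing turns into a stationary one.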
VAR: Impulse Response analysis
Analyze systematic impact of unit “shock” to a single variable
irf = result.irf(10)
irf.plot()
[Figure: Impulse responses. Panels show each shock → response pair among m1, realgdp, and cpi (for example m1 → m1, realgdp → m1, cpi → m1) over horizons 0 to 10.]
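The quantity plotted in an impulse response analysis can be sketched with a short recursion. Assuming a VAR(p) with lag matrices A_1..A_p, the non-orthogonalized responses satisfy Phi_0 = I and Phi_h = sum over j of Phi_{h-j} A_j. This is a NumPy illustration under that assumption, not a description of what `result.irf` does internally; the function name and coefficients are hypothetical.

```python
import numpy as np

def impulse_responses(coefs, periods):
    """Return Phi_0..Phi_periods; Phi_h[i, j] is the response of variable i
    at horizon h to a unit shock in variable j at horizon 0."""
    coefs = [np.asarray(a, dtype=float) for a in coefs]
    k = coefs[0].shape[0]
    phis = [np.eye(k)]
    for h in range(1, periods + 1):
        phi = np.zeros((k, k))
        for j, a in enumerate(coefs, start=1):
            if j <= h:
                phi += phis[h - j] @ a
        phis.append(phi)
    return np.array(phis)

A1 = np.array([[0.5, 0.1], [0.0, 0.3]])  # hypothetical VAR(1) coefficients
irf_values = impulse_responses([A1], periods=10)
```

For a VAR(1) the recursion reduces to matrix powers of A_1, which gives a quick sanity check.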
VAR: Forecast Error Variance Decomposition
Analyze contribution of each variable to forecasting error
fevd = result.fevd(20)
fevd.plot()
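A forecast error variance decomposition can be sketched from orthogonalized impulse responses. The version below assumes a Cholesky factorization of the residual covariance, so the result depends on the ordering of the variables. The function name, coefficients, and covariance matrix are hypothetical, and this is not necessarily how `result.fevd` is implemented.

```python
import numpy as np

def fevd(coefs, sigma_u, periods):
    """Forecast error variance decomposition from Cholesky-orthogonalized
    impulse responses. result[h-1, i, j] is the share of variable i's
    h-step forecast error variance attributed to shock j."""
    coefs = [np.asarray(a, dtype=float) for a in coefs]
    k = coefs[0].shape[0]
    chol = np.linalg.cholesky(np.asarray(sigma_u, dtype=float))  # lower triangular
    phis = [np.eye(k)]
    for h in range(1, periods):
        phis.append(sum(phis[h - j] @ a
                        for j, a in enumerate(coefs, start=1) if j <= h))
    theta_sq = np.array([(phi @ chol) ** 2 for phi in phis])
    contrib = np.cumsum(theta_sq, axis=0)  # accumulate over horizons
    return contrib / contrib.sum(axis=2, keepdims=True)

A1 = np.array([[0.5, 0.1], [0.2, 0.3]])       # hypothetical VAR(1) coefficients
sigma_u = np.array([[1.0, 0.3], [0.3, 0.5]])  # hypothetical residual covariance
decomp = fevd([A1], sigma_u, periods=20)
```

Each row of `decomp[h-1]` sums to one, and at horizon 1 the first variable's error variance is attributed entirely to its own shock because of the triangular ordering.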
VAR: Statistical tests
Filtering
[Figure: Inflation with its cyclical and trend components; x-axis years 1962 to 2006]
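The surviving text does not name the filter behind this trend and cycle split. One common choice for such a decomposition is the Hodrick-Prescott filter, and the sketch below assumes it. The function name, the smoothing value 1600 (a conventional choice for quarterly data), and the test series are all assumptions, and the dense linear solve is used for clarity rather than speed.

```python
import numpy as np

def hp_filter(y, lamb=1600.0):
    """Split y into (cycle, trend) by penalizing the trend's second differences."""
    y = np.asarray(y, dtype=float)
    n = len(y)
    d2 = np.diff(np.eye(n), n=2, axis=0)  # (n-2) x n second-difference operator
    trend = np.linalg.solve(np.eye(n) + lamb * d2.T @ d2, y)
    return y - trend, trend

# Hypothetical series: linear trend plus a sine-wave cycle.
t = np.arange(120)
series = 0.05 * t + np.sin(2 * np.pi * t / 20)
cycle, trend = hp_filter(series)

# A purely linear series has zero second differences, so it is all trend.
linear_cycle, linear_trend = hp_filter(0.05 * t)
```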
Filtering
[Figure: two panels of filtered series; x-axis years from the 1960s to the 2000s]
Preview: Bayesian dynamic linear models (DLM)
$y_t = F_t' \theta_t + \nu_t, \quad \nu_t \sim N(0, V_t)$
$\theta_t = G \theta_{t-1} + \omega_t, \quad \omega_t \sim N(0, W_t)$
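The two equations above can be filtered with the standard Kalman recursions. The sketch below is a simplified NumPy illustration that assumes a constant F and G and known, constant variances V and W. The function name and the local-level example are hypothetical and do not come from the library previewed in the talk.

```python
import numpy as np

def dlm_filter(y, F, G, V, W, m0, C0):
    """Kalman filter for y_t = F' theta_t + nu_t, theta_t = G theta_{t-1} + omega_t
    with constant F, G and known variances V (scalar) and W (matrix)."""
    m, C = np.asarray(m0, dtype=float), np.asarray(C0, dtype=float)
    means = []
    for obs in y:
        a = G @ m                   # prior mean of theta_t
        R = G @ C @ G.T + W         # prior covariance of theta_t
        f = F @ a                   # one-step forecast of y_t
        Q = F @ R @ F + V           # forecast variance
        A = R @ F / Q               # adaptive (Kalman gain) vector
        m = a + A * (obs - f)       # posterior mean
        C = R - np.outer(A, A) * Q  # posterior covariance
        means.append(m)
    return np.array(means)

# Local-level example: with W = 0 and a diffuse prior, the filtered mean
# is approximately the running average of the observations.
F = np.array([1.0])
G = np.array([[1.0]])
filtered = dlm_filter([1.0, 2.0, 3.0, 4.0], F, G, V=1.0,
                      W=np.zeros((1, 1)), m0=[0.0], C0=[[1e6]])
```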
Preview: DLM Example (Constant+Trend model)
model = Polynomial(2)
dlm = DLM(close_px['AAPL'], model.F, G=model.G, # model
          m0=m0, C0=C0, n0=n0, s0=s0, # priors
          state_discount=.95) # discount factor
[Figure: Constant + Trend DLM; x-axis Nov 2008 to Nov 2009; y-axis 50 to 200]
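The `Polynomial` and `DLM` classes in the example belong to the previewed library and are not reproduced here. As a rough idea of what a constant plus trend model with a state discount factor involves, the sketch below assumes a level and slope state with F = (1, 0)' and G = [[1, 1], [0, 1]], and replaces the explicit W_t by dividing the prior covariance by the discount factor. It also assumes a known observation variance, a simplification relative to the slide's example, whose n0 and s0 priors are not used here. All names and values are hypothetical.

```python
import numpy as np

def discount_trend_filter(y, delta=0.95, V=1.0, m0=(0.0, 0.0), C0_scale=1e6):
    """Filter a level + slope state with discount factor delta.

    Returns the filtered state means, one (level, slope) row per observation.
    """
    F = np.array([1.0, 0.0])                # observe the level only
    G = np.array([[1.0, 1.0], [0.0, 1.0]])  # level_t = level_{t-1} + slope_{t-1}
    m = np.array(m0, dtype=float)
    C = C0_scale * np.eye(2)
    means = []
    for obs in y:
        a = G @ m
        R = G @ C @ G.T / delta             # discounting replaces the explicit W_t
        f = F @ a
        Q = F @ R @ F + V
        A = R @ F / Q
        m = a + A * (obs - f)
        C = R - np.outer(A, A) * Q
        means.append(m)
    return np.array(means)

# Hypothetical noise-free linear series: level 1 at t = 0, slope 2.
t = np.arange(50)
states = discount_trend_filter(1.0 + 2.0 * t)
```

A discount factor closer to 1 gives a smoother, slower-adapting trend; smaller values let the level and slope react faster to recent data.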
Preview: Stochastic volatility models
Future: sandbox and beyond
Conclusions