Tools for Probabilistic Data Analysis in Python*
* in 15 minutes

What have I done?
Physics → mean model (physical parameters → predicted data) → Data
            + noise (stochastic; instrument, systematics, etc.)
A few examples
1 linear regression
2 maximum likelihood
3 uncertainty quantification
Linear regression
if you have:
a linear mean model and
known Gaussian uncertainties
y = mx + b
Linear (mean) models

y = m x + b
y = a₂ x² + a₁ x + a₀
y = a sin(x + ω)

(the last one looks non-linear in ω, but a sin(x + ω) = a cos(ω) sin(x) + a sin(ω) cos(x), which is linear in the two reparametrized coefficients)
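All three are linear in their coefficients, so the same least-squares machinery applies; only the design matrix changes. A sketch (x is assumed to be a numpy array):

import numpy as np

# design matrices for the three models above
A_line = np.vander(x, 2)                          # columns: x, 1
A_quad = np.vander(x, 3)                          # columns: x², x, 1
A_sin  = np.column_stack([np.sin(x), np.cos(x)])  # coefficients (a cos ω, a sin ω)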
Linear regression

\[
A = \begin{pmatrix} x_1 & 1 \\ x_2 & 1 \\ \vdots & \vdots \\ x_n & 1 \end{pmatrix},
\qquad
w = \begin{pmatrix} m \\ b \end{pmatrix}
\]

import numpy as np

# x, y, yerr are numpy arrays of the same shape
A = np.vander(x, 2)
ATA = np.dot(A.T, A / yerr[:, None]**2)
sigma_w = np.linalg.inv(ATA)  # covariance matrix of w
mean_w = np.linalg.solve(ATA, np.dot(A.T, y / yerr**2))  # best-fit w
That's it!
(in other words: "Don't use MCMC for linear regression!")
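To see it run end to end, here is a self-contained sketch; the synthetic data and the true parameters (m = 0.5, b = −0.2) are assumptions for illustration:

import numpy as np

np.random.seed(42)
x = np.sort(np.random.uniform(0, 10, 50))
yerr = 0.1 + 0.4 * np.random.rand(50)
y = 0.5 * x - 0.2 + yerr * np.random.randn(50)

A = np.vander(x, 2)
ATA = np.dot(A.T, A / yerr[:, None]**2)
sigma_w = np.linalg.inv(ATA)
mean_w = np.linalg.solve(ATA, np.dot(A.T, y / yerr**2))

print(mean_w)                     # best-fit (m, b), close to (0.5, -0.2)
print(np.sqrt(np.diag(sigma_w)))  # 1-sigma uncertainties on (m, b)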
Maximum likelihood

if you have:
a non-linear mean model and/or
non-Gaussian/unknown noise

p(data | physics)
"probability of the data given physics"

\[
\ln p(\{y_n\} \,|\, \theta) = -\frac{1}{2} \sum_{n=1}^{N} \frac{\left[ y_n - f_\theta(x_n) \right]^2}{\sigma_n^2} + \text{constant}
\]

Likelihoods (with SciPy)

\[
f_\theta(x_n) = \frac{a}{1 + e^{-b\,(x_n - c)}}
\]

import numpy as np
from scipy.optimize import minimize

def model(theta, x):
    a, b, c = theta
    return a / (1 + np.exp(-b * (x - c)))

def neg_log_like(theta):
    return 0.5 * np.sum(((model(theta, x) - y) / yerr)**2)
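To make the snippet concrete, here is a self-contained sketch of the fit; the synthetic data, true parameters (1.0, 10.0, 1.5), and initial guess are assumptions for illustration:

import numpy as np
from scipy.optimize import minimize

def model(theta, x):
    a, b, c = theta
    return a / (1 + np.exp(-b * (x - c)))

# synthetic data drawn from the model itself
np.random.seed(123)
x = np.sort(np.random.uniform(0, 3, 80))
yerr = 0.05 * np.ones_like(x)
y = model([1.0, 10.0, 1.5], x) + yerr * np.random.randn(len(x))

def neg_log_like(theta):
    return 0.5 * np.sum(((model(theta, x) - y) / yerr)**2)

result = minimize(neg_log_like, [0.8, 5.0, 1.0])
print(result.x)  # should land near (1.0, 10.0, 1.5)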
"But it doesn't work…"
— everyone
1 initialization
2 bounds
3 convergence
4 gradients
Gradients

\[
\frac{\mathrm{d}}{\mathrm{d}\theta} \ln p(\{y_n\} \,|\, \theta)
\]

seriously?
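For the Gaussian likelihood above, the chain rule gives, for each parameter θᵢ,

\[
\frac{\partial}{\partial \theta_i} \ln p(\{y_n\} \,|\, \theta)
= \sum_{n=1}^{N} \frac{y_n - f_\theta(x_n)}{\sigma_n^2}\,
\frac{\partial f_\theta(x_n)}{\partial \theta_i}
\]

so every new model means re-deriving and re-implementing all of its partial derivatives by hand.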
AutoDiff to the rescue!
1 Theano: deeplearning.net/software/theano
2 HIPS/autograd: github.com/HIPS/autograd
HIPS/autograd just works

import autograd.numpy as np
from autograd import elementwise_grad

def f(x):
    # f(x) = (1 - e^-x) / (1 + e^-x) = tanh(x / 2)
    y = np.exp(-x)
    return (1.0 - y) / (1.0 + y)

df = elementwise_grad(f)    # first derivative
ddf = elementwise_grad(df)  # second derivative
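Since f(x) = tanh(x/2), the derivatives can be checked analytically (a quick sketch):

print(df(0.0))   # 0.5: matches d/dx tanh(x/2) = sech²(x/2) / 2 at x = 0
print(ddf(0.0))  # 0.0: tanh(x/2) is odd, so its second derivative vanishes at 0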
[Plot: f(x), f′(x), and f″(x) over x ∈ [−4, 4], computed with the code above]
before autograd

import numpy as np
from scipy.optimize import minimize

def neg_log_like(theta):
    r = (y - model(theta, x)) / yerr
    return 0.5 * np.sum(r*r)

result = minimize(neg_log_like, theta0)  # theta0: initial guess
print(result)

after autograd

import autograd.numpy as np
from autograd import grad
from scipy.optimize import minimize

def neg_log_like(theta):
    r = (y - model(theta, x)) / yerr
    return 0.5 * np.sum(r*r)

# same objective; the gradient now comes for free
result = minimize(neg_log_like, theta0, jac=grad(neg_log_like))
print(result)
or...
Use Julia?
Uncertainty quantification

if you have:
a non-linear mean model and/or
non-Gaussian/unknown noise

SAMPLE
[photo: CC BY-ND, Flickr user Franz Jachim]
MCMC sampling

it's hammer time!
emcee
The MCMC Hammer
MCMC sampling with emcee
dfm.io/emcee; github.com/dfm/emcee
import emcee
import numpy as np

def log_prob(theta):
    log_prior = 0.0
    r = (y - model(theta, x)) / yerr
    return -0.5 * np.sum(r*r) + log_prior

ndim, nwalkers = 3, 32
p0 = np.array([1.0, 10.0, 1.5])
p0 = p0 + 0.01*np.random.randn(nwalkers, ndim)

sampler = emcee.EnsembleSampler(nwalkers, ndim, log_prob)
sampler.run_mcmc(p0, 1000)
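A sketch of what typically comes next (the burn-in length of 100 steps is an assumption; corner is the package credited on the next slide):

# drop 100 burn-in steps, then flatten the walkers into one set of samples
samples = sampler.chain[:, 100:, :].reshape((-1, ndim))

import corner
fig = corner.corner(samples, labels=["a", "b", "c"])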
MCMC sampling with emcee

[Corner plot: posterior samples for the parameters a, b, c of f_θ(x) = a / (1 + e^{−b(x−c)}); made using github.com/dfm/corner.py]
the same checklist, after switching to MCMC:

1 initialization
2 priors
3 convergence
4 gradients?
Other MCMC samplers in Python

1 pymc-devs/pymc3 (hierarchical inference)
2 stan-dev/pystan (hierarchical inference)
3 JohannesBuchner/PyMultiNest (nested sampling)
4 eggplantbren/DNest4 (nested sampling)
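As a taste, a minimal sketch of the same logistic-model fit in pymc3; the priors and sampler settings here are assumptions for illustration:

import pymc3 as pm

with pm.Model():
    # assumed broad uniform priors on the three parameters
    a = pm.Uniform("a", 0.0, 10.0)
    b = pm.Uniform("b", 0.0, 50.0)
    c = pm.Uniform("c", 0.0, 5.0)
    mu = a / (1.0 + pm.math.exp(-b * (x - c)))
    pm.Normal("obs", mu=mu, sd=yerr, observed=y)
    trace = pm.sample(1000)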
in summary…
If your data analysis problem looks like this… *

Physics → Data
            + noise (stochastic; instrument, systematics, etc.)

* it probably does
… now you know how to solve it! *
https://speakerdeck.com/dfm/pyastro16
* in theory