
A&A 496, 577–584 (2009)
DOI: 10.1051/0004-6361:200811296
© ESO 2009

Astronomy & Astrophysics

The generalised Lomb-Scargle periodogram


A new formalism for the floating-mean and Keplerian periodograms
M. Zechmeister and M. Kürster

Max-Planck-Institut für Astronomie, Königstuhl 17, 69117 Heidelberg, Germany


e-mail: zechmeister@mpia.de
Received 5 November 2008 / Accepted 19 December 2008

ABSTRACT

The Lomb-Scargle periodogram is a common tool in the frequency analysis of unequally spaced data equivalent to least-squares fitting of sine waves. We give an analytic solution for the generalisation to a full sine wave fit, including an offset and weights ($\chi^2$ fitting). Compared to the Lomb-Scargle periodogram, the generalisation is superior as it provides more accurate frequencies, is less susceptible to aliasing, and gives a much better determination of the spectral intensity. Only a few modifications are required for the computation and the computational effort is similar. Our approach brings together several related methods that can be found in the literature, viz. the date-compensated discrete Fourier transform, the floating-mean periodogram, and the "spectral significance" estimator used in the SigSpec program, for which we point out some equivalences. Furthermore, we present an algorithm that implements this generalisation for the evaluation of the Keplerian periodogram that searches for the period of the best-fitting Keplerian orbit to radial velocity data. The systematic and non-random algorithm is capable of detecting eccentric orbits, which is demonstrated by two examples and can be a useful tool in searches for the orbital periods of exoplanets.

Key words. methods: data analysis – methods: analytical – methods: statistical – techniques: radial velocities

1. Introduction

The Lomb-Scargle periodogram (Scargle 1982) is a widely used tool in period searches and frequency analysis of time series. It is equivalent to fitting sine waves of the form $y = a\cos\omega t + b\sin\omega t$. While standard fitting procedures require the solution of a set of linear equations for each sampled frequency, the Lomb-Scargle method provides an analytic solution and is therefore both convenient to use and efficient. The equation for the periodogram was given by Barning (1963), and also Lomb (1976) and Scargle (1982), who furthermore investigated its statistical behaviour, especially the statistical significance of the detection of a signal. For a time series $(t_i, y_i)$ with zero mean ($\bar y = 0$), the Lomb-Scargle periodogram is defined as (normalisation from Lomb 1976):

$$\hat p(\omega) = \frac{1}{\hat{YY}}\left[\frac{\hat{YC}_{\hat\tau}^2}{\hat{CC}_{\hat\tau}} + \frac{\hat{YS}_{\hat\tau}^2}{\hat{SS}_{\hat\tau}}\right] \tag{1}$$

$$= \frac{1}{\sum_i y_i^2}\left\{\frac{\left[\sum_i y_i\cos\omega(t_i-\hat\tau)\right]^2}{\sum_i\cos^2\omega(t_i-\hat\tau)} + \frac{\left[\sum_i y_i\sin\omega(t_i-\hat\tau)\right]^2}{\sum_i\sin^2\omega(t_i-\hat\tau)}\right\} \tag{2}$$

where the hats are used in this paper to symbolise the classical expressions. The parameter $\hat\tau$ is calculated via

$$\tan 2\omega\hat\tau = \frac{\sum_i\sin 2\omega t_i}{\sum_i\cos 2\omega t_i}. \tag{3}$$

However, there are two shortcomings. First, the Lomb-Scargle periodogram does not take the measurement errors into account. This was solved by introducing weighted sums by Gilliland & Baliunas (1987) and Irwin et al. (1989) (equivalent to the generalisation to a $\chi^2$ fit). Second, for the analysis the mean of the data was subtracted, which assumes that the mean of the data and the mean of the fitted sine function are the same. One can overcome this assumption with the introduction of an offset $c$, resulting in a further generalisation of this periodogram to the equivalent of weighted full sine wave fitting; i.e., $y = a\cos\omega t + b\sin\omega t + c$. Cumming et al. (1999), who called this generalisation "floating-mean periodogram", argue that this approach is superior: "... the Lomb-Scargle periodogram fails to account for statistical fluctuations in the mean of a sampled sinusoid, making it non-robust when the number of observations is small, the sampling is uneven, or for periods comparable to or greater than the duration of the observations". These authors provided a formal definition and also a sophisticated statistical treatment, but do not use an analytical solution for the computation of this periodogram.

Basically, analytical formulae for a full sine, least-squares spectrum have already been given by Ferraz-Mello (1981), calling this date-compensated discrete Fourier transform (DCDFT). We prefer to adopt a notation closely related to the Lomb-Scargle periodogram calling it the generalised Lomb-Scargle periodogram (GLS). Shrager (2001) attempted such an approach but did not generalise the parameter $\hat\tau$ in Eq. (3). Moreover, our generalised equations, which are derived in the following (Sect. 2), have a comparable symmetry to the classical ones and also allow us to point out equivalences to the "spectral significance" estimator used in the SigSpec program by Reegen (2007) (Sect. 4).

2. The generalised Lomb-Scargle periodogram (GLS)

The analytic solution for the generalised Lomb-Scargle periodogram can be obtained in a straightforward manner in the

Article published by EDP Sciences



same way as outlined in Lomb (1976). Let $y_i$ be the $N$ measurements of a time series at time $t_i$ and with errors $\sigma_i$. Fitting a full sine function (i.e. including an offset $c$):

$$y(t) = a\cos\omega t + b\sin\omega t + c$$

at given frequency $\omega$ (or period $P = 2\pi/\omega$) means to minimise the squared difference between the data $y_i$ and the model function $y(t)$:

$$\chi^2 = \sum_{i=1}^{N}\frac{[y_i - y(t_i)]^2}{\sigma_i^2} = W\sum w_i\,[y_i - y(t_i)]^2$$

where

$$w_i = \frac{1}{W}\frac{1}{\sigma_i^2} \qquad \left(W = \sum\frac{1}{\sigma_i^2}, \quad \sum w_i = 1\right)$$

are the normalised weights [1]. Minimisation leads to a system of (three) linear equations whose solution is derived in detail in Appendix A.1. Furthermore, it is shown in Appendix A.1 that the relative $\chi^2$-reduction $p(\omega)$ as a function of frequency $\omega$ and normalised to unity by $\chi_0^2$ (the $\chi^2$ for the weighted mean) can be written as:

$$p(\omega) = \frac{\chi_0^2 - \chi^2(\omega)}{\chi_0^2} \tag{4}$$

$$p(\omega) = \frac{1}{YY\cdot D}\left[SS\cdot YC^2 + CC\cdot YS^2 - 2\,CS\cdot YC\cdot YS\right] \tag{5}$$

with:

$$D(\omega) = CC\cdot SS - CS^2 \tag{6}$$

and the following abbreviations for the sums:

$$Y = \sum w_i y_i \tag{7}$$
$$C = \sum w_i\cos\omega t_i \tag{8}$$
$$S = \sum w_i\sin\omega t_i \tag{9}$$
$$YY = \hat{YY} - Y\cdot Y, \qquad \hat{YY} = \sum w_i y_i^2 \tag{10}$$
$$YC(\omega) = \hat{YC} - Y\cdot C, \qquad \hat{YC} = \sum w_i y_i\cos\omega t_i \tag{11}$$
$$YS(\omega) = \hat{YS} - Y\cdot S, \qquad \hat{YS} = \sum w_i y_i\sin\omega t_i \tag{12}$$
$$CC(\omega) = \hat{CC} - C\cdot C, \qquad \hat{CC} = \sum w_i\cos^2\omega t_i \tag{13}$$
$$SS(\omega) = \hat{SS} - S\cdot S, \qquad \hat{SS} = \sum w_i\sin^2\omega t_i \tag{14}$$
$$CS(\omega) = \hat{CS} - C\cdot S, \qquad \hat{CS} = \sum w_i\cos\omega t_i\sin\omega t_i. \tag{15}$$

[1] For clarity the bounds of the summation are suppressed in the following notation. They are always the same ($i = 1, 2, ..., N$).

Note that sums with hats correspond to the classical sums. $W\cdot YY = \chi_0^2$ is simply the weighted sum of squared deviations from the weighted mean. The mixed sums can also be written as a weighted covariance $\mathrm{Cov}_{x,y} = \sum w_i x_i y_i - X\cdot Y = E(x\cdot y) - E(x)E(y)$ where $E$ is the expectation value, e.g. $YS = \mathrm{Cov}_{y,\sin\omega t}$.

With the weighted mean given by $\bar y = \sum w_i y_i = Y$, Eqs. (10)–(12) can also be written as:

$$YY = \sum w_i (y_i - \bar y)^2 \tag{16}$$
$$YC(\omega) = \sum w_i (y_i - \bar y)\cos\omega t_i \tag{17}$$
$$YS(\omega) = \sum w_i (y_i - \bar y)\sin\omega t_i. \tag{18}$$

So the sums $YC$ and $YS$ use the weighted mean subtracted data and are calculated in the same way as for the Lomb-Scargle periodogram (but with weights).

The generalised Lomb-Scargle periodogram $p(\omega)$ in Eq. (4) is normalised to unity and therefore in the range of $0 \le p \le 1$, with $p = 0$ indicating no improvement of the fit and $p = 1$ a "perfect" fit (100% reduction of $\chi^2$ or $\chi^2 = 0$).

As the full sine fit is time-translation invariant, there is also the possibility to introduce an arbitrary time reference point $\tau$ ($t_i \to t_i - \tau$; now, e.g. $CC = \sum w_i\cos^2\omega(t_i-\tau) - (\sum w_i\cos\omega(t_i-\tau))^2$), which will not affect the $\chi^2$ of the fit. If this parameter $\tau$ is chosen as

$$\tan 2\omega\tau = \frac{2\,CS}{CC - SS} = \frac{\sum w_i\sin 2\omega t_i - 2\sum w_i\cos\omega t_i\sum w_i\sin\omega t_i}{\sum w_i\cos 2\omega t_i - \left[\left(\sum w_i\cos\omega t_i\right)^2 - \left(\sum w_i\sin\omega t_i\right)^2\right]} \tag{19}$$

the interaction term in Eq. (5) disappears, $CS_\tau = \sum w_i\cos\omega(t_i-\tau)\sin\omega(t_i-\tau) - \sum w_i\cos\omega(t_i-\tau)\sum w_i\sin\omega(t_i-\tau) = 0$ (proof in Appendix A.2), and in this case we append the index $\tau$ to the time dependent sums. The parameter $\tau(\omega)$ is determined by the times $t_i$ and the measurement errors $\sigma_i$ for each frequency $\omega$. So when using $\tau$ as defined in Eq. (19) the periodogram in Eq. (5) becomes

$$p(\omega) = \frac{1}{YY}\left[\frac{YC_\tau^2}{CC_\tau} + \frac{YS_\tau^2}{SS_\tau}\right]. \tag{20}$$

Note that Eq. (20) has the same form as the Lomb-Scargle periodogram in Eq. (1) with the difference that the errors can be weighted (weights $w_i$ in all sums) and that there is an additional second term in $CC_\tau$, $SS_\tau$, $CS_\tau$ and $\tan 2\omega\tau$ (Eqs. (13)–(15) and (19), respectively) which accounts for the floating mean.

The computational effort is similar to that of the Lomb-Scargle periodogram. The incorporation of the offset $c$ requires only two additional sums for each frequency $\omega$ (namely $S = \sum w_i\sin\omega t_i$ and $C = \sum w_i\cos\omega t_i$, or $S_\tau$ and $C_\tau$ respectively). The effort is even lower when using Eq. (5) and keeping $CS$, instead of using Eq. (20) with the parameter $\tau$ introduced via Eq. (19), which needs an extra preceding loop in the algorithm. If the errors are taken into account as weights, the multiplication with $w_i$ must also be done.

For fast computation of the trigonometric sums the algorithm of Press & Rybicki (1989) can be applied, which has advantages in the case of large data sets and/or many frequency steps. Another possibility is the use of trigonometric recurrences [2] as described in Press et al. (1992). Note also that the first sum in $SS$ can be expressed by $\hat{SS} = 1 - \hat{CC}$.

[2] E.g. $\cos\omega_{k+1}t = \cos(\omega_k + \Delta\omega)t = \cos\omega_k t\cos\Delta\omega t - \sin\omega_k t\sin\Delta\omega t$ where $\Delta\omega$ is the frequency step.

3. Normalisation and False-Alarm probability (FAP)

There were several discussions in the literature on how to normalise the periodogram. For the detailed discussion we refer to the key papers by Scargle (1982), Horne & Baliunas (1986), Koen (1990) and Cumming et al. (1999). The normalisation becomes important for estimations of the false-alarm probability of a signal by means of an analytic expression. Lomb (1976) showed that if data are Gaussian noise, the terms $\hat{YC}^2/\hat{CC}$ and $\hat{YS}^2/\hat{SS}$ in Eq. (1) are $\chi^2$-distributed and therefore the sum of

both (which is $\propto \hat p$) is $\chi^2$-distributed with two degrees of freedom. This holds for the generalisation in Eq. (20) and also becomes clear from the definition of the periodogram in Eq. (4), $p(\omega) = \frac{\chi_0^2 - \chi^2(\omega)}{\chi_0^2}$, where for Gaussian noise the difference in the numerator $\chi_0^2 - \chi^2(\omega)$ is $\chi^2$-distributed with $\nu = (N-1) - (N-3) = 2$ degrees of freedom.

The $p(\omega)$ can be compared with a known noise level $p_n$ (expected from the a priori known noise variance or population variance) and the normalisation of $p(\omega)$ to $p_n$

$$P_n = \frac{p(\omega)}{p_n} \tag{21}$$

can be considered as a signal to noise ratio (Scargle 1982). However, this noise level is often not known.

Alternatively, the noise level may be estimated for Gaussian noise from Eq. (4) to be $p_n = \frac{2}{N-1}$ which leads to:

$$P = \frac{N-1}{2}\,p(\omega) \tag{22}$$

and is the analogue of the normalisation in Horne & Baliunas (1986) [3]. So if the data are noise, $P = 1$ is the expected value. If the data contains a signal, $P \gg 1$ is expected at the signal frequency. However, this power is restricted to $0 \le P \le \frac{N-1}{2}$.

[3] These authors called it the normalization with the sample variance $\sigma_0^2$. Note that $p(\omega)$ is already normalized with $\chi_0^2$. For the unweighted case ($w_i = \frac{1}{N}$, $\sigma_0^2 = \frac{N}{N-1}YY$) one can write Eqs. (22) with (20) as $P(\omega) = \frac{1}{2\sigma_0^2}\,N\left[\frac{YC_\tau^2}{CC_\tau} + \frac{YS_\tau^2}{SS_\tau}\right]$.

But if the data contains a signal or if errors are under- or overestimated or if intrinsic variability is present, then $p_n = \frac{2}{N-1}$ may not be a good uncorrelated estimator for the noise level. Cumming et al. (1999) suggested to estimate the noise level a posteriori with the residuals of the best fit and normalised the periodogram as

$$z(\omega) = \frac{N-3}{2}\,\frac{\chi_0^2 - \chi^2(\omega)}{\chi^2_{\rm best}} = \frac{N-3}{2}\,\frac{p(\omega)}{1 - p_{\rm best}} \tag{23}$$

where the index "best" denotes the corresponding values of the best fit ($p_{\rm best} = p(\omega_{\rm best})$).

Statistical fluctuations or a present signal may lead to a larger periodogram power. To test for the significance of such a peak in the periodogram the probability is assessed that this power can arise purely from noise. Cumming et al. (1999) clarified that the different normalisations result in different probability functions which are summarised in Table 1. Note that the last two probability values are the same for the best fit ($z_0 = z_{\rm best}$):

$$\mathrm{Prob}(z > z_{\rm best}) = \left(1 + \frac{2z_{\rm best}}{N-3}\right)^{-(N-3)/2} = \left(1 + \frac{p_{\rm best}}{1 - p_{\rm best}}\right)^{-(N-3)/2} = \left(\frac{1}{1 - p_{\rm best}}\right)^{-(N-3)/2} = (1 - p_{\rm best})^{(N-3)/2} = \mathrm{Prob}(p > p_{\rm best}) = \mathrm{Prob}(P > P_{\rm best}).$$

Furthermore Baluev (2008) pointed out that the power definition

$$Z(\omega) = \frac{N-3}{2}\ln\frac{\chi_0^2}{\chi^2(\omega)}$$

as a nonlinear (logarithmic) scale for $\chi^2$ has an exponential distribution (similar to $P_n$)

$$\mathrm{Prob}(Z > Z_{\rm best}) = e^{-Z_{\rm best}} = \left(\frac{\chi^2_{\rm best}}{\chi_0^2}\right)^{(N-3)/2} = \mathrm{Prob}(p > p_{\rm best}).$$

Table 1. Probabilities that a periodogram power ($P_n$, $p$, $P$ or $z$) can exceed a given value ($P_{n,0}$, $p_0$, $P_0$ or $z_0$) for different normalizations (from Cumming et al. 1999).

Reference level        Range                            Probability
population variance    $P_n \in [0, \infty)$            $\mathrm{Prob}(P_n > P_{n,0}) = \exp(-P_{n,0})$
sample variance        $p \in [0, 1]$                   $\mathrm{Prob}(p > p_0) = (1 - p_0)^{\frac{N-3}{2}}$
                       $P \in [0, \frac{N-1}{2}]$       $\mathrm{Prob}(P > P_0) = \left(1 - \frac{2P_0}{N-1}\right)^{\frac{N-3}{2}}$
residual variance      $z \in [0, \infty)$              $\mathrm{Prob}(z > z_0) = \left(1 + \frac{2z_0}{N-3}\right)^{-\frac{N-3}{2}}$

Since in period search with the periodogram we study a range of frequencies, we are also interested in the significance of one peak compared to the peaks at other frequencies rather than the significance of a single frequency. The false alarm probability (FAP) for the period search in a frequency range is given by

$$\mathrm{FAP} = 1 - [1 - \mathrm{Prob}(z > z_0)]^M \tag{24}$$

where $M$ is the number of independent frequencies and may be estimated as the number of peaks in the periodogram. The width of one peak is $\delta f \approx \frac{1}{T}$ (frequency resolution). So in the frequency range $\Delta f = f_2 - f_1$ there are approximately $M = \frac{\Delta f}{\delta f}$ peaks and in the case $f_1 \ll f_2$ one can write $M \approx T f_2$ (Cumming 2004). Finally, for low FAP values the following handy approximation for Eq. (24) is valid:

$$\mathrm{FAP} \approx M\cdot\mathrm{Prob}(z > z_0) \quad \text{for } \mathrm{FAP} \ll 1. \tag{25}$$

Another possibility to assess $M$ and the FAP are Monte Carlo or bootstrap simulations in order to determine how often a certain power level ($z_0$) is exceeded just by chance. Such numerical calculations of the FAP are much more time-consuming than the actual computation of the GLS.

4. Equivalences between the GLS and SigSpec (Reegen 2007)

Reegen (2007) developed a method, called SigSpec, to determine the significance of a peak at a given frequency (spectral significance) in a discrete Fourier transformation (DFT) which includes a zero mean correction. We will recapitulate some points from his paper in an adapted and shortened way in order to show several equivalences and to disentangle different notations used by us and used by Reegen (2007). For a detailed description we refer to the original paper. Briefly, approaching from Fourier theory Reegen (2007) defined the zero mean corrected Fourier coefficients [4]

$$a_{ZM}(\omega) = \frac{1}{N}\sum y_i\cos\omega t_i - \frac{1}{N^2}\sum y_i\sum\cos\omega t_i$$
$$b_{ZM}(\omega) = \frac{1}{N}\sum y_i\sin\omega t_i - \frac{1}{N^2}\sum y_i\sum\sin\omega t_i$$

[4] Here only the unweighted case is discussed ($w_i = \frac{1}{N}$). Reegen (2007) also gives a generalization to weighting.

which obviously correspond to $YC$ and $YS$ in Eqs. (11) and (12). Their variances are given by

$$\langle a_{ZM}^2\rangle = \frac{\langle y^2\rangle}{N^2}\left[\sum\cos^2\omega t_i - \frac{1}{N}\left(\sum\cos\omega t_i\right)^2\right]$$
$$\langle b_{ZM}^2\rangle = \frac{\langle y^2\rangle}{N^2}\left[\sum\sin^2\omega t_i - \frac{1}{N}\left(\sum\sin\omega t_i\right)^2\right].$$

The precise value of these variances depends on the temporal sampling. These variances can be expressed as $\langle a_{ZM}^2\rangle = \frac{\langle y^2\rangle}{N}CC$ and $\langle b_{ZM}^2\rangle = \frac{\langle y^2\rangle}{N}SS$.

Consider now two independent Gaussian variables whose cumulative distribution function (CDF) is given by:

$$\Phi(\alpha, \beta|\omega) = e^{-\frac{1}{2}\left(\frac{\alpha^2}{\langle\alpha^2\rangle} + \frac{\beta^2}{\langle\beta^2\rangle}\right)}.$$

A Gaussian distribution of the physical variable $y_i$ in the time domain will lead to Gaussian variables $a_{ZM}$ and $b_{ZM}$ which in general are correlated. A rotation of Fourier Space by phase $\theta_0$

$$\tan 2\theta_0 = \frac{N\sum\sin 2\omega t_i - 2\sum\cos\omega t_i\sum\sin\omega t_i}{N\sum\cos 2\omega t_i - \left(\sum\cos\omega t_i\right)^2 + \left(\sum\sin\omega t_i\right)^2}$$

transforms $a_{ZM}$ and $b_{ZM}$ into uncorrelated coefficients $\alpha$ and $\beta$ with vanishing covariance. Indeed, $\omega\tau$ from Eq. (19) and $\theta_0$ have the same value, but $\tau$ is applied in the time domain, while $\theta_0$ is applied to the phase $\theta$ in the Fourier domain. It is only mentioned here that the resulting coefficients $2\alpha$ and $2\beta$ correspond to $YC_\tau$ and $YS_\tau$.

Finally, Reegen (2007) defines as the spectral significance $\mathrm{sig}(\alpha, \beta|\omega) := -\log\Phi(\alpha, \beta|\omega)$ and writes:

$$\mathrm{sig}(a_{ZM}, b_{ZM}|\omega) = \frac{N\log e}{\langle y^2\rangle}\left[\left(\frac{a_{ZM}\cos\theta_0 + b_{ZM}\sin\theta_0}{\alpha_0}\right)^2 + \left(\frac{b_{ZM}\cos\theta_0 - a_{ZM}\sin\theta_0}{\beta_0}\right)^2\right] \tag{26}$$

where

$$\alpha_0 := \sqrt{2N\frac{\langle\alpha^2\rangle}{\langle y^2\rangle}} = \sqrt{\frac{2}{N^2}\left\{N\sum\cos^2(\omega t_i - \theta_0) - \left[\sum\cos(\omega t_i - \theta_0)\right]^2\right\}}$$

$$\beta_0 := \sqrt{2N\frac{\langle\beta^2\rangle}{\langle y^2\rangle}} = \sqrt{\frac{2}{N^2}\left\{N\sum\sin^2(\omega t_i - \theta_0) - \left[\sum\sin(\omega t_i - \theta_0)\right]^2\right\}}$$

are called normalised semi-major and semi-minor axes. Note that $\alpha_0^2 \equiv 2\,CC_\tau$ and $\beta_0^2 \equiv 2\,SS_\tau$.

Reegen (2007) states that this gives as accurate frequencies as do least squares. However, from this derivation it is not clear that this is equivalent. But when comparing Eqs. (26) to (20), using $YC_\tau = YC\cos\omega\tau + YS\sin\omega\tau$ and $YS_\tau = YS\cos\omega\tau - YC\sin\omega\tau$:

$$p(\omega) = \frac{1}{YY}\left[\frac{(YC\cos\omega\tau + YS\sin\omega\tau)^2}{CC_\tau} + \frac{(YS\cos\omega\tau - YC\sin\omega\tau)^2}{SS_\tau}\right]$$

the equivalence of the GLS and the spectral significance estimator in SigSpec (and with it to least squares) is evident:

$$\mathrm{sig}(a_{ZM}, b_{ZM}|\omega) = \frac{YY\cdot N\log e}{\langle y^2\rangle}\,p(\omega).$$

The two indicators differ only in a normalisation factor, which becomes $(N-1)\log e$, when $\langle y^2\rangle$ is estimated with the sample variance $\langle y_i^2\rangle = \frac{N}{N-1}YY$. Therefore, SigSpec gives accurate frequencies just like least squares. But note that the Fourier amplitude is given by the sum of the squared Fourier coefficients, $A^2 = a_{ZM}^2 + b_{ZM}^2 = 4\alpha^2 + 4\beta^2 \propto YC^2 + YS^2 = YC_\tau^2 + YS_\tau^2$, while the least-squares fitting amplitude is $A^2 = a^2 + b^2 = \frac{YC_\tau^2}{CC_\tau^2} + \frac{YS_\tau^2}{SS_\tau^2}$ (see Appendix A.1).

The comparison with SigSpec offers another point of view. It shows how the GLS is associated to Fourier theory and how it can be derived from the DFT (discrete Fourier transform) when demanding certain statistical properties such as simple statistical behaviour, time-translation invariance (Scargle 1982) and varying zero mean. The shown equivalences allow vice versa to apply several of Reegen's (2007) conclusions to the GLS, e.g. that it is less susceptible to aliasing or that the time domain sampling is taken into account in the probability distribution.

5. Application of the GLS to the Keplerian periodogram (Keplerian fits to radial velocity data)

The search for the best-fitting sine function is a multidimensional $\chi^2$-minimisation problem with four parameters: period $P$, amplitude $A$, phase $\varphi$ and offset $c$ (or frequency $\omega = 2\pi/P$, $a$, $b$ and $c$). At a given frequency $\omega$ the best-fitting parameters $A$, $\varphi$ and $c$ can be computed immediately by an analytic solution revealing the global optimum for this three dimensional parameter subspace. But involving the frequency leads to a lot of local optima (minima in $\chi^2$) as visualised by the numbers of maxima in the generalised Lomb-Scargle periodogram. With stepping through frequency one can pick up the global optimum in the four dimensional parameter space. That is how period search with the periodogram works. Because an analytic solution (implemented in the GLS) can be employed partially, there is no need for stepping through all parameters to explore the whole parameter space for the global optimum.

This concept can be transferred to the Keplerian periodogram which can be applied to search stellar radial velocity data for periodic signals caused by orbiting companions and measured from spectroscopic Doppler shifts. The radial velocity curve becomes more non-sinusoidal for a more eccentric orbit and its shape depends on six orbital elements [5]:

$\gamma$    constant system radial velocity
$K$    radial velocity amplitude
$\varpi$    longitude of periastron
$e$    eccentricity
$T_0$    periastron passage
$P$    period.

[5] $K = \frac{2\pi}{P}\frac{a\sin i}{\sqrt{1-e^2}}$ with $a$ the semi-major axis of the stellar orbit and $i$ the inclination.

In comparison to the full sine fit there are two more parameters to deal with which complicates the period search. An approach to simplify the Keplerian orbit search is to use the GLS

to look for a periodic signal and use it for an initial guess. But choosing the best-fitting sine period does not necessarily lead to the best-fitting Keplerian orbit. So for finding the global optimum the whole parameter space must be explored.

The Keplerian periodogram (Cumming 2004), just like the GLS, shows how good a trial period (frequency $\omega$) can model the observed radial velocity data and can be defined as ($\chi^2$-reduction):

$$p_{\rm Kep}(\omega) = \frac{\chi_0^2 - \chi^2_{\rm Kep}(\omega)}{\chi_0^2}.$$

Instead of the sine function, the function

$$RV(t) = \gamma + K[e\cos\varpi + \cos(\nu(t) + \varpi)] \tag{27}$$

which describes the radial reflex motion of a star due to the gravitational pull of a planet, serves as the model for the radial velocity curve. The time dependence is given by the true anomaly $\nu$ which furthermore depends on three orbital parameters ($\nu(t, P, e, T_0)$). The relation between $\nu$ and time $t$ is:

$$\tan\frac{\nu}{2} = \sqrt{\frac{1+e}{1-e}}\,\tan\frac{E}{2}$$
$$E - e\sin E = M = 2\pi\,\frac{t - T_0}{P} \tag{28}$$

where $E$ and $M$ are called eccentric anomaly and mean anomaly, respectively [6]. Equation (28), called Kepler's equation, is transcendent meaning that for a given time $t$ the eccentric anomaly $E$ cannot be calculated explicitly.

[6] The following expressions are also used frequently: $\sin\nu = \frac{\sqrt{1-e^2}\,\sin E}{1 - e\cos E}$ and $\cos\nu = \frac{\cos E - e}{1 - e\cos E}$.

For the computation of a Keplerian periodogram $\chi^2$ is to be minimised with respect to five parameters at a fixed trial frequency $\omega$. Similar to the GLS there is no need for stepping through all parameters. With the substitutions $c = \gamma + Ke\cos\varpi$, $a = K\cos\varpi$ and $b = -K\sin\varpi$ Eq. (27) can be written as:

$$RV(t) = c + a\cos\nu(t) + b\sin\nu(t)$$

and with respect to the parameters $a$, $b$ and $c$ the analytic solution can be employed as described in Sect. 2 for known $\nu$ (instead of $\omega t$). So for fixed $P$, $e$ and $T_0$ the true anomalies $\nu_i$ can be calculated and the GLS from Eq. (5) (now using $\nu_i$ instead of $\omega t_i$) can be applied to compute the maximum $\chi^2$-reduction ($p(\omega)$), here called $p_{e,T_0}(\omega)$. Stepping through $e$ and $T_0$ yields the best Keplerian fit at frequency $\omega$:

$$p_{\rm Kep}(\omega) = \max_{e,T_0}\,p_{e,T_0}(\omega)$$

as visualised in the Keplerian periodogram. Finally, with stepping through the frequency, like for the GLS, one will find the best-fitting Keplerian orbit having the overall maximum power:

$$p_{\rm Kep}(\omega_{\rm best}) = \max_{\omega}\,p_{\rm Kep}(\omega).$$

There exist a series of tools (or are under development) using genetic algorithms or Bayesian techniques for fast searches for the best Keplerian fit (Ford & Gregory 2007; Balan & Lahav 2008). The algorithm presented in this section is not further optimised for speed. But it works well, is easy to implement and is robust since in principle it cannot miss a peak if the 3 dimensional grid for $e$, $T_0$ and $\omega$ is sufficiently dense. A reliable algorithm is needed for the computation of the Keplerian periodogram which by definition yields the best fit at fixed frequency and no local $\chi^2$-minima. O'Toole et al. (2007, 2009) developed an algorithm called 2DKLS (two dimensional Kepler Lomb-Scargle) that works on a grid of period and eccentricity and seems to be similar to ours. But the possibility to use partly an analytic solution or the need for stepping $T_0$ is not mentioned by these authors.

The effort to compute the Keplerian periodogram with the described algorithm is much higher in comparison to the GLS. There are three additional loops: two loops for stepping through $e$ and $T_0$ and one for the iteration to solve Kepler's equation. Contrary to the GLS it is not possible to use recurrences or the fast computation of the trigonometric sums mentioned in Sect. 2. However we would like to outline some possibilities for technical improvements for a faster search. The first concerns the grid size. We choose a regular grid for $e$, $T_0$ and $\omega$. While this is adequate for the frequency $\omega$, as we discuss later in this section, there might be a more appropriate $e$–$T_0$ grid, e.g. a less dense grid size for $T_0$ at lower eccentricities. A second possibility is to reduce the iterations for solving Kepler's equation by using the eccentric anomalies (or differentially corrected ones) as initial values for the next slightly different grid point. This can save several ten percent of computation time, in particular in dense grids and at high eccentricities. A third idea which might have a high potential for speed up is to combine our algorithm with other optimisation techniques (e.g. Levenberg-Marquardt instead of pure stepping) to optimise the remaining parameters in the function $p_{e,T_0}(\omega)$. A raw grid would provide the initial values and the optimisation technique would do the fine adjustment.

To give an example Fig. 1 shows RV data for the M dwarf GJ 1046 (Kürster et al. 2008) along with the best-fitting Keplerian orbit ($P = 168.8$ d, $e = 0.28$). Figure 2 shows the Lomb-Scargle, GLS and Keplerian periodograms. Because a Keplerian orbit has more degrees of freedom it always has the highest $\chi^2$ reduction ($0 \le p_{\rm LS} \le p_{\rm GLS} \le p_{{\rm Kep},e<0.6} \le p_{\rm Kep} \le 1$). As a comparison the Keplerian periodogram restricted to $e < 0.6$ is also shown in Fig. 2. At intervals where $p_{\rm Kep}$ exceeds $p_{e<0.6}$ the contribution is due to very eccentric orbits. Note that the Keplerian periodogram obtains more structure when the search is extended to more eccentric orbits. Therefore the evaluation of the Keplerian periodogram needs a higher frequency resolution (this effect can also be observed in O'Toole et al. 2009). This is a consequence of the fact that more eccentric orbits are spikier and thus more sensitive to phase and frequency.

Other than O'Toole et al. (2009) we plot the periodograms against frequency [7] to illustrate that the typical peak width $\delta f$ is frequency independent. Thus equidistant frequency steps ($df < \delta f$) yield a uniform sampling of each peak and are the most economic way to compute the periodogram rather than e.g. logarithmic period steps (leading to oversampling at long periods: $df = d\frac{1}{P} = -\frac{1}{P^2}dP = -\frac{1}{P}\,d\ln P$) as used by O'Toole et al. (2009). Still the periodograms can be plotted against a logarithmic period scale as e.g. sometimes preferred to present a period search for exoplanets.

[7] For Fourier transforms this is common.

Figure 4 visualises local optima in the $p_{e,T_0}$ map at an arbitrary fixed frequency $\omega$. There are two obvious local optima which means that searching from only one initial value for $T_0$ may not be sufficient as one could fail to reach the best local optimum in the $e$–$T_0$ plane. This justifies a stepping through $e$ and $T_0$. The complexity in the $e$–$T_0$ plane, in particular at high eccentricities, finally translates into the Keplerian periodogram. When

varying the frequency the landscape and maxima will evolve and the maxima can also switch.

[Figure 1: RV [m/s] versus Barycentric Julian Date BJD - 2 450 000.]
Fig. 1. The radial velocity (RV) time series of the M dwarf GJ 1046. The solid line is the best Keplerian orbit fit ($P = 168.8$ d, $e = 0.28$).

[Figure 2: Power $p$ versus Frequency $f$ [1/d]; curves $p_{\rm Kep}$, $p_{{\rm Kep},e<0.6}$, $p_{\rm GLS}$, $p_{\rm LS}$.]
Fig. 2. Comparison of the normalized Lomb-Scargle (LS), GLS and Keplerian periodograms for GJ 1046 ($f = \frac{1}{P} = \frac{\omega}{2\pi}$).

[Figure 3: Power $p$ (left axis) and $\chi^2$ (right axis) versus Frequency $f$ [1/d], with Period [d] on the top axis; curves $p_{\rm Kep}$, $p_{{\rm Kep},e<0.6}$, $p_{\rm GLS}$, $p_{\rm LS}$.]
Fig. 3. The same periodograms as in Fig. 2 with a quasi-logarithmic scale for $p$ and a logarithmic scale for $\chi^2$ (axis to the right).

[Figure 4: Power $p$ as a function of Eccentricity $e$ and Periastron passage $T_0$ [deg].]
Fig. 4. Power map ($p_{e,T_0}$) for $e$ and $T_0$ at the arbitrary fixed frequency $f = 0.00422$ d$^{-1}$. The maximum value $p = 0.592$ is deposited in the Keplerian periodogram. Note that there are two local optima.

In the given example LS and GLS would give a good initial guess for the best Keplerian period with only a slight frequency shift. But this is not always the case.

One may argue that the second peak has an equal height suggesting the same significance. On a linear scale it seems so. But the significance is not a linear function of the power. Cumming et al. (2008) normalised the Keplerian periodogram as

$$z_{\rm Kep}(\omega) = \frac{N-5}{4}\,\frac{\chi_0^2 - \chi^2(\omega)}{\chi^2_{\rm best}} = \frac{N-5}{4}\,\frac{p_{\rm Kep}(\omega)}{1 - p_{\rm Kep}(\omega_{\rm best})} \tag{29}$$

analogous to Eq. (23) and derived the probability distribution

$$\mathrm{Prob}(z > z_0) = \left(1 + \frac{N-3}{2}\,\frac{4z_0}{N-5}\right)\left(1 + \frac{4z_0}{N-5}\right)^{-\frac{N-3}{2}}.$$

With this we can calculate that the higher peak which is much closer to 1 has a $10^{14}$ times lower probability to be due to noise, i.e. it has a $10^{14}$ times higher significance. In Fig. 3 the periodogram power is plotted on a logarithmic scale for $\chi^2$ on the right-hand side. The much lower $\chi^2$ is another convincing argument for the much higher significance.

Cumming (2004) suggested to estimate the FAP for the period search analogous to Eq. (25) and the number of independent frequencies again as $M \approx T\Delta f$. This estimation does not take into account the higher variability in the Keplerian periodogram, which depends on the examined eccentricity range, and therefore this FAP is likely to be underestimated.

Another, more extreme example is the planet around HD 20782 discovered by Jones et al. (2006). Figure 5 shows the RV data for the star taken from O'Toole et al. (2009). Due to the high eccentricity this is a case where LS and GLS fail to find the right period. However, our algorithm for the Keplerian periodogram finds the same solution as the 2DKLS ($P = 591.9$ d, $e = 0.97$). The Keplerian periodogram in Fig. 6 indicates this period. This time it is normalised according to Eq. (29) and seems to suffer from an overall high noise level (caused by many other eccentric solutions that will fit the one "outlier"). However, that the period is significant can again be shown just as in the previous example.

For comparison we also show the periodogram with the normalisation by the best fit at each frequency (Cumming 2004)

$$z_{\rm Kep}(\omega) = \frac{N-5}{4}\,\frac{\chi_0^2 - \chi^2(\omega)}{\chi^2(\omega)} \tag{30}$$

which is used in the 2DKLS and reveals the power maximum as impressively as in O'Toole et al. (2009, Fig. 1b). As Cumming et al. (1999) mentioned the choice of the normalisation is a matter of taste; the distribution of maximum power is the same. Finally, keep in mind when comparing Fig. 6 with the 2DKLS in O'Toole et al. (2009, Fig. 1b) which shows a slice at $e = 0.97$, that the Keplerian periodogram in Fig. 6 includes all eccentricities ($0 \le e \le 0.99$). Also the algorithms are different because we also step for $T_0$ and simultaneously fit the longitude of periastron $\varpi$.
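The hybrid search described in this section can be illustrated with a short Python sketch. It is not the authors' implementation: the function names (`gls_power`, `true_anomaly`, `keplerian_power`), the Newton iteration with starting value $E = \pi$ and the fixed iteration count are assumptions made here for illustration. The sketch evaluates Eq. (5) on arbitrary phases, solves Kepler's equation (28) to obtain the true anomaly, and maximises $p_{e,T_0}$ over a caller-supplied grid at one trial period.

```python
import math

def gls_power(phases, y, sigma):
    """Relative chi^2 reduction, Eq. (5), for y = a*cos(phase) + b*sin(phase) + c."""
    w = [1.0 / s ** 2 for s in sigma]
    total = sum(w)
    w = [wi / total for wi in w]                      # normalised weights, sum = 1
    c = [math.cos(p) for p in phases]
    s = [math.sin(p) for p in phases]
    Y = sum(wi * yi for wi, yi in zip(w, y))          # Eq. (7)
    C = sum(wi * ci for wi, ci in zip(w, c))          # Eq. (8)
    S = sum(wi * si for wi, si in zip(w, s))          # Eq. (9)
    YY = sum(wi * yi * yi for wi, yi in zip(w, y)) - Y * Y             # Eq. (10)
    YC = sum(wi * yi * ci for wi, yi, ci in zip(w, y, c)) - Y * C      # Eq. (11)
    YS = sum(wi * yi * si for wi, yi, si in zip(w, y, s)) - Y * S      # Eq. (12)
    CC = sum(wi * ci * ci for wi, ci in zip(w, c)) - C * C             # Eq. (13)
    SS = sum(wi * si * si for wi, si in zip(w, s)) - S * S             # Eq. (14)
    CS = sum(wi * ci * si for wi, ci, si in zip(w, c, s)) - C * S      # Eq. (15)
    D = CC * SS - CS * CS                                              # Eq. (6)
    return (SS * YC * YC + CC * YS * YS - 2.0 * CS * YC * YS) / (YY * D)

def true_anomaly(t, P, e, T0, n_iter=60):
    """True anomaly from Eq. (28); Newton iteration is an assumed solver choice."""
    M = (2.0 * math.pi * (t - T0) / P) % (2.0 * math.pi)   # mean anomaly
    E = math.pi                                            # assumed starting value
    for _ in range(n_iter):
        E -= (E - e * math.sin(E) - M) / (1.0 - e * math.cos(E))
    return 2.0 * math.atan2(math.sqrt(1.0 + e) * math.sin(E / 2.0),
                            math.sqrt(1.0 - e) * math.cos(E / 2.0))

def keplerian_power(t, y, sigma, P, e_grid, T0_grid):
    """p_Kep at one trial period: maximum of p_{e,T0} over the (e, T0) grid."""
    best = 0.0
    for e in e_grid:
        for T0 in T0_grid:
            nu = [true_anomaly(ti, P, e, T0) for ti in t]
            best = max(best, gls_power(nu, y, sigma))   # GLS with nu_i instead of omega*t_i
    return best
```

Calling `gls_power` with the phases $\omega t_i$ gives the GLS power of Sect. 2 at that frequency, and looping `keplerian_power` over equidistant trial frequencies yields the Keplerian periodogram. The density of the $e$ and $T_0$ grids is left to the caller.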
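The significance estimates used for both examples rely on the probability functions of Table 1 and on Eqs. (23)–(25). A minimal Python sketch follows, assuming hypothetical helper names (`prob_p`, `z_from_p`, `prob_z`, `false_alarm_probability`) and a caller-supplied estimate of the number of independent frequencies $M$ (e.g. $M \approx T f_2$). It covers only the sinusoidal (GLS) case with $N - 3$ degrees of freedom, not the Keplerian distribution belonging to Eq. (29).

```python
import math

def prob_p(p0, n):
    """Table 1, sample variance: Prob(p > p0) = (1 - p0)^((N-3)/2)."""
    return (1.0 - p0) ** ((n - 3) / 2.0)

def z_from_p(p, p_best, n):
    """Eq. (23): power normalised by the residual variance of the best fit."""
    return (n - 3) / 2.0 * p / (1.0 - p_best)

def prob_z(z0, n):
    """Table 1, residual variance: Prob(z > z0) = (1 + 2 z0/(N-3))^(-(N-3)/2)."""
    return (1.0 + 2.0 * z0 / (n - 3)) ** (-(n - 3) / 2.0)

def false_alarm_probability(prob_single, m):
    """Eq. (24): FAP = 1 - (1 - Prob)^M, written with log1p/expm1 so that
    very small probabilities do not cancel to zero."""
    if prob_single >= 1.0:
        return 1.0
    return -math.expm1(m * math.log1p(-prob_single))
```

For the best peak both normalisations give the same single-frequency probability, as stated in Sect. 3, and for small values the result of `false_alarm_probability` approaches the approximation $M\cdot\mathrm{Prob}$ of Eq. (25).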

[Figure 5: RV [m/s] versus Barycentric Julian Date BJD − 2 450 000]
Fig. 5. The radial velocity (RV) time series of HD 20782. The solid line is the best Keplerian orbit fit.

[Figure 6: two panels, Power z versus Frequency f [1/d], with Period [d] on the upper axis]
Fig. 6. Keplerian periodogram for HD 20782. Both panels show the same Keplerian periodogram, but the upper one is normalized with the best fit (Eq. (29)) while the lower one is normalized with the best fit at each frequency (Eq. (30)). Both have by definition the same maximum value.

6. Conclusions

Generalised Lomb-Scargle periodogram (GLS), floating-mean periodogram (Cumming et al. 1999), date-compensated discrete Fourier transform (DCDFT, Ferraz-Mello 1981), and spectral significance (SigSpec, Reegen 2007) ultimately all refer to the same thing: the least-squares spectrum for fitting a sinusoid plus a constant. Cumming et al. (1999) and Reegen (2007) have already shown the advantages of accounting for a varying zero point, and we therefore recommend the generalised Lomb-Scargle periodogram (GLS) for the period analysis of time series. The implementation is easy, as only a few modifications to the sums of the Lomb-Scargle periodogram are needed.

The GLS can be calculated as conveniently as the Lomb-Scargle periodogram and in a straightforward manner with an analytical solution, whereas programs applying standard routines for fitting sinusoids solve a set of linear equations by inverting a 3 × 3 matrix at each frequency. The GLS can be tailored by concentrating the sums in one loop over the data. As already mentioned by Lomb (1976), Eq. (5) (including Eqs. (13)–(18)) should be applied for the numerical work. A fast calculation of the GLS is especially desirable for large data sets and/or many frequency steps. It may also help to speed up prewhitening procedures (e.g. Reegen 2007) in the case of multifrequency analysis, or numerical calculations of the significance of a signal with bootstrap methods or Monte Carlo simulations.

The term generalised Lomb-Scargle periodogram has already been used by Bretthorst (2001) for the generalisation to sinusoidal functions of the kind $y(t) = aZ(t)\cos\omega t + bZ(t)\sin\omega t$ with an arbitrary amplitude modulation $Z(t)$ whose time dependence and all parameters are fully specified (e.g. $Z(t)$ can be an exponential decay). Without repeating the whole procedure given in Appendix A and Sect. 2, we only mention here that the generalisation to $y(t) = aZ(t)\cos\omega t + bZ(t)\sin\omega t + c$ results in the same equations, with the difference that $Z(t_i)$ has to be attached to each sine and cosine term in each sum (e.g. $\widehat{CC} = \sum w_i\, Z(t_i)\cos\omega t_i \, Z(t_i)\cos\omega t_i$).

We presented an algorithm for the application of GLS to the Keplerian periodogram, which is the least-squares spectrum for Keplerian orbits. It is a hybrid algorithm that partially applies an analytic solution for the linearised parameters and partially steps through the non-linear parameters. This has to be distinguished from methods that use the best sine fit as an initial guess. With two examples we have demonstrated that our algorithm for the computation of the Keplerian periodogram is capable of detecting (very) eccentric planets in a systematic and nonrandom way.

Apart from this, the least-squares spectrum analysis (the idea goes back to Vaníček 1971) with more complicated model functions than full sine functions (e.g. including linear trends, Walker et al. 1995, or multiple sine functions) is beyond the scope of this paper. For the calculation of such periodograms the use of analytical solutions is not essential, but can be faster.

Appendix A

A.1. Derivation of the generalised Lomb-Scargle periodogram (GLS)

The derivation of the generalised Lomb-Scargle periodogram is briefly shown. With the sinusoid plus constant model

$$y(t) = a\cos\omega t + b\sin\omega t + c$$

the squared difference between the data $y_i$ and the model function $y(t)$

$$\chi^2 = W \sum w_i\,[y_i - y(t_i)]^2$$

is to be minimised. For the minimum $\chi^2$ the partial derivatives vanish and therefore:

$$0 = \partial_a \chi^2 = -2W \sum w_i\,[y_i - y(t_i)]\cos\omega t_i \tag{A.1}$$

$$0 = \partial_b \chi^2 = -2W \sum w_i\,[y_i - y(t_i)]\sin\omega t_i \tag{A.2}$$

$$0 = \partial_c \chi^2 = -2W \sum w_i\,[y_i - y(t_i)]. \tag{A.3}$$

These conditions for the minimum give three linear equations:

$$\begin{pmatrix} \widehat{YC} \\ \widehat{YS} \\ Y \end{pmatrix} = \begin{pmatrix} \widehat{CC} & \widehat{CS} & C \\ \widehat{CS} & \widehat{SS} & S \\ C & S & 1 \end{pmatrix} \begin{pmatrix} a \\ b \\ c \end{pmatrix}$$

where the abbreviations in Eqs. (7)–(15) were applied. Eliminating $c$ in the first two equations with the last equation ($c = Y - aC - bS$) yields:

$$\begin{pmatrix} \widehat{YC} - Y\,C \\ \widehat{YS} - Y\,S \end{pmatrix} = \begin{pmatrix} \widehat{CC} - C\,C & \widehat{CS} - C\,S \\ \widehat{CS} - C\,S & \widehat{SS} - S\,S \end{pmatrix} \begin{pmatrix} a \\ b \end{pmatrix}.$$

Using again the notations of Eqs. (11)–(15) this can be written as:

$$\begin{pmatrix} YC \\ YS \end{pmatrix} = \begin{pmatrix} CC & CS \\ CS & SS \end{pmatrix} \begin{pmatrix} a \\ b \end{pmatrix}.$$
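The elimination of the constant c can be checked numerically. The following is a minimal sketch, not taken from the paper: the function name `gls_fit`, the synthetic data, and the use of NumPy are assumptions made for illustration. It builds the weighted sums at a single angular frequency, solves the full 3 × 3 normal equations with a generic linear solver, and compares the result with the closed-form solution of the reduced 2 × 2 system (Cramer's rule with determinant D = CC·SS − CS²).

```python
import numpy as np

def gls_fit(t, y, sigma, omega):
    """Weighted fit of y = a*cos(omega*t) + b*sin(omega*t) + c at one frequency."""
    w = 1.0 / sigma**2
    w = w / w.sum()                       # normalised weights, sum(w) == 1
    cos, sin = np.cos(omega * t), np.sin(omega * t)

    # Plain weighted sums (the "hatted" quantities)
    Y, C, S = w @ y, w @ cos, w @ sin
    YY_h, YC_h, YS_h = w @ (y * y), w @ (y * cos), w @ (y * sin)
    CC_h, SS_h, CS_h = w @ (cos * cos), w @ (sin * sin), w @ (cos * sin)

    # Full 3x3 normal equations, solved with a generic linear solver
    A3 = np.array([[CC_h, CS_h, C], [CS_h, SS_h, S], [C, S, 1.0]])
    full = np.linalg.solve(A3, np.array([YC_h, YS_h, Y]))

    # Reduced 2x2 system after eliminating c = Y - a*C - b*S
    YY, YC, YS = YY_h - Y * Y, YC_h - Y * C, YS_h - Y * S
    CC, SS, CS = CC_h - C * C, SS_h - S * S, CS_h - C * S
    D = CC * SS - CS**2
    a = (YC * SS - YS * CS) / D
    b = (YS * CC - YC * CS) / D
    c = Y - a * C - b * S

    # Normalised chi^2 reduction expressed through the same sums
    p = (SS * YC**2 + CC * YS**2 - 2.0 * CS * YC * YS) / (YY * D)
    return full, np.array([a, b, c]), p

# Synthetic, unevenly sampled data (illustrative values only)
rng = np.random.default_rng(0)
t = np.sort(rng.uniform(0.0, 100.0, 40))
sigma = rng.uniform(0.5, 1.5, t.size)
omega = 2.0 * np.pi / 7.3
y = 3.0 * np.cos(omega * t) - 2.0 * np.sin(omega * t) + 10.0 + rng.normal(0.0, sigma)

full, reduced, p = gls_fit(t, y, sigma, omega)
print("3x3 and 2x2 solutions agree:", np.allclose(full, reduced))
print("power p at this frequency:", p)
```

Both routes should return the same (a, b, c) up to rounding. The closed-form route needs only the sums themselves, which is why no matrix inversion is required at each frequency.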

So the solution for the parameters $a$ and $b$ is

$$a = \frac{YC \cdot SS - YS \cdot CS}{D} \quad\text{and}\quad b = \frac{YS \cdot CC - YC \cdot CS}{D} \tag{A.4}$$

where $D = CC \cdot SS - CS^2$ is the determinant of the 2 × 2 system. The amplitude of the best-fitting sine function at frequency $\omega$ is given by $\sqrt{a^2 + b^2}$. With these solutions the minimum $\chi^2$ can be written only in terms of the sums Eqs. (10)–(15) when eliminating the parameters $a$, $b$, and $c$ as shown below. With the conditions for the minimum Eqs. (A.1)–(A.3) it can be seen that:

$$\sum w_i\,[y_i - y(t_i)]\,y(t_i) = a\sum w_i\,[y_i - y(t_i)]\cos\omega t_i + b\sum w_i\,[y_i - y(t_i)]\sin\omega t_i + c\sum w_i\,[y_i - y(t_i)] = 0.$$

Therefore, the minimum $\chi^2$ can be written as:

$$\begin{aligned}
\chi^2(\omega)/W &= \sum w_i\,[y_i - y(t_i)]\,y_i - \underbrace{\sum w_i\,[y_i - y(t_i)]\,y(t_i)}_{=0} \\
&= \widehat{YY} - a\,\widehat{YC} - b\,\widehat{YS} - c\,Y \\
&= \widehat{YY} - Y\,Y - a\,(\widehat{YC} - Y\,C) - b\,(\widehat{YS} - Y\,S) \\
&= YY - a\,YC - b\,YS
\end{aligned}$$

where in the last step again the definitions of Eqs. (10)–(12) were applied. Finally $a$ and $b$ can be substituted by Eq. (A.4):

$$\chi^2(\omega)/W = YY - \frac{SS \cdot YC^2}{D} - \frac{CC \cdot YS^2}{D} + 2\,\frac{CS \cdot YC \cdot YS}{D}.$$

When now using the $\chi^2$-reduction normalised to unity:

$$p(\omega) = \frac{\chi_0^2 - \chi^2(\omega)}{\chi_0^2}$$

and the fact that $\chi_0^2 = W \cdot YY$, Eq. (5) will result.

A.2. Verification of Eq. (19)

Equation (19) can be verified with the help of trigonometric addition theorems. For this purpose $CS_\tau$ must be formulated. Furthermore, the index $\tau$ and the notation $\tilde\tau = \omega\tau$ will be used:

$$\begin{aligned}
2\,CS_\tau &= \sum w_i \sin 2(\omega t_i - \tilde\tau) - 2\sum w_i \cos(\omega t_i - \tilde\tau)\,\sum w_i \sin(\omega t_i - \tilde\tau) \\
&= \cos 2\tilde\tau \sum w_i \sin 2\omega t_i - \sin 2\tilde\tau \sum w_i \cos 2\omega t_i - 2\,(C\cos\tilde\tau + S\sin\tilde\tau)(S\cos\tilde\tau - C\sin\tilde\tau).
\end{aligned}$$

Expanding the last term yields

$$2\,CS_\tau = 2\,\widehat{CS}\cos 2\tilde\tau - (\widehat{CC} - \widehat{SS})\sin 2\tilde\tau - 2\left[C\,S\,(\cos^2\tilde\tau - \sin^2\tilde\tau) - (C^2 - S^2)\sin\tilde\tau\cos\tilde\tau\right]$$

and after factoring $\cos 2\tilde\tau$ and $\sin 2\tilde\tau$:

$$\begin{aligned}
2\,CS_\tau &= 2\,(\widehat{CS} - C\,S)\cos 2\tilde\tau - \left[\widehat{CC} - \widehat{SS} - (C^2 - S^2)\right]\sin 2\tilde\tau \\
&= 2\,CS\cos 2\tilde\tau - (CC - SS)\sin 2\tilde\tau.
\end{aligned}$$

So for $CS_\tau = 0$, $\tilde\tau = \omega\tau$ has to be chosen as:

$$\tan 2\tilde\tau = \frac{2\,CS}{CC - SS}.$$

By the way, replacing the generalised sums $CC$, $SS$ and $CS$ by the classical ones leads to the original definition for $\tau$ in Eq. (3):

$$\tan 2\omega\tau = \frac{2\,\widehat{CS}}{\widehat{CC} - \widehat{SS}} = \frac{\sum \sin 2\omega t_i}{\sum \cos 2\omega t_i}.$$

References

Balan, S. T., & Lahav, O. 2008, ArXiv e-prints, 805
Baluev, R. V. 2008, MNRAS, 385, 1279
Barning, F. J. M. 1963, Bull. Astron. Inst. Netherlands, 17, 22
Bretthorst, G. L. 2001, in Bayesian Inference and Maximum Entropy Methods in Science and Engineering, ed. A. Mohammad-Djafari, Am. Inst. Phys. Conf. Ser., 568, 241
Cumming, A. 2004, MNRAS, 354, 1165
Cumming, A., Marcy, G. W., & Butler, R. P. 1999, ApJ, 526, 890
Cumming, A., Butler, R. P., Marcy, G. W., et al. 2008, PASP, 120, 531
Ferraz-Mello, S. 1981, AJ, 86, 619
Ford, E. B., & Gregory, P. C. 2007, in Statistical Challenges in Modern Astronomy IV, ed. G. J. Babu, & E. D. Feigelson, ASP Conf. Ser., 371, 189
Gilliland, R. L., & Baliunas, S. L. 1987, ApJ, 314, 766
Horne, J. H., & Baliunas, S. L. 1986, ApJ, 302, 757
Irwin, A. W., Campbell, B., Morbey, C. L., Walker, G. A. H., & Yang, S. 1989, PASP, 101, 147
Jones, H. R. A., Butler, R. P., Tinney, C. G., et al. 2006, MNRAS, 369, 249
Koen, C. 1990, ApJ, 348, 700
Kürster, M., Endl, M., & Reffert, S. 2008, A&A, 483, 869
Lomb, N. R. 1976, Ap&SS, 39, 447
O'Toole, S. J., Butler, R. P., Tinney, C. G., et al. 2007, ApJ, 660, 1636
O'Toole, S. J., Tinney, C. G., Jones, H. R. A., et al. 2009, MNRAS, 392, 641
Press, W. H., & Rybicki, G. B. 1989, ApJ, 338, 277
Press, W. H., Teukolsky, S. A., Vetterling, W. T., & Flannery, B. P. 1992, Numerical recipes in FORTRAN. The art of scientific computing (Cambridge: University Press), 2nd edn.
Reegen, P. 2007, A&A, 467, 1353
Scargle, J. D. 1982, ApJ, 263, 835
Shrager, R. I. 2001, Ap&SS, 277, 519
Vaníček, P. 1971, Ap&SS, 12, 10
Walker, G. A. H., Walker, A. R., Irwin, A. W., et al. 1995, Icarus, 116, 359
