IJSM
Vol. 5(1), pp. 108-118, September, 2018. © www.premierpublishers.org. ISSN: 2375-0499
Research Article
Forecasting Temperatures in Bangladesh: An Application
of SARIMA Models
Md. Siraj Ud Doulah
Department of Statistics, Begum Rokeya University, Rangpur, Bangladesh.
E-mail: sdoulah_brur@yahoo.com
Climate change is presently among the most significant topics of discussion, and temperature is one of
its main components. This study observes that the minimum temperature fluctuates more than the
maximum temperature. Several candidate SARIMA models were fitted to the maximum and minimum
temperature series following the Box-Jenkins methodology. The best model, selected on the basis of
AIC, is SARIMA (1, 0, 0) (1, 1, 0)12 for maximum temperature and SARIMA (2, 0, 1) (2, 1, 0)12 for
minimum temperature. In the model validation, the predicted values fit the original data well, with
the lower and upper limits containing the bulk of the original data. The selected models are therefore
suitable for forecasting monthly maximum and minimum temperatures in Bangladesh. The selected
SARIMA models give two-year forecasts of monthly maximum and minimum temperatures that can
help decision makers establish priorities in preparing for forthcoming weather fluctuations. The
forecasts also indicate that the minimum temperature of Bangladesh will continue its upward trend.
This reflects a fluctuating climate across the entire country.
Keywords: Temperature, SARIMA, Validation, Forecasting, Bangladesh
INTRODUCTION
The most influential factors in the climate are temperature and moisture. According to a study by Oluwafemi et al. (2010), climate change seems to be one of the most important issues of the last two decades, and temperature has been identified as one of the key elements that can indicate climate change. The gradual rise in the mean temperature of the Earth's atmosphere and its oceans is referred to as global warming. It is widely believed that the changing temperature due to global warming is permanently changing the entire Earth's climate. For a long time, the biggest debate in a number of local and international forums worldwide has been whether global warming is real (Nigar and Mahedi, 2015). Some people think that global warming is not real. However, several climate scientists have carried out research and have concluded that the globe is gradually warming. People perceive the impacts of global warming differently, with some taking the necessary precautions to help reduce the rate of rising temperatures. Increases in temperature are likely to lead to a global increase in drought conditions, decreased water supplies due to evapotranspiration and an increase in urban and agricultural demand. Vital sectors of the Bangladesh economy like agriculture rely greatly on climate. Plants can grow only within certain limits of temperature. Each plant species has an optimal temperature limit for its different stages of growth and functions (Syeda, 2012). Plants also have upper and lower lethal limits between which they can properly grow. Temperature determines which species can survive in a particular region. Several farmers are, however, unaware of the changing climate and are also ignorant of the adverse impacts it will have on their livelihoods. High temperatures cause prolonged droughts, affect the amount of water in the soil, affect rainfall patterns and reduce water catchment areas. The increased temperatures can also cause outbreaks of pests and diseases that affect plants, animals and humans. Farmers who are aware of the changing climate are also helpless and unaware of what to do. They continue with poor agricultural practices like burning of wastes and poor disposal of unused fertilizers that worsen the situation by releasing greenhouse gases to the atmosphere. Studying temperature changes is thus vital for the Bangladesh economy, as agriculture, which is the country's largest source of revenue, is directly affected by the rising temperatures. The Bangladesh government derives nearly 20% of its revenue from the agricultural sector. As the largest employer in the economy, the agricultural sector accounts for about 50% of the country's
employment. In addition, more than 70% of Bangladesh's population, living in rural areas, depends on agriculture-related activities for their daily livelihoods. Climatic studies on temperature are therefore vital for the survival of the agricultural sector as the key source of revenue to the government of Bangladesh.

Temperature is one of the key elements of climate and it is important to various sectors of the economy like agriculture. Temperature affects water sources, pests that attack plants, animals and human diseases. Despite the increasing climate changes, the majority of Bangladeshi citizens are still not well informed. Analyzing and forecasting temperature changes will thus help various stakeholders and the government to plan in advance in order to counter climate-related disasters. The objective of this research is to build a time series model and use this model to analyze and forecast the variation in maximum and minimum temperature in Bangladesh in order to inform stakeholders who depend directly or indirectly on it to plan in advance.

METHODOLOGY
Average maximum and minimum monthly temperature data covering Bangladesh were collected from the Bangladesh Meteorological Department (BMD). The data were recorded on a monthly basis covering an 18-year period from January 2000 to December 2017 (www.data.gov.bd). The temperatures are measured in degrees Celsius. The temperature data is a continuous univariate time series as it contains a single variable (temperature) which is measured at every instant of time. However, this data was merged into monthly intervals, transforming it to a discrete univariate time series.

Seasonal ARIMA (SARIMA) Model
Gurudeo and Mahbub (2010) note that most natural factors like temperature have strong seasonal components. It is therefore necessary to use autoregressive and moving average polynomials that identify with the seasonal lags. One such model is the SARIMA model. The SARIMA model is an extension of the ARIMA model and it is applied when the series contains both seasonal and non-seasonal behavior. The SARIMA model is sometimes called the multiplicative seasonal autoregressive integrated moving average model and is denoted by SARIMA (p,d,q)(P,D,Q)S. The seasonal AR can be written as:
$$\Phi_P(B^s)\, y_t = \varepsilon_t$$
The seasonal MA can be written as
$$y_t = \Theta_Q(B^s)\, \varepsilon_t$$
The seasonal differencing is expressed as
$$(1 - B^s)\, y_t = y_t - y_{t-s}$$
Combining the above equations, we get SARIMA
$$\phi_p(B)\, \Phi_P(B^s)\, (1 - B)^d (1 - B^s)^D\, y_t = \theta_0 + \theta_q(B)\, \Theta_Q(B^s)\, \varepsilon_t$$
where the constant equals
$$\theta_0 = \mu\, [(1 - \phi_1 - \cdots - \phi_p)(1 - \Phi_1 - \cdots - \Phi_P)]$$
Here p represents the non-seasonal AR order, d the non-seasonal differencing, q the non-seasonal MA order, P the seasonal AR order, D the seasonal differencing, Q the seasonal MA order, and S the seasonal order (the s in $B^s$; for monthly data S = 12). $y_t$ represents the time series data at period t, B is the backward shift operator ($B^k y_t = y_{t-k}$) and $\varepsilon_t$ is the random shock (white noise error).
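The differencing operators in the SARIMA definition, $(1 - B)^d$ and $(1 - B^s)^D$, can be illustrated with a minimal sketch in plain Python. The function names `difference` and `sarima_difference` and the toy series are hypothetical and are not taken from the study; the sketch only shows how seasonal differencing with s = 12 removes a repeating monthly pattern.

```python
def difference(series, lag=1):
    """Apply (1 - B^lag): return y_t - y_{t-lag} for every t where both exist."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]


def sarima_difference(series, d=0, D=1, s=12):
    """Apply (1 - B)^d (1 - B^s)^D to a series (illustrative helper)."""
    out = list(series)
    for _ in range(D):          # seasonal differencing, D times
        out = difference(out, lag=s)
    for _ in range(d):          # ordinary differencing, d times
        out = difference(out, lag=1)
    return out


# Toy monthly series (hypothetical): linear trend plus a 12-month pattern.
monthly = [2 * t + (t % 12) for t in range(36)]

# One seasonal difference removes the repeating pattern and leaves the
# year-on-year change of the trend (2 * 12 = 24).
seasonally_differenced = sarima_difference(monthly, d=0, D=1, s=12)
print(seasonally_differenced[:5])
```

Each seasonal difference shortens the series by s observations, so with 18 years of monthly data one seasonal difference leaves 17 years of usable values.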
The Box-Jenkins Method
This study follows the Box-Jenkins methodology for modeling. The following conceptual framework proposed by Box et al. (1976) is considered in this study.

Figure 1. Box-Jenkins ARIMA Model

Stationarity Analysis
One of the important types of data used in empirical analysis is time series data. Empirical work based on time series data assumes that the underlying time series is stationary. In this section we briefly discuss stationary and non-stationary time series. A stochastic process is said to be stationary if its mean and variance are constant over time and the covariance between two periods depends only on the lag between them; otherwise it is non-stationary. Why are stationary time series so important? Because if a time series is non-stationary, we can study its behavior only for the time period under consideration. Each set of time series data will therefore be for a particular episode. As a consequence, it is not possible to generalize it to other time periods. Therefore, for the purpose of forecasting, such (non-stationary) time series may be of little practical value. How do we know that a particular time series is stationary? There are several tests of stationarity. Here we used a graphical test and recognized analytical tests. Graphical test: if we depend on common sense, it would seem that the time series depicted in figure
The sample autocovariance at lag k is
$$c_k = \frac{1}{n} \sum_{t=1}^{n-k} (y_t - \mu)(y_{t+k} - \mu)$$
If $y_t$ is a stationary process with mean $\mu$, the autocorrelation of order k is simply the correlation between $y_t$ and $y_{t+k}$. The ACF at lag k is thus defined as
$$\rho_k = \frac{E\{(y_t - \mu)(y_{t+k} - \mu)\}}{E\{(y_t - \mu)^2\}}$$
The PACF of a stationary process $y_t$, denoted $\phi_{hh}$, is
$$\phi_{11} = \operatorname{corr}(y_{t+1}, y_t) = \rho(1)$$
$$\hat{\phi}_{p+1,p+1} = \frac{r_{p+1} - \sum_{j=1}^{p} \hat{\phi}_{pj}\, r_{p+1-j}}{1 - \sum_{j=1}^{p} \hat{\phi}_{pj}\, r_j}, \qquad p = 2, 3, \ldots$$

Kwiatkowski-Phillips-Schmidt-Shin (KPSS) test
To test whether the series has a deterministic trend or a stochastic trend, we use the KPSS (Kwiatkowski, Phillips, Schmidt and Shin) test (1992).
$H_0: Y_t \sim I(0)$   Level (or trend) stationary
$H_1: Y_t \sim I(1)$   Difference stationary
STEP 1: Regress $Y_t$ on a constant and trend and construct the OLS residuals $(\varepsilon_1, \varepsilon_2, \ldots, \varepsilon_T)$.
STEP 2: Obtain the partial sum of the residuals.
$$S_t = \sum_{i=1}^{t} \varepsilon_i$$
STEP 3: Obtain the test statistic
$$KPSS = T^{-2} \sum_{t=1}^{T} \frac{S_t^2}{\hat{\sigma}^2}$$
where $\hat{\sigma}^2$ is the estimate of the long-run variance of the residuals.
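The three KPSS steps can be sketched in plain Python as follows. This is only an illustration under stated assumptions: the long-run variance is estimated here with a Bartlett-kernel (Newey-West type) estimator and a user-chosen number of lags, which the text does not specify, and the function name `kpss_trend_statistic` and the five-point toy series are hypothetical.

```python
def kpss_trend_statistic(y, lags=0):
    """KPSS statistic for trend stationarity (illustrative sketch).

    Step 1: OLS regression of y on a constant and a linear trend.
    Step 2: partial sums S_t of the residuals.
    Step 3: T^-2 * sum(S_t^2) / long-run variance.
    The Bartlett-kernel long-run variance is an assumption of this sketch.
    """
    T = len(y)
    t = list(range(1, T + 1))
    t_mean = sum(t) / T
    y_mean = sum(y) / T

    # Step 1: closed-form OLS for y = intercept + slope * t + residual
    sxx = sum((ti - t_mean) ** 2 for ti in t)
    sxy = sum((ti - t_mean) * (yi - y_mean) for ti, yi in zip(t, y))
    slope = sxy / sxx
    intercept = y_mean - slope * t_mean
    resid = [yi - (intercept + slope * ti) for ti, yi in zip(t, y)]

    # Step 2: partial sums of the residuals
    partial_sums = []
    running = 0.0
    for e in resid:
        running += e
        partial_sums.append(running)

    # Long-run variance: gamma_0 plus Bartlett-weighted autocovariances
    long_run_var = sum(e * e for e in resid) / T
    for j in range(1, lags + 1):
        gamma_j = sum(resid[i] * resid[i - j] for i in range(j, T)) / T
        long_run_var += 2.0 * (1.0 - j / (lags + 1)) * gamma_j

    # Step 3: the test statistic
    return sum(s * s for s in partial_sums) / (T ** 2 * long_run_var)


# Toy series (hypothetical, not the temperature data)
toy = [1, 3, 2, 4, 3]
print(kpss_trend_statistic(toy, lags=0))
```

Large values of the statistic count against the null hypothesis of (trend) stationarity, so the statistic is compared with upper-tail critical values.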
Model Diagnostic
Forecasting is important in the decision-making process (Brockwell et al. 2002; Box et al. 1976). The chosen model should therefore produce accurate forecasts. The selected model does not necessarily provide the best forecasts; it is therefore important to apply other measures such as MAE, MSE and MAPE to confirm the forecasting accuracy of the model.

For an ARMA process with mean $\mu_y$, the m-step-ahead forecasts can be defined as
$$\tilde{y}_{n+m} = \mu_y + \sum_{j=m}^{\infty} \psi_j\, \varepsilon_{n+m-j}$$
The precision of the forecast is assessed with a prediction interval of the form
$$\tilde{y}^{\,n}_{n+m} \pm C_{\alpha/2} \sqrt{P^{\,n}_{n+m}}$$

Data Analysis

Table 1: Summary statistics of maximum and minimum temperatures
Temperature (°C)   Range   Minimum Value   Maximum Value   Mean   Standard Deviation   Variance
Maximum            11.9    23.02           34.92           30.7   2.7                  7.3
Minimum            15.8    11.09           26.86           21.4   4.8                  23.2

From Table 1, it is observed that the minimum temperature varies more (standard deviation = 4.8) than the maximum temperature (standard deviation = 2.7).
Minimum temperature
Figure 5: Autocorrelation and Partial Autocorrelation
function
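As a rough illustration of how the quantities plotted in an ACF/PACF figure are obtained, the sketch below computes the sample autocorrelations and then the partial autocorrelations with the Durbin-Levinson recursion. The function names `sample_acf` and `sample_pacf` and the toy data are hypothetical; the study's own figures were produced with statistical software that the text does not name.

```python
def sample_acf(y, max_lag):
    """Sample autocorrelations r_0..r_max_lag, using c_k / c_0 with divisor n."""
    n = len(y)
    mean = sum(y) / n
    dev = [v - mean for v in y]
    c0 = sum(d * d for d in dev) / n
    acf = []
    for k in range(max_lag + 1):
        ck = sum(dev[t] * dev[t + k] for t in range(n - k)) / n
        acf.append(ck / c0)
    return acf


def sample_pacf(y, max_lag):
    """Partial autocorrelations phi_hh for h = 0..max_lag (Durbin-Levinson)."""
    r = sample_acf(y, max_lag)
    pacf = [1.0]
    phi_prev = []  # coefficients phi_{p-1,1..p-1} from the previous order
    for p in range(1, max_lag + 1):
        if p == 1:
            phi_pp = r[1]
            phi = [phi_pp]
        else:
            num = r[p] - sum(phi_prev[j] * r[p - 1 - j] for j in range(p - 1))
            den = 1.0 - sum(phi_prev[j] * r[j + 1] for j in range(p - 1))
            phi_pp = num / den
            # update the lower-order coefficients, then append phi_pp
            phi = [phi_prev[j] - phi_pp * phi_prev[p - 2 - j]
                   for j in range(p - 1)] + [phi_pp]
        pacf.append(phi_pp)
        phi_prev = phi
    return pacf


# Toy data (hypothetical, not the temperature series)
toy = [1, 2, 3, 4, 5]
print(sample_acf(toy, 2))
print(sample_pacf(toy, 2))
```

For monthly data, spikes at lags 12, 24, ... in these sequences point to the seasonal orders, while spikes at low lags point to the non-seasonal orders.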
Table 2: Augmented Dickey-Fuller (ADF) Test
Temperatures   Dickey-Fuller   Lag order   Significance level   p-value
Maximum        -2.8429         12          0.05                 .2229
Minimum        -2.4787         12          0.05                 .3755

According to the ADF test results for the maximum and minimum temperature series shown in Table 2, we do not reject the null hypothesis of a unit root, since both p-values (.2229 and .3755) exceed 0.05, and conclude that the two series are not stationary. The more negative the Dickey-Fuller statistic, the stronger the rejection of the null hypothesis, and the statistics obtained here are not negative enough.
For the maximum temperature series, we do not reject the null hypothesis because the p-value of 0.1 ≥ 0.05 at the 5% level of significance. Thus the maximum temperature is trend stationary. For the minimum temperature series, we likewise do not reject the null hypothesis because the p-value of 0.1 ≥ 0.05 at the 5% level of significance. Thus the minimum temperature is trend stationary.

Model building for monthly temperature series

According to Shumway and Stoffer (2006), the process of model fitting involves data plotting, data transformation if necessary, identification of dependence orders, parameter estimation, diagnostic analysis and choosing an appropriate model. In this section, a univariate SARIMA methodology is used to model maximum and minimum monthly temperatures of Bangladesh.

Model Identification
ACF and PACF plots are used in the identification of the values p, q, P and Q. For the non-seasonal part, spikes of the ACF at low lags are used to identify the value of q while the value of p is identified by observing the spikes at low lags of the PACF. For the seasonal part, the value of Q is observed from the ACF at lags that are multiples of S while for P, the PACF is observed at lags that are multiples of S. Looking at the ACF and PACF plots for the maximum and minimum differenced time series, the models suggested are listed in Table 4.

Table 4: Suggested Models
Maximum temperature             Minimum temperature
SARIMA (0, 0, 0) (0, 1, 0)12    SARIMA (2, 0, 2) (1, 1, 0)12
SARIMA (0, 0, 1) (0, 1, 0)12    SARIMA (0, 0, 0) (0, 1, 0)12
SARIMA (1, 0, 0) (0, 1, 0)12    SARIMA (1, 0, 0) (1, 1, 0)12
SARIMA (0, 0, 1) (1, 1, 0)12    SARIMA (2, 0, 1) (1, 1, 0)12
SARIMA (1, 0, 0) (1, 1, 0)12    SARIMA (2, 0, 1) (2, 1, 0)12

The results of estimating the aforementioned models for both maximum and minimum temperatures are shown in Table 5.

Table 5: Suggested Models Estimation Results
Maximum temperature
Model                           P-value    Chi-square   DF   AIC
SARIMA (0, 0, 0) (0, 1, 0)12    3.49e-07   40.582       6    592.35
SARIMA (0, 0, 1) (0, 1, 0)12    .03504     13.544       6    574.97
SARIMA (1, 0, 0) (0, 1, 0)12    .174       8.993        6    569.15
SARIMA (0, 0, 1) (1, 1, 0)12    .0963      10.754       6    532.07
SARIMA (1, 0, 0) (1, 1, 0)12    .3157      7.0564       6    527.79
Minimum temperature
Model                           P-value    Chi-square   DF   AIC
SARIMA (2, 0, 2) (1, 1, 0)12    .3899      6.3049       6    526.6
SARIMA (0, 0, 0) (0, 1, 0)12    .00153     21.412       6    557.52
SARIMA (1, 0, 0) (1, 1, 0)12    .3157      7.0564       6    527.79
SARIMA (2, 0, 1) (1, 1, 0)12    .3521      6.673        6    531.37
SARIMA (2, 0, 1) (2, 1, 0)12    .7639      3.349        6    505.45

The best model is the one with the lowest value of AIC. From Table 5, the best model for maximum temperature is SARIMA (1, 0, 0) (1, 1, 0)12 while for minimum temperature it is SARIMA (2, 0, 1) (2, 1, 0)12. For maximum temperature, the lowest value of AIC is 527.79 and the Ljung-Box test yielded a chi-square of 7.0564 with a p-value of 0.3157. Since 0.3157 > 0.05, the Ljung-Box test confirms that SARIMA (1, 0, 0) (1, 1, 0)12 is adequate for forecasting maximum temperature. For minimum temperature, the lowest value of AIC is 505.45 and the Ljung-Box test yielded a chi-square of 3.349 with a p-value of 0.7639. Since 0.7639 > 0.05, the Ljung-Box test confirms that SARIMA (2, 0, 1) (2, 1, 0)12 is adequate for forecasting minimum temperature.

Parameter Estimation

Non-linear least-squares estimation or maximum likelihood estimation methods are employed to estimate the coefficients of the models. A more complicated iteration procedure is required when estimating the parameters of SARIMA models (Box et al. 1976; Chris 2004).

Table 6: Selected Models Parameter Estimates
SARIMA (1, 0, 0) (1, 1, 0)12 for maximum temperature
Parameter   Estimate   Std. error
AR(1)       .3476      .0675
SAR(1)      -.4498     .0633
SARIMA (2, 0, 1) (2, 1, 0)12 for minimum temperature
Parameter   Estimate   Std. error
AR(1)       -.6536     .0706
AR(2)       .2717      .0702
MA(1)       1.00       .0166
SAR(1)      -.6165     .0689
SAR(2)      -.3595     .0694

From Table 6 we observe that the models are estimated well, because the standard errors of the estimated parameters are very low.

Diagnostic Analysis

For well-fitted models, here SARIMA (1, 0, 0) (1, 1, 0)12 for maximum temperature and SARIMA (2, 0, 1) (2, 1, 0)12 for minimum temperature, the standardized residuals estimated from the models should behave as an independently and identically distributed sequence with zero mean and constant variance. The assessment of normality is shown in the following figures.
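The Ljung-Box p-values and the AIC-based choice reported in Table 5 can be cross-checked with a short sketch. It relies on the closed-form upper-tail probability of a chi-square distribution with an even number of degrees of freedom (DF = 6 in Table 5); the chi-square and AIC values are copied from Table 5, while the function and variable names are hypothetical.

```python
import math


def chi2_sf_even_df(x, df):
    """Upper-tail probability P(X > x) for a chi-square with even df.

    For df = 2k: P(X > x) = exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form needs a positive even df")
    half = x / 2.0
    term, total = 1.0, 1.0
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total


# (Ljung-Box chi-square, AIC) per candidate model, values from Table 5
max_temp = {
    "(0,0,0)(0,1,0)12": (40.582, 592.35),
    "(0,0,1)(0,1,0)12": (13.544, 574.97),
    "(1,0,0)(0,1,0)12": (8.993, 569.15),
    "(0,0,1)(1,1,0)12": (10.754, 532.07),
    "(1,0,0)(1,1,0)12": (7.0564, 527.79),
}
min_temp = {
    "(2,0,2)(1,1,0)12": (6.3049, 526.6),
    "(0,0,0)(0,1,0)12": (21.412, 557.52),
    "(1,0,0)(1,1,0)12": (7.0564, 527.79),
    "(2,0,1)(1,1,0)12": (6.673, 531.37),
    "(2,0,1)(2,1,0)12": (3.349, 505.45),
}


def select_by_aic(candidates, df=6, alpha=0.05):
    """Return (model, AIC, Ljung-Box p-value, adequate?) for the lowest AIC."""
    best = min(candidates, key=lambda name: candidates[name][1])
    chi_square, aic = candidates[best]
    p_value = chi2_sf_even_df(chi_square, df)
    return best, aic, p_value, p_value > alpha


print(select_by_aic(max_temp))
print(select_by_aic(min_temp))
```

A p-value above 0.05 means the residual autocorrelations are jointly not significant, which is the sense in which the selected models are called adequate.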
shows that the residuals are white noise. The Ljung-Box test p-value also indicates that the residual autocorrelations are not significant. From the above tests, it is clear that the fitted model is adequate since the residuals are white noise. That is, SARIMA (2, 0, 1) (2, 1, 0)12 is adequate for modeling the monthly minimum temperature series in Bangladesh.

Model validation & Forecasting

In order to test the adequacy and predictive ability of the chosen models, the actual data sets, predicted values, and lower and upper limits are plotted and displayed in Figures 10 and 11. The graphs show that the predicted values fit the original data well, with the lower and upper limits containing the majority of the original data. This indicates that the models chosen for the maximum and minimum temperature series are the best fitted ones for the data sets.
Table 7 reports the observed values against the predicted values, together with the noise residuals, which affirm
the adequacy of the chosen maximum and minimum temperature time series models.
Forecasting
Forecasting helps in the planning and decision-making process since it gives insight into future uncertainty using the
past and current behavior of given observations. Further accuracy measures such as MAE, MAPE and RMSE must therefore
be computed for the model. Table 8 shows a summary of ME, RMSE and MAE for both maximum and minimum
temperature models.
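The accuracy measures named above can be computed as in the following minimal sketch. The function name `forecast_accuracy` and the three observed and predicted values are hypothetical and are not taken from the study's tables; errors are defined here as observed minus predicted.

```python
import math


def forecast_accuracy(observed, predicted):
    """Return ME, MAE, RMSE and MAPE (in percent) for paired values."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("need two non-empty sequences of equal length")
    errors = [o - p for o, p in zip(observed, predicted)]
    n = len(errors)
    return {
        "ME": sum(errors) / n,                              # mean error (bias)
        "MAE": sum(abs(e) for e in errors) / n,             # mean absolute error
        "RMSE": math.sqrt(sum(e * e for e in errors) / n),  # root mean squared error
        # MAPE is undefined if an observed value is zero
        "MAPE": 100.0 * sum(abs(e) / abs(o) for e, o in zip(errors, observed)) / n,
    }


# Hypothetical monthly temperatures in degrees Celsius
observed = [30.0, 32.0, 28.0]
predicted = [29.0, 33.0, 28.0]
print(forecast_accuracy(observed, predicted))
```

ME close to zero indicates little systematic bias, while MAE and RMSE are in degrees Celsius and can be compared directly across the maximum and minimum temperature models.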
The monthly temperatures are forecast from the selected models, SARIMA (1, 0, 0) (1, 1, 0)12 for maximum
temperature and SARIMA (2, 0, 1) (2, 1, 0)12 for minimum temperature. The outcomes are shown in the following Table 9.
CONCLUSIONS
To sum up, both maximum and minimum temperatures are unsteady from month to month, though the minimum temperature fluctuates more (standard deviation = 4.8) than the maximum temperature (standard deviation = 2.7). However, through the years, there is a distinct increasing trend for minimum temperature, supporting the view that global warming is a reality. On the other hand, maximum temperatures seem to be comparatively steady. Both maximum and minimum temperature series exhibited high seasonality. The minimum temperature also displayed an upward trend. The two series were made stationary. The best model for maximum temperature is SARIMA (1, 0, 0) (1, 1, 0)12 and for minimum temperature is SARIMA (2, 0, 1) (2, 1, 0)12. The model residuals for both series are near normality, as most points fall on the straight line with a few close to it. From the model validation results, the projected values fit the original data well, with the lower and upper limits encompassing the bulk of the original data. The selected SARIMA models give two-year projected monthly maximum and minimum temperatures that can help decision makers to establish priorities for equipping themselves against upcoming weather changes. The forecasts also show that the minimum temperature of Bangladesh will continue with the upward trend.

ACKNOWLEDGEMENTS
The author would like to thank the anonymous reviewers for their helpful comments.

REFERENCES

Aidoo E (2010). Modeling and forecasting inflation rates in Ghana: An application of SARIMA models. Högskolan Dalarna, School of Technology and Business Studies.
Brockwell, Peter J, Davis RA (2002). Introduction to Time Series and Forecasting, 2nd edn. Springer-Verlag.
Box GEP, Jenkins GM, Reinsel GC (1976). Time Series Analysis, Forecasting and Control, 3rd edn. Prentice Hall, Englewood Cliffs, NJ.
Burnham KP, Anderson DR (1998). Model Selection and Inference. Springer Verlag, New York.
Corliss D (2009). Time series Analysis: an introduction using Base SAS and STAT, New York, U.S.A.
Chris C (2004). The Analysis of Time Series: An Introduction, John Wiley & Sons, New York, U.S.A.
Gurudeo AT, Mahbub I (2010). Time Series Analysis of Rainfall and Temperature Interactions in Coastal Catchments, J of Math. and Stat. 6(3): 372-380.
Nigar S, Mahedi H (2015). Forecasting Temperature in the Coastal Area of Bay of Bengal-An Application of Box-Jenkins Seasonal ARIMA Model, Civil and Enviro. R. 7(8): 149-159.