Documenti di Didattica
Documenti di Professioni
Documenti di Cultura
Automation in Construction
j o u r n a l h o m e p a g e : w w w. e l s ev i e r. c o m / l o c a t e / a u t c o n
Fluor Canada, 55 Sunpark Plaza SE, Calgary, Alberta T2X 3R4, Canada
Department of Civil and Environmental Engineering, University of Alberta 3-014 Markin/CNRL Natural Resources Engineering Facility, Edmonton, Alberta T6G 2M7, Canada
a r t i c l e
i n f o
Article history:
Accepted 21 November 2008
Keywords:
Forecasting
Infrastructure construction
Simulation modelling
Time-series analysis
Project estimation
a b s t r a c t
This paper presents BoxJenkins methods used to develop a forecasting methodology that can be integrated
within a simulation modelling environment. The resulting tool can be used by project managers in the capital
planning of large-scale infrastructure construction projects. The advantages of the BoxJenkins method with
respect to accuracy, systemization, and error dependency are outlined in relation to this type of project,
generally manifested in a time-series model. As well, the paper sets out the relevant algorithms involved in
BoxJenkins forecast modelling, and the relationship among them. In order to adapt BoxJenkins
methodologies for engineering problems, classical BoxJenkins methods must be simplied and then
automated and integrated with the aid of computer technology. The paper then details the development of
the integrated modelling template using MATLAB, tracking the necessary elements of the program.
2008 Elsevier B.V. All rights reserved.
1. Introduction
Civil projects present a problem to the construction industry due to
their lengthy execution and because the risk and uncertainty of these
projects exceed those of other project types. Transportation systems,
which require large amounts of funding to construct and maintain,
and which last for a long time period, are the most frequently studied
items in capital planning. In view of shrinking infrastructure budgets
(as documented by Hastak and Baim [5]), the capital planning of
infrastructure systems relies heavily on the prediction and estimation
of future outcomes. Uncertainty associated with such factors as future
transportation needs, future economic developments, and future
market condition variations must be estimated in sufcient detail to
ensure that related planning can be founded upon a realistic and
reasonable base. Forecasting methodologies are a valuable tool to
construction managers for better understanding future outcomes.
However, complex project environments demand an accurate and
adaptable method for performing expected risk and uncertainty
analysis, a need which is not satised by existing forecasting methods.
There are several challenges involved in forecasting and estimation
for capital planning. Canada et al. [3] mention three: rst, the unavoidability of forecasting of critical elements associated with the
development of a project; second, the inherent uniqueness of each
project, which makes the process of estimating unique to each project;
and third, limited information available from the previous forecasting work. However, information about past outcomes in similar
and related projects may be used to estimate potential outcomes.
Corresponding author.
E-mail address: abourizk@ualberta.ca (S.M. AbouRizk).
0926-5805/$ see front matter 2008 Elsevier B.V. All rights reserved.
doi:10.1016/j.autcon.2008.11.007
548
2t 2
: : : 1
where:
Zt
the present stationary observation
Zt 1, Zt 2,, t 1, t 2, the past observations and forecasting error
for the stationary time series (not original time series)
t
the present forecasting error (the expected value is set
equal to 0).
Depending on the complexity of the real system under analysis, the
above ARIMA function may be formulated as simple or complicated,
which means that less or more past observations or forecast errors
will be included in order to calculate the current condition.
After identifying the above function for a specic problem and
estimating all potential model parameters, we arrive at a deterministic
relationship function. Moving the time one unit ahead, we obtain a
relationship among future conditions and current available information. Thus, the tted ARIMA model function can be used to forecast
future outcomes.
An estimated future value comes from a calculation using stationary
time-series data. Therefore, this is an estimate and not a real future
value. To retrieve the original value, a series of back-transformations
must be performed. These operations mirror the operations of former
transformation endeavours.
In the ARIMA model function t, the present forecasting error will
be transformed to t + 1 when the time point is moved one unit ahead.
In this situation, it is assumed that the error is approximately equal to
zero to obtain point estimate value. Therefore, for the calculation of
future value, one must replace t + 1 with zero.
Y
T
t
i
j
p
q
In theory, BoxJenkins models can only process stationary timeseries data. A stationary time-series is a series of univariate data that
does not contain trends or seasonality; that is, it uctuates around a
constant mean. However, the original data collected from real job
sites are virtually non-stationary. Therefore, certain transformation
methods should be applied in order to transform original data into
stationary data. Frequently used transformation methods for engineering problems include n-order normal differencing, where n represents the order of differencing; and n-order seasonal differencing,
where n represents the order of seasonal differencing. Natural
3.1. Issues
549
These ideas can be implemented in an integrated way by developing the necessary algorithms within a Simphony SPS template.
Please see Appendix A for the detailed codes used for BoxJenkins
forecasting and simulation in the application. This template contains
all the functional elements for the different available modelling procedures. The expected integrated and automatic BoxJenkins forecasting modelling can then be realised.
4. Development of the integrated forecasting system
In general, this system employs the powerful computing capabilities of
MATLAB-based computer programs in order to realise the algorithms
associated with BoxJenkins methodologies. These programs are integrated into a computer modelling tool, which is represented by a SPS
simulation template in Simphony. All the elements in this template can be
used together or independently to accommodate specic modelling
requirements. Using the automated forecasting system, managers can
reduce the total time and cost spent on the complete modelling process
and dramatically decrease subsequent forecasting work.
4.1. Core algorithms and program development
The required algorithms associated with a complete execution of
BoxJenkins forecasting modelling are identied as follows:
550
cedure if a special integration system is developed. Based on the foundational ideas of a proposed SPS template discussed above, the following
simplications and modications to rational modelling methodologies
have been implemented to further the template's development.
551
If the t ratio for a parameter is signicantly greater than a predetermined critical value (usually N2, for =0.05), it is retained. Otherwise,
the ratio ought not to be employed and the model requires a recalculation
using the remaining terms. This condition checking is implemented by a
developed MATLAB program parameters estimation.
4.2.3.5. The model should be presented in its simplest form. Past
research and application studies have proven that the best BoxJenkins
model always has the fewest number of parameters. This paradigm
arises not only due to the ease in tting and explanation of simple
models, but also since minimal models avoid the problem of parameter
redundancy.
This condition applies when a traditional individual model identication approach is used to select ARIMA models. Since in this case only
15 low-order ARIMA models are chosen for real data tting and for
parameter estimating, and since these models are already in simplied
form, a diagnostic check of this condition is not needed.
4.2.3.6. The model should have a small root mean square error (RMSE).
Several good models can be generated even after all of the above checks
are completed. To identify the best of them, one should check the RMSE.
The model with the smallest RMSE is the best model.
SSE =
n
X
t =1
et
552
RMSE =
s
p
SSE
= MSE
n np
where:
n
np
et
Please see Appendix B, the SPS template User Manual, for a detailed
description of the elements, and their various data inputs and outputs.
4.3.2. Algorithm design and reference
As mentioned above, each of the ve elements has a specic function.
The elements can be used all together to constitute a BoxJenkins
Forecasting Model for performing an automatic modelling process. The
goal of this research was to design a template to realise these
functionalities and to facilitate the communication and cooperation
among these elements.
In Simphony SPS template development, Simphony uses Sax Basic as
its programming language. This language has a similar syntax to Visual
Basic, another advanced programming language; thus in theory, any
functionality can be realised through proper Visual Basic programming.
In order to realise the functionalities for automatic BoxJenkins
forecasting modelling, the programmer must design programs for each
of the algorithms summarised in the previous sections. The computerised design of algorithms proves to be difcult even for the
professional mathematician. In this research, therefore, we adopted a
straightforward process for designing algorithms using MATLAB
programming language and integrating these algorithms into a Simphony SPS template. The most difcult stage of the programming can be
performed by a third-party tool. The remainder of the work involves
seeking a means of referencing these algorithm programs in Simphony
and discovering how to facilitate the communication among them.
The solution for these problems is to package and compile all the
MATLAB algorithms into a third-party Type Library le and design the
Simphony codes to refer to these library les for the realisation of
these algorithms (see Appendix A). The implementation of this solution
is not as difcult as one initially supposed, due to the powerful toolbox
offered in MATLAB 13.0, which has the exact functionality required for
packaging and compiling algorithm programs. MATLAB COM Builder, a
special toolbox used for the creation of stand-alone computing programs, was provided with MATLAB 13 Version 0. This toolbox uses a
simple operation interface to facilitate complicated COM le compiling
process, in order to enable the non-professional programmer to create
COM les in a professional way.
The MATLAB COM Builder toolbox may be used to package these
algorithms les into one deployable COM le. All the algorithms, after
compiling, will become built-in functions to which Simphony devel-
553
opment codes will refer. Detailed instructions for using this toolbox
and for referencing the compiled COM les for the Visual Basic
programming environment are provided in the MATLAB 13.0 Products
Help les.
4.3.3. Template development and programming
The SPS template for automatic BoxJenkins forecasting modelling
was developed with the help of the Simphony Developer's Guide and
Simphony User's Guide. All programming codes comply with Sax
Basic programming standards, which are similar to those in Visual
Basic programming. Interested readers are asked to refer to the SPS
template for detailed coding and explanations (Appendix A).
In order to facilitate communication and cooperation between
different elements in this template, certain public variables are set up
for storing the interim results from each element. These results occur
when a certain kind of operation or calculation is performed. The
results are retrieved from these public variables when needed by a
successor element for its modelling procedure.
4.3.4. Validation
For validating the model, a set of data was identied and selected
from the website of the U.S. Census Bureau, Construction Section
(http://www.census.gov/const/www/). We selected data from the U. S.
Federal Construction Spending on Non-residential Projects, which
focused primarily on infrastructure system spending. These data were
collected monthly from January 1993 until July 2003 without seasonal
adjustments.
The monthly data from January 1993 to July 2003 were organized as
time-series data, situating the most recent data set collected (July 2003)
as the last data point and the oldest data collected (Jan 1993) as the rst
point. In this way, an original time-series can be generated. Based on the
graphical observation of original time-series data, one identied a strong
seasonality that appeared to control, to a certain degree, the uctuation of
monthly spending along a time axis. A slight upward trend is also present
which may point to a continuous increase in the federal construction
investment on non-residential projects over subsequent years.
Three forecasting methods were selected for the comparison:
Moving Average, Regression, and Exponential Smoothing. These three
methods are currently the most popular numerical methods available
for forecasting and estimation in capital planning. Comparing the
methods graphically, one can easily observe the difference between four
forecasting methods in terms of forecasting accuracy and tting quality.
Other than the method using Box Jenkins methodologies, the methods
all fail to grasp the complex variation of an original time-series,
especially the seasonal uctuation cycle that occurs each September.
4.3.5. Application and case study
The developed process for automated BoxJenkins modelling was
applied to the forecasting and management of a real capital project
case study in Edmonton, Alberta: the South LRT expansion. This case
study applied Box Jenkins-based forecasting and simulation technologies to perform cash ow prediction and risk analysis for the City of
Edmonton South LRT Section 1B Project. The analysis results were
used to verify and improve the original cash ow prediction for project
planning purposes. Compared to the original analysis from the local
engineering consulting company, the new analysis derived the
variation of project cost ination from historical data, instead of
from personal experience. It provided unique risk-based statistical
information for project planning and scenario analysis, which enabled
it to verify and improve upon the original analysis. This case study will
be presented in greater detail in a future paper.
5. Conclusions
This paper discussed in detail the research on the development
of automated BoxJenkins forecasting modelling and the
554
stats = x. sqrt(n);
end
% Rule of Thumb for critical level of t-test
crit = 2;
% Determine if the actual signicance exceeds the desired signicance
h = 0;
n = 0;
for i = 2:size
if abs(stats(i)) N = crit
n = n + 1;
end
end
if n N 0
h = 1;
end
Q test
Function [H, pValue, Qstatistic, criticalValue] = qtest(Series)
% This function perform Q test for input series
% Input:
%
Series: input series
% Output:
%
H: binary value for Q test. 1 pass and 0 not pass
%
pValue: signicance levels
%
Qstatistic: Q test statistics
%
criticalValue: critical value
[H, pValue, Qstatistic, criticalValue] = lbqtest(Series);
if H = 0
H = 1;
else
H = 0;
end
Parameters estimation
Function[Parameters,SEE,Innovations,ISCondition,Converge] = estimating(ROrder,MOrder,Series)
% This function calculate the parameters for ARIMA models and
output the
% parameter sets, warning message and other useful information
% Input:
%
ROrder: order of AR model
%
MOrder: order of MA model
%
Series: input series
% Output:
%
Parameters: tted ARIMA model parameters
%
SEE: squared estimation error
%
Innovations: estimation residuals
% ISCondition: binary value for Stationarity/Invertiblity conditions.
%
0 satised 1 not satised
% Converge: binary value for converge condition. 1 converge 0 not
%
converge
if ROrder = 0
InitAR = [];
else
for i = 1:ROrder
InitAR(i) = 0.1;
end
end
if MOrder = 0
InitMA = [];
else
for i = 1:MOrder
555
InitMA(i) = 0.1;
end
end
[r,c] = size(Series);
InitConstant = mean(Series,1) (1 0.1 ROrder);
spec = garchset(R,ROrder,M,MOrder,C,InitConstant,AR,InitAR,MA,InitMA,Display,Off);
[Coeff,Errors,LLF,Innovations,Sigma,Summary]=garcht(spec,Series);
if ROrder = 0
for i = 1:3
RPara(i) = 0;
end
else
RPara = garchget(Coeff,AR);
for i = ROrder + 1:3
RPara(i) = 0;
end
end
if MOrder = 0
for i = 1:3
MPara(i) = 0;
end
else
MPara = garchget(Coeff,MA);
for i = MOrder + 1:3
MPara(i) = 0;
end
end
Constant = garchget(Coeff,C);
Parameters = [Constant,RPara,MPara];
Converge = 0;
if Summary.converge = Function Converged to a Solution
Converge = 1;
end
ISCondition = 0;
if Summary.warning = No Warnings
ISCondition = 1;
end
if ROrder = 0
for i = 1:3
RParaSEE(i) = 0;
end
else
RParaSEE = Errors.AR;
for i = ROrder + 1:3
RParaSEE(i) = 0;
end
end
if MOrder = 0
for i = 1:3
MParaSEE(i) = 0;
end
else
MParaSEE = Errors.MA;
for i = MOrder + 1:3
ParaSEE(i) = 0;
end
end
ConstantSEE = Errors.C;
SEE = [ConstantSEE,RParaSEE,MParaSEE];
556
Forecasting
Function[MeanForecast,MeanRMSE] = forecasting(ROrder,MOrder,
Constant,AR,MA,Series,NumPeriods)
% This function forecast predetermined periods of future outcomes
according
% to tted ARIMA model
% Input:
%
ROrder: order of tted AR model
%
MOrder: order of tted MA model
%
Constant: tted constant
%
AR: tted AR model parameters
%
MA: tted MA model parameters
%
Series: input series
%
NumPeriods: input periods of future time
% Output:
%
MeanForecast: mean forecast
%
MeanRMSE: mean RMSE
if ROrder = 0
AR = [];
end
if MOrder = 0
MA = [];
end
Spec = garchset(R,ROrder,M,MOrder,C,Constant,AR,AR,MA,
MA,K,0.001,P,0,Q,0,Display,Off);
[SigmaForecast, MeanForecast, SigmaTotal, MeanRMSE] = garchpred
(Spec, Series, NumPeriods);
Inverse normal differencing
function [NewSeries] = invdifferencing(SeriesBeforeDiff,ForecastSeries,Order)
% This function transform the differenced series back to raw series
% Input:
%
SeriesBeforeDiff: series before normal differencing processing
%
ForecastSeries: forecasts series
%
Order: order of normal differencing
% Output:
%
NewSeries: Un-differenced series
[r1,c1] = size(SeriesBeforeDiff);
[r3,c3] = size(ForecastSeries);
NewSeries = SeriesBeforeDiff;
switch Order
case 0
NewSeries = [NewSeries;ForecastSeries];
case 1
for i = 1:r3
NewSeries(i + r1,c1) = ForecastSeries(i,c3) + NewSeries(i +
r1 1,c1);
end
case 2
for i = 1:r3
NewSeries(i+r1,c1)=ForecastSeries(i,c3)+2NewSeries(i+
r11,c1)NewSeries(i+r12,c1);
end
case 3
for i = 1:r3
NewSeries(i+r1,c1)=ForecastSeries(i,c3)+3NewSeries(i+
r11,c1)3NewSeries(i+r12,c1)+NewSeries(i+r13,c1);
end
otherwise
[Coeff,Errors,LLF,eFit,sFit] = garcht(spec,Series);
[eSim,sSim,ySim] = garchsim(Coeff,Horizon,nPaths,[],[],[],
eFit,sFit,Series).
Appendix B. BoxJenkins forecasting SPS template user manual
Overview
The BoxJenkins forecasting SPS template is a tool for automating
BoxJenkins forecasting modelling process and providing Box
Jenkins-based Monte Carlo simulation input modelling. The template,
in one aspect, allows average users to employ BoxJenkins forecasting
method in an easy and timesaving approach, without damaging the
powerful features of BoxJenkins methodologies in providing accurate
short and medium range forecasts for complex situations. In another
aspect, it provides a convenient tool to perform time series data tting
and parameters estimation, which are essential to BoxJenkins-based
Monte Carlo simulation.
This template was developed in Simphony with SPS template
development language. The structure of this template follows the
basic format of a traditional SPS template, that is, one father element
on the top and several child elements underneath. Basically, the father
element executes the function of collecting inputs and storing global
variables, while the child elements collectively perform the functionalities, which BoxJenkins forecasting modelling requires.
The template provides the user with one father element and four
child elements: (1) UBJ Forecasting Root, which is the highest element
in the hierarchy of elements, (2) Data Processing, which represent the
data processing functionality, (3) Parameter Estimating, which
represent the parameter estimation functionality, (4) Diagnostic
Check, which represent the diagnostic check functionality, and (5)
Forecasting, which represent the calculation of forecasts functionality.
Procedures of creating a BoxJenkins forecasting model
To create a new BoxJenkins forecasting model using the developed
SPS template, the following steps can be followed:
1. Create a UBJ Forecasting Root;
2. Dene the input parameters of the UBJ Forecasting Root, that is, input
the historical data series and specify the forecasting parameters;
3. Create a New Entity element from Common template in child
window as the source of new entity through the whole simulation
model;
4. Create and link all four child elements after the New Entity element
by the sequence of: Data Processing element Parameters Estimating elementDiagnostic Check element Forecasting element;
5. Dene the input parameters for Data Processing element or leave
them as default;
6. Link all the ve elements in child window by collecting the
connecting points from left to right;
7. Check the proper connection among elements and start the simulation;
8. Check the pop-up window for initial processing information or
processing alert and take proper actions to rene the inputs for
Data Processing element; and
9. Repeat Step 8 until successful completion information is displayed.
The following sections describe the details of using each element in
the template.
BoxJenkins forecasting element
UBJ Forecasting Root
This element allows the initialization of BoxJenkins forecasting
modelling. Users can input the original time series data, specify the
desired periods of forecasting, and choose the method of data updating.
557
558
Outputs:
Estimated parameters table: calculated model parameters and
model specication.
Parameters estimation errors table: calculated model parameters
estimation errors.
Statistics:
None.
Diagnostic check
This element performs the completed diagnostic checks. An array
table displays the checks results, in which zero value represents
Failure and one represents Pass.
All the good models from tentative models will be identied but
the best ARIMA model will be chosen for real forecasting.
Inputs:
None.
Outputs:
Diagnostic check results: Array table shows the check results. Note:
0 Failure, 1 Pass.
Good ARIMA models: Array table shows the ARIMA models who
pass all the check items.
Best ARIMA models: best ARIMA model with the least RMSE
compared with other good models.
Statistics:
None.
Forecasting
This element uses the best model identied by Diagnostic Check
element to perform forecasting operation. Period of forecasting will be
read from parent element and forecasting values will be displayed as