
90 DAYS IN BANKBILL
Introduction
As a result of development and progress in the State of Qatar, and because of the extensive public involvement in shares and stock markets, we decided to do this project in this field.
Since the State of Qatar has relatively little stock market experience compared with well-established markets, we decided to seek data from a country with long experience in this area.
After a thorough search we obtained the data for this project from the Australian stock market, which dates back to the nineteenth century and has its own rules and regulations.
First we define what a share is, then we give an overview of the Australian stock market.
What is a share?
A share is simply part ownership of a business.
Owning shares means you participate in the company's
performance in the form of profits which can be given to
you as dividends and/or capital growth through the
value of your shares increasing. When you buy shares in
a company, you are buying a part of that company. This
means you share in the company's performance in the
form of profits.
Another question could be: why invest in shares?
• They have the potential to outperform other investments over the long term.
• They are easy to buy or sell.
• They can help you to diversify your investments.
The Australian Stock Exchange Limited (ASX) was formed in 1987 by legislation of the Australian Parliament, which enabled the amalgamation of six independent stock exchanges that formerly operated in the state capital cities. Each of those exchanges had a history of share trading dating back to the nineteenth century.
Objectives :
The main objectives of this study are as follows:
• To understand the effect (if any) of the independent variables (x1, x2, ..., x15) on the share price (y).
• To determine the relation between the independent variables and the dependent variable.
• To study the development of the share price and its changes through time.
• To compare the predictions and forecasts of two different approaches (multiple linear regression and time series) for the Australian share price data.
• To learn from this project how to practice teamwork, how to write reports, and how to apply the theoretical methods and techniques of undergraduate statistics courses to real-life data.
Data
In this project we introduce the topics of multiple linear regression and time series by applying the methods and techniques of the undergraduate courses we studied to a real-life situation.
The data set for this project is about 90 Day Bank Bills. It consists of monthly observations on the share price and on various financial variables that affect it. Observations were recorded for the period from October 1991 through August 1997, giving a total of seventy-one observations (N = 71).
The data set was obtained from the internet at the following web site address:
http://www.statsci.org/data/oz/bankbill.html
Variable Description :
There is one response variable which is the Bank Share Price
Index (Y) and fifteen explanatory variables defined below:
All Ords (X1) All Ordinaries Share Price Index
Develop (X2) Development Rate
Mining (X3) Mining Price
Gold (X4) Gold Price
Build (X5) Building Loans
Prop (X6) Propaganda (commercials rate)
Indust (X7) Industrial Growth Rate
Energy (X8) Amount Of Energy Produced
Finance (X9) Financing Rate
Resource (X10) Number Of All Available Resources
Transport (X11) Transport
Retail (X12) Retail Price Rate
Unemploy (X13) Unemployment Rate
CPI (X14) Consumer Price Index
BankBill (X15) 90 Day Bank Bill Interest Rate
Source of the Data:
These data were obtained from the web site of the Department of Mathematics, University of Queensland, which originally obtained them from Australian Market Quote, AAP Financial Markets (AAP: Australian Associated Press), for the period from October 1991 through August 1997.
Regression Analysis:
We started the analysis with the stepwise regression technique. The stepwise regression results show that the share price is mainly affected by the variables listed in the table (Variables Entered/Removed).
Only six variables entered the final model (Indust, Energy, BankBill, Prop, Mining and Unemploy).
A complete regression analysis was done using SPSS and MINITAB, and the results are shown below.
Therefore, based on this stepwise regression, we are going to consider:
Y = β0 + β1x1 + β2x2 + β3x3 + β4x4 + β5x5 + β6x6 + ε
Where
x1 is Indust (Industrial)
x2 is Energy
x3 is BankBill
x4 is Prop
x5 is Mining
x6 is Unemploy

Variables Entered/Removed (b)
Model  Variables Entered                                     Variables Removed  Method
1      Unemploy, Prop, Mining, Energy, BankBill, Indust (a)  .                  Enter
a. All requested variables entered.
b. Dependent Variable: SharePrice
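The model above is fitted by least squares. The following is a minimal sketch of how such a fit can be computed by solving the normal equations; it is not the SPSS or MINITAB routine used in the project. The function name `fit_ols` and the small data set are hypothetical and serve only as an illustration.

```python
def fit_ols(rows, y):
    """Least-squares fit of y = b0 + b1*x1 + ... + bk*xk.

    rows: list of predictor tuples, y: list of responses.
    Solves the normal equations (X'X) b = X'y by Gauss-Jordan elimination.
    """
    X = [[1.0] + [float(v) for v in r] for r in rows]  # add intercept column
    p = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(xi[i] * xi[j] for xi in X) for j in range(p)]
         + [sum(xi[i] * yi for xi, yi in zip(X, y))]
         for i in range(p)]
    for col in range(p):
        # Partial pivoting keeps the elimination numerically stable.
        pivot = max(range(col, p), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(p):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][p] / A[i][i] for i in range(p)]


# Hypothetical data generated from y = 2 + 3*x1 - 1*x2 (not the project data).
demo_rows = [(1, 2), (2, 1), (3, 5), (4, 3), (5, 8)]
demo_y = [2 + 3 * x1 - x2 for x1, x2 in demo_rows]
coefficients = fit_ols(demo_rows, demo_y)
print([round(b, 6) for b in coefficients])
```

With the real data, each row would hold the six predictor values of one month and `y` the share price index.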
Matrix Plot :
[Figure: Matrix plot of share price, Indust, Energy, Prop, Mining, Unemploy and BankBill]
[Figure: Scatterplots of share price versus Indust, Energy, Prop, Mining, Unemploy and BankBill]
We can see that there is no linear relationship between the Share Price variable and two predictors, namely Mining and Unemploy. Based on this conclusion we have to transform them.
To achieve linearity we used the following transformations:
Mining: ln(Mining), Unemploy: 1/(Unemploy)^2.
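The two transformations can be sketched in a few lines. The helper name `transform_predictors` is hypothetical, and the input values are simply the smallest and largest Mining and Unemploy values reported in the Descriptive Statistics table, not actual paired observations.

```python
import math


def transform_predictors(mining, unemploy):
    """Return (ln(Mining), 1/Unemploy^2), the two transformed predictors."""
    return math.log(mining), 1.0 / unemploy ** 2


# Illustrative inputs: minimum and maximum values from the descriptive statistics.
for mining, unemploy in [(571.5, 8.9), (1098.2, 12.0)]:
    ln_mining, inv_sq_unemploy = transform_predictors(mining, unemploy)
    print(f"Mining {mining:7.1f} -> {ln_mining:.4f}   "
          f"Unemploy {unemploy:4.1f} -> {inv_sq_unemploy:.5f}")
```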
Descriptive Statistics

Variable   N   Range    Minimum  Maximum  Sum       Mean      Std. Error  Std. Deviation  Variance
                                                              (of Mean)
Bank       71  3481.2   1711.4   5192.6   206686.3  2911.075  95.6516     805.9749        649595.6
Mining     71  526.7    571.5    1098.2   62268.5   877.021   17.7819     149.8326        22449.805
Energy     71  1324.9   670.0    1994.9   72186.2   1016.707  40.0998     337.8873        114167.8
Prop       71  301.4    953.8    1255.2   76864.2   1082.594  8.3633      70.4703         4966.058
Unemploy   71  3.1      8.9      12.0     731.5     10.303    .1515       1.2763          1.629
BankBill   71  4.0      4.8      8.7      455.7     6.418     .1378       1.1612          1.348
Indust     71  2156     2237     4393     218091    3071.71   60.227      507.480         257536.3
Valid N (listwise) 71
First model:
The regression equation is
share price = - 1652 + 1.12 Indust + 0.550 Energy + 1.33 Prop - 0.680 Mining - 60.5 Unemploy + 52.4 BankBill
Table of coefficients:
Predictor  Coef      SE Coef   T      P
Constant   -1652.2   523.7     -3.15  0.002
Indust     1.12333   0.08907   12.61  0.000
Energy     0.5496    0.1119    4.91   0.000
Prop       1.3280    0.2790    4.76   0.000
Mining     -0.6804   0.1270    -5.36  0.000
Unemploy   -60.51    22.57     -2.68  0.009
BankBill   52.42     17.26     3.04   0.003
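Each T value in the coefficient table is the coefficient divided by its standard error, and a coefficient is called significant when its P-value is below α = 0.05. The sketch below recomputes the T column from the Coef and SE Coef columns; the dictionary layout and names are our own, chosen for illustration.

```python
ALPHA = 0.05

# (Coef, SE Coef, P) copied from the first model's coefficient table.
first_model = {
    "Constant": (-1652.2, 523.7, 0.002),
    "Indust": (1.12333, 0.08907, 0.000),
    "Energy": (0.5496, 0.1119, 0.000),
    "Prop": (1.3280, 0.2790, 0.000),
    "Mining": (-0.6804, 0.1270, 0.000),
    "Unemploy": (-60.51, 22.57, 0.009),
    "BankBill": (52.42, 17.26, 0.003),
}


def t_statistic(coef, se_coef):
    """T statistic for testing that a single coefficient equals zero."""
    return coef / se_coef


t_values = {name: t_statistic(c, se) for name, (c, se, _) in first_model.items()}
significant = {name: p < ALPHA for name, (_, _, p) in first_model.items()}

for name in first_model:
    print(f"{name:9s} T = {t_values[name]:7.2f}  significant: {significant[name]}")
```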
P-Values Test :
The P-values for β0, β1, β2, β3, β4, β5, β6 are all < α = 0.05, which means that all the coefficients are significant.
Analysis of Variance
Source          DF  SS        MS       F        P
Regression      6   45197257  7532876  1756.72  0.000
Residual Error  64  274433    4288
Total           70  45471690

H0: β1 = β2 = β3 = β4 = β5 = β6 = 0 vs. H1: βi ≠ 0 for at least one i; i = 1, 2, ..., 6.

Since the P-value = 0.000 < α = 0.05, we reject H0 and conclude H1.
This means that there is a linear relationship between the share price index and the entered financial variables.
Model  R      R Square  Adjusted R Square  Std. Error of the Estimate
1      0.997  0.994     0.993              65.48300

99.4% of the variation in the share price index (Y) is explained by the six predictors (the entered financial variables, x's).
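As an arithmetic check, the F statistic, R Square, adjusted R Square and standard error of the estimate can be recomputed from the sums of squares in the first model's ANOVA table. This is only a sketch of the standard formulas; the variable names are ours, not SPSS or MINITAB output names.

```python
import math

# Sums of squares and degrees of freedom from the first model's ANOVA table.
ss_regression, df_regression = 45197257, 6
ss_residual, df_residual = 274433, 64
ss_total, df_total = 45471690, 70

ms_regression = ss_regression / df_regression
ms_residual = ss_residual / df_residual

f_statistic = ms_regression / ms_residual                 # overall F test
r_squared = ss_regression / ss_total                      # share of variation explained
adj_r_squared = 1 - ms_residual / (ss_total / df_total)   # penalised for 6 predictors
std_error_estimate = math.sqrt(ms_residual)               # residual standard deviation

print(f"F = {f_statistic:.2f}")
print(f"R Square = {r_squared:.3f}, Adjusted R Square = {adj_r_squared:.3f}")
print(f"Std. Error of the Estimate = {std_error_estimate:.3f}")
```

The values agree with the reported tables up to rounding of the mean squares.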
Residual Plots for share price
[Figure: Four-panel residual plot for share price]
Normal Probability Plot of the Residuals: the residuals seem to follow a normal distribution.
Residuals Versus the Fitted Values: the variance of the errors seems to be constant.
Histogram of the Residuals: the distribution seems to be symmetric.
Residuals Versus the Order of the Data: the residuals seem to be independent.
Correlations

Pearson Correlation
            SharePrice  Indust  Energy  BankBill  Prop    Mining
SharePrice  1.000       .984    .967    -.093     .737    .559
Indust      .984        1.000   .937    -.181     .788    .641
Energy      .967        .937    1.000   -.101     .636    .445
BankBill    -.093       -.181   -.101   1.000     -.548   .021
Prop        .737        .788    .636    -.548     1.000   .537
Mining      .559        .641    .445    .021      .537    1.000
Unemploy    -.704       -.667   -.664   -.507     -.274   -.680

Sig. (1-tailed)
            SharePrice  Indust  Energy  BankBill  Prop    Mining
SharePrice  .           .000    .000    .219      .000    .000
Indust      .000        .       .000    .066      .000    .000
Energy      .000        .000    .       .200      .000    .000
BankBill    .219        .066    .200    .         .000    .431
Prop        .000        .000    .000    .000      .       .000
Mining      .000        .000    .000    .431      .000    .
Unemploy    .000        .000    .000    .000      .010    .000

N = 71 for every pair of variables.

As we can see from the correlation matrix, all of the variables are significantly correlated with SharePrice except BankBill. We kept BankBill in the model because it was entered by the stepwise method, and both R Square and adjusted R Square are high when this variable is included in the model.
Sequence Plot:
[Figure: Sequence plot of the standardized residuals (from -2 to 2) against sequence number (1 to 71)]
We can say that the residuals are randomly scattered, which implies that there is no departure from independence.
The Second Model:
We apply the transformations to Unemploy and Mining.
The regression equation is
share price = - 94 + 1.07 Indust + 0.634 Energy + 1.51 Prop - 477 ln mining + 22455 1/(Unemploy)^2 + 67.3 BankBill
Predictor       Coef     SE Coef  T      P
Constant        -94.1    655.1    -0.14  0.886
Indust          1.07391  0.09076  11.83  0.000
Energy          0.6337   0.1104   5.74   0.000
Prop            1.5109   0.2879   5.25   0.000
ln mining       -477.2   102.6    -4.65  0.000
1/(Unemploy)^2  22455    11647    1.93   0.058
BankBill        67.27    16.67    4.03   0.000

S = 68.3404 R-Sq = 99.3% R-Sq(adj) = 99.3%

As we can see from this table, all the predictors are significant except 1/(Unemploy)^2 (P-value = 0.058).
The Best Model :
In this model we removed the transformed Unemploy term because it was not significant in the model.
The regression equation is
share price = - 964 + 1.06 Indust + 0.726 Energy + 1.53 Prop - 352 ln mining + 92.6 BankBill
Predictor  Coef     SE Coef  T      P
Constant   -964.5   484.5    -1.99  0.051
Indust     1.06385  0.09249  11.50  0.000
Energy     0.7261   0.1015   7.15   0.000
Prop       1.5286   0.2937   5.20   0.000
ln mining  -351.98  81.11    -4.34  0.000
BankBill   92.60    10.48    8.83   0.000
S = 69.7541 R-Sq = 99.3% R-Sq(adj) = 99.3%
As we can see from this table, the P-values of all the predictors are significant.
99.3% of the variation in the share price index (Y) is explained by the five predictors (the entered financial variables, x's).
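To show how the fitted equation of the best model is used, the sketch below evaluates it for one set of predictor values. The function name `predict_share_price` is hypothetical, and the inputs are the sample means from the Descriptive Statistics table, used here only as plausible illustrative values.

```python
import math


def predict_share_price(indust, energy, prop, mining, bankbill):
    """Evaluate the best model's regression equation (coefficients from its table)."""
    return (-964.5
            + 1.06385 * indust
            + 0.7261 * energy
            + 1.5286 * prop
            - 351.98 * math.log(mining)   # Mining enters as ln(Mining)
            + 92.60 * bankbill)


# Sample means of the predictors, taken from the descriptive statistics.
estimate = predict_share_price(indust=3071.71, energy=1016.707,
                               prop=1082.594, mining=877.021, bankbill=6.418)
print(f"Predicted share price at the predictor means: {estimate:.1f}")
```

The result lies close to, but not exactly at, the mean share price of 2911.075, because the logarithm of the mean of Mining is not the mean of ln(Mining).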
Analysis of Variance
Source          DF  SS        MS       F        P
Regression      5   45155424  9031085  1856.10  0.000
Residual Error  65  316266    4866
Total           70  45471690
Since the P-value = 0.000 < α = 0.05, there is a linear relationship between the share price index and the entered financial variables.
Correlations: share price; Indust; Energy; Prop; ln mining; BankBill

(Each cell shows the Pearson correlation with its P-value on the line below.)

           share price  Indust  Energy  Prop    ln mining
Indust     0.984
           0.000
Energy     0.967        0.937
           0.000        0.000
Prop       0.737        0.788   0.636
           0.000        0.000   0.000
ln mining  0.582        0.659   0.465   0.566
           0.000        0.000   0.000   0.000
BankBill   -0.093       -0.181  -0.101  -0.548  0.020
           0.438        0.131   0.401   0.000   0.867

All correlations are significant except those of BankBill with share price, Indust, Energy and ln mining.


Time Series Analysis:
The second part of the data analysis for this project is a time series analysis. To get an idea of the data, we start by graphing the time series plot.
[Figure: Time series plot of Share price(y) against Index (1 to 71)]
It can be seen that this time series has an exponential trend, i.e. a nonlinear increasing trend, a very weak seasonal effect, and no outliers.
We can see that during the starting period of this plot (the first 12 months) there is a decreasing nonlinear trend, which differs from the following periods of the graph.
We tried to find the most suitable trend.

(1) Linear Regression Analysis:


Share price(y) versus t :

The regression equation is


Share price(y) = 2775 + 21.2 t

Accuracy Measures :
MAD = 624.136
P-Value = 0.453
(2)Quadratic Regression Analysis:

Share price(y) versus t ; t^2 :

The regression equation is


Share price(y) = 2613 + 91 t - 5.42 t^2

Accuracy Measures:
MAD = 626.553
P-Value = 0.638
(3) Dummy Variables Regression Analysis:
Share price(y) versus t; S1; ... :

The regression equation is


Share price(y) = 5694- 327 t + 303 S1 + 673 S2 +
1073 S3 - 2444 S4 - 2117 S5 -1849 s6 - 1424 S7
1061 S8 - 727 S9-309 S10

Accuracy Measures :
MAD = 627.013
P-Value = 1.000
(4)Simple Trigonometric Regression Analysis:
Share price(y) versus t; sin2t; cos2t :

The regression equation is


Shareprice(y)=2852 + 9.1 t - 84 sin2t - 41 cos2t

Accuracy Measures
MAD = 627.979
P-Value = 0.840
(5)Complex Trigonometric Regression Analysis:
Share price(y) versus t; sin2t; cos2t; sin4t; cos4t :

The regression equation is


Share price(y) = 2949 - 6.1 t - 140 sin2t - 29 cos2t - 92
sin4t - 39 cos4t

Accuracy Measures:

MAD = 627.449
P-Value = 0.939
(6) Growth-curve Trend Analysis model:

Fitted Trend Equation

Yt = 1845.10 * (1.01175^t)

Accuracy Measures
MAPE 9
MAD 263
MSD 101132
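The three accuracy measures reported for each model can be written out explicitly. The sketch below assumes the usual definitions (mean absolute percentage error, mean absolute deviation, mean squared deviation) and also evaluates the fitted growth curve; the function names and the tiny example series are hypothetical, not project data.

```python
def growth_curve(t, level=1845.10, rate=1.01175):
    """Fitted growth-curve trend Yt = level * rate**t."""
    return level * rate ** t


def mape(actual, fitted):
    """Mean absolute percentage error, in percent."""
    return 100 * sum(abs((a - f) / a) for a, f in zip(actual, fitted)) / len(actual)


def mad(actual, fitted):
    """Mean absolute deviation."""
    return sum(abs(a - f) for a, f in zip(actual, fitted)) / len(actual)


def msd(actual, fitted):
    """Mean squared deviation."""
    return sum((a - f) ** 2 for a, f in zip(actual, fitted)) / len(actual)


# Hypothetical two-point example to show the arithmetic.
example_actual = [100.0, 200.0]
example_fitted = [110.0, 180.0]
print(mape(example_actual, example_fitted),
      mad(example_actual, example_fitted),
      msd(example_actual, example_fitted))

# Trend value implied by the fitted curve for the last month (t = 71).
print(round(growth_curve(71), 1))
```

MAD is in the units of the share price, MSD in squared units, and MAPE is unit-free, which is why the three rankings of models need not always agree.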
(7) Classical Decomposition Models:
                       Additive Time Series       Multiplicative Time Series
                       Decomposition              Decomposition
                       for Share price(y)         for Share price(y)
Fitted Trend Equation  Yt = 1652.06 + 35.0043*t   Yt = 1653.36 + 34.9404*t
Accuracy Measures
MAPE                   11                         11
MAD                    301                        301
MSD                    129828                     130670
(8)Single Exponential Smoothing for
Share price(y) :

Smoothing Constant
Alpha 1.21347

Accuracy Measures
MAPE 4.0
MAD 117.7
MSD 19537.9
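Single exponential smoothing updates one level with the recursion level = alpha * y + (1 - alpha) * level and uses the previous level as the one-step-ahead fit. The sketch below is a minimal version that assumes the first observation as the starting level; MINITAB may initialise the level differently, so its fitted values need not match exactly. The function name and example series are hypothetical.

```python
def single_exponential_smoothing(series, alpha, initial=None):
    """Return (one-step-ahead fits, final level).

    fits[t] is the forecast of series[t] made before observing it.
    The starting level defaults to the first observation (an assumption).
    """
    level = series[0] if initial is None else initial
    fits = []
    for y in series:
        fits.append(level)                       # forecast made before seeing y
        level = alpha * y + (1 - alpha) * level  # update after seeing y
    return fits, level


# Hypothetical short series to show the recursion.
fits, final_level = single_exponential_smoothing([10.0, 20.0, 30.0], alpha=0.5)
print(fits, final_level)
```

The final level is the forecast for every future period, which is why this method gives a constant forecast. The recursion also runs for a smoothing constant above 1, such as the reported Alpha of 1.21347.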
(9)Double Exponential Smoothing for
Share price(y) :

Smoothing Constants
Alpha (level) 1.10354
Gamma (trend) 0.04382

Accuracy Measures
MAPE 4.1
MAD 117.8
MSD 19662.9
(10) Winters' Method for Share price(y) :
                    Additive Method  Multiplicative Method
Smoothing Constants
Alpha (level)       0.2              0.2
Gamma (trend)       0.2              0.2
Delta (seasonal)    0.2              0.2
Accuracy Measures
MAPE                7.8              8.2
MAD                 222.6            234.8
MSD                 75268.9          84390.3
ARIMA Model:
An ARIMA (Autoregressive Integrated Moving Average) model is implemented for further analysis.
Based on the time series plot, we can conclude that there is a nonlinear increasing trend and a very weak seasonal effect.
[Figure: Time series plot of Share price(y) against Index (1 to 71)]
So we should modify the share price index before fitting the ARIMA model, by taking the first difference of (y) to remove the trend.
[Figure: Time series plot of 1st diff(y) against Index (1 to 71)]
As we can see, the trend has been removed.
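Differencing replaces each value by its change from an earlier value. A minimal sketch, with a hypothetical helper `difference`, is given below; it also shows why one regular difference followed by one seasonal difference of order 12 leaves 58 of the 71 observations.

```python
def difference(series, lag=1):
    """Return the lag-differenced series: y[t] - y[t - lag]."""
    return [series[i] - series[i - lag] for i in range(lag, len(series))]


# Hypothetical quadratic series: the first difference removes most of the trend.
print(difference([1, 4, 9, 16]))

# Counting observations only: 71 values, one regular and one seasonal difference.
placeholder_series = list(range(71))
after_regular = difference(placeholder_series, lag=1)
after_seasonal = difference(after_regular, lag=12)
print(len(after_regular), len(after_seasonal))
```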
After that we plot the ACF and PACF of diff(y), because they help us to identify suitable orders (p, d, q) and (P, D, Q).
[Figure: Autocorrelation Function for 1st diff(y), lags 1 to 15, with 5% significance limits for the autocorrelations]
[Figure: Partial Autocorrelation Function for 1st diff(y), lags 1 to 15, with 5% significance limits for the partial autocorrelations]
From these plots we chose the orders (p = 2, d = 1 and q = 3). We fixed them together with (D = 1), because we have seasonality, and tried several values of (P and Q); the best ARIMA model was chosen by the lowest MS and SS and the largest number of significant P-values.
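The autocorrelations read from an ACF plot can be computed directly. The sketch below uses the standard sample autocorrelation and the simple large-sample limit ±1.96/√n; this limit is an approximation, and the limits drawn by MINITAB may differ because they can widen with the lag. The function names and example series are hypothetical.

```python
import math


def acf(series, max_lag):
    """Sample autocorrelations r_1, ..., r_max_lag of a series."""
    n = len(series)
    mean = sum(series) / n
    dev = [x - mean for x in series]
    denom = sum(d * d for d in dev)
    return [sum(dev[t] * dev[t - k] for t in range(k, n)) / denom
            for k in range(1, max_lag + 1)]


def approx_significance_limit(n):
    """Approximate 5% limit for a single autocorrelation of white noise."""
    return 1.96 / math.sqrt(n)


# Hypothetical short series to show the arithmetic.
example_acf = acf([1, 2, 3, 4, 5], max_lag=2)
print([round(r, 3) for r in example_acf])

# The differenced share price series has 70 values.
print(round(approx_significance_limit(70), 3))
```

Lags whose autocorrelation falls outside the limit are the ones that suggest AR or MA terms.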
ARIMA Model: Share price(y):
In the ARIMA model we use all six orders, (p, d, q) with (P, D, Q), because our time series has a trend and a weak seasonality.
The Best model:
We tried (p = 2, d = 1, q = 3 and P = 3, D = 1, Q = 1).
Final Estimates of Parameters
Type Coef SE Coef T P
AR 1 1.4895 0.0341 43.67 0.000
AR 2 -0.9941 0.0387 -25.71 0.000
SAR 12 -1.5709 0.2703 -5.81 0.000
SAR 24 -1.5381 0.3048 -5.05 0.000
SAR 36 -0.7330 0.2682 -2.73 0.009
MA 1 1.6722 0.1477 11.32 0.000
MA 2 -1.2563 0.1617 -7.77 0.000
MA 3 0.2527 0.1337 1.89 0.065
SMA 12 0.3450 0.4862 0.71 0.481
Constant 26.143 3.673 7.12 0.000
Differencing: 1 regular, 1 seasonal of order 12
Number of observations: Original series 71, after differencing 58
Residuals: SS = 733968 (backforecasts excluded)
MS = 15291 DF = 48
Modified Box-Pierce (Ljung-Box) Chi-Square statistic
Lag 12 24 36 48
Chi-Square 25.0 35.4 39.1 43.2
DF 2 14 26 38
P-Value 0.000 0.001 0.048 0.259
From this output we can see:
SS = 733968
MS = 15291
DF = 48
and two coefficients are not significant:
MA 3 (P-value = 0.065 > α = 0.05)
SMA 12 (P-value = 0.481 > α = 0.05)
After choosing the best ARIMA model, we plot the ACF and PACF of the residuals for Share price(y) to check whether the residuals are random (independent) or not.
[Figure: ACF of Residuals for Share price(y), lags 1 to 15, with 5% significance limits for the autocorrelations]
As we can see, all the lags are inside the band, which means that the residuals are a white noise (WN) process, an indication of a random process.
[Figure: PACF of Residuals for Share price(y), lags 1 to 15, with 5% significance limits for the partial autocorrelations]
We can see that the PACF confirms the result of the ACF plot (the residuals are independent).
Residual Plots for Share price(y)
[Figure: Four-panel residual plot for Share price(y)]
Normal Probability Plot of the Residuals: normality of errors.
Residuals Versus the Fitted Values: constant variance of errors.
Histogram of the Residuals: the distribution seems to be symmetric.
Residuals Versus the Order of the Data: random errors.
Percent of Errors :

As we can see from this table, the predicted values of the multiple regression analysis are more accurate than the time series predicted values, since they have the lower errors.

Observation  Time series  T.S. percent  Regression  Reg. percent  ARIMA    ARIMA percent
             predict      errors        predict     errors        predict  errors
4349.3       4174.41      4.02%         4337.76     0.27%         4392.73  -1.00%
4579.3       4174.41      8.84%         4591.36     -0.26%        4647.81  -1.50%
4919.3       4174.41      15.14%        4968.32     -0.99%        4701.46  4.43%
5192.6       4174.41      19.61%        5166.52     0.50%         4975.42  4.18%
4857.8       4174.41      14.07%        4808.45     1.02%         5031.14  -3.57%

[Figure: Line chart of the observations and of the single T.S., regression and ARIMA predictions for the five forecast periods]
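The comparison of the three approaches rests on the percent error 100 * (observation - prediction) / observation. The sketch below recomputes it from the observed and predicted values of the five forecast periods; the function and variable names are our own.

```python
observed = [4349.3, 4579.3, 4919.3, 5192.6, 4857.8]
predictions = {
    "time series": [4174.41] * 5,
    "regression": [4337.76, 4591.36, 4968.32, 5166.52, 4808.45],
    "ARIMA": [4392.73, 4647.81, 4701.46, 4975.42, 5031.14],
}


def percent_errors(obs, pred):
    """Signed percent error of each prediction relative to the observation."""
    return [100 * (o - p) / o for o, p in zip(obs, pred)]


def mean_abs_percent_error(obs, pred):
    """Average size of the percent errors, ignoring their sign."""
    errors = percent_errors(obs, pred)
    return sum(abs(e) for e in errors) / len(errors)


for method, pred in predictions.items():
    formatted = ", ".join(f"{e:.2f}%" for e in percent_errors(observed, pred))
    print(f"{method:12s} {formatted}  "
          f"(mean absolute: {mean_abs_percent_error(observed, pred):.2f}%)")
```

Averaging the absolute percent errors gives a single number per method, which makes the ranking of the three approaches easy to read off.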
Summary and Conclusion

The analysis in this project was based on two main approaches to data analysis: multiple linear regression and time series analysis.
Multiple linear regression:
According to the stepwise regression method, only six variables entered the model as significant, namely Unemploy, Prop, Mining, Energy, BankBill and Indust.
After transforming Mining and removing Unemploy, the regression equation of the best model is
share price = - 964 + 1.06 Indust + 0.726 Energy + 1.53 Prop - 352 ln mining + 92.6 BankBill
With R2(adj) = 0.993 approximately equal to R2 = 0.993, we can say this is an indication of a "good model".
Based on the results in the ANOVA table, the linear regression relationship is tested using the following hypotheses:

H0: β1 = β2 = β3 = β4 = β5 = 0
H1: βi ≠ 0 for at least one i; i = 1, 2, ..., 5.

Since the P-value = 0.000, we reject H0.
So the model is significant, which means that there is a linear relationship between the Share Price Index and the entered financial variables.
Time Series Analysis:
The second approach was time series analysis, in which we tried to find the most suitable trend using MINITAB and chose the best-fitting model by the lowest MAD. On that basis the best model was Single Exponential Smoothing for Share price(y).
ARIMA:
As we saw, the ARIMA model was the second most accurate approach, after multiple linear regression, for fitting the time series.
Finally, we also compared the predicted values of the two approaches, and it turned out that the multiple regression model predicted more accurate values than the time series model.
Limitations and Recommendations:
We believe there are two limitations in this project: the data set is relatively small, and it does not come from our own region. We recommend that future research on share prices avoid these limitations.
References

1. www.statsci.org/data/oz/bankbill.html
2. www.statsci.org/data/multiple.html
3. www.asx.com
4. www.statsoft.com
5. www.itl.nist.gov/div898/handbook/pmc/section4/pmc4.htm
6. Neter, Kutner, Nachtsheim. Applied Linear Regression Models. Fourth Edition (2004).
7. Bowerman, O'Connell, Koehler. Forecasting, Time Series, and Regression. Fourth Edition.
To contact us you can visit our web site:
Project2006.jeeran.com
THANK YOU FOR GIVING US YOUR ATTENTION. WE WILL BE GLAD TO ANSWER ANY QUESTIONS.
