
Spatial Interpolation of Population Change (%) Across Canada

Geofficient Strategies Analysis and Formal Report 21 March 2014



1.0 Introduction
The population of Canada has been steadily increasing since Confederation in 1867. In 2013,
the population reached the 35 million mark [1]. With population trends come questions of
why and where. The goal of this study is to identify areas of population change and to explore
possible reasons why the population is increasing or decreasing in certain areas. A geostatistical
analysis will be undertaken for all of Canada using 147 cities and their population change from
2006 to 2011. Based on the cities with available data, the population change of surrounding
areas will be predicted to determine nation-wide trends.

The essential part of this study involves performing spatial interpolation using two methods:
Kriging and Inverse Distance Weighting (IDW). Spatial interpolation is the process of predicting
the value of a variable at locations where it has not been measured. The 147 cities with known
population change values will be used to predict values for the parts of Canada where population
change is unknown. The unknown values are predicted based on parameters specific to
each method and are primarily influenced by the surrounding known values. These
methods are explored further throughout the report.
1.1 Study Area
Canada is the study area, but it is bounded by the extent of populated cities. The northern
boundary for this study lies at 63N latitude, where Yellowknife, Northwest Territories is located,
while the southern boundary is at 42N. Since there is not a sufficient population north of 63N,
this area has been omitted. All cities fall in the range of 136W to 52W longitude (Figure 1).


[1] Canadian Population Surpasses 35 Million, CBC News, September 26 2013


Figure 1: Study Area and Cities

1.2 Data Analysis
Population statistics were obtained from Statistics Canada [2] by selecting the top 147
municipalities with populations over 5,000. The percentage population change from 2006 to 2011
is the main variable of the study, with positive values indicating an increase and negative values
indicating a decrease. Locational data for the 147 cities was obtained from Geocoder.ca.
Latitude and longitude were used as the Y and X coordinates [3]. Exploratory data analysis was
undertaken to identify characteristics of the data such as mean, variance, directional influences,
and any trends.

The population change data is almost normally distributed but positively skewed (Figure 2).


Figure 2: Population Change % Histogram
This indicates that most of the data lies at lower values around the median. The median is 4.1%
compared to a mean of 5.0%, which supports the positive skew. The standard deviation is 6.3%,
which describes the typical difference between a population change value and the mean. For
normally distributed data, about 99.7% of the values are expected to fall within +/-3 standard
deviations of the mean (Smith), which here is between -13.9% and 23.9%. Two data points fall
outside this range: Wood Buffalo at 27.1% and Okotoks at 42.9%. The range of the entire data
set is -4.5% to 42.9%, but there is only one value above 28%. Okotoks can be considered an
outlier since its value is not near any other. Upon further investigation, this city is in
Alberta, which is experiencing a province-wide population increase [4]. This point will be kept in
the study as it is indicative of an area where a population boom is occurring. The high kurtosis
value of 11.9 indicates peakedness and a long tail [5], which is visible on the histogram.

[2] Statistics Canada, 2011 Census Report
[3] Obtained from Geocoder.ca

The Q-Q plot compares the distribution of observed points to a theoretical normal distribution
(Figure 3).


Figure 3: Normal Q-Q Plot of Population Change
The data around the median value follows the theoretical normal closely. Values differ most at
both ends of the distribution. This is consistent with skewed data and data with a heavy
tail [6]. Despite this, there are no significant issues with the distribution of the
population change dataset.
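
The three-standard-deviation screen applied to the population change values earlier in this
section can be reproduced from the reported statistics. The following is a minimal Python
sketch, assuming the mean of 5.0% and standard deviation of 6.3% quoted in this section. The
function names are illustrative, and the sample holds only the values named in the report,
not the full 147-city dataset.

```python
def three_sigma_bounds(mean, std_dev):
    """Return the (lower, upper) limits of mean +/- 3 standard deviations."""
    return mean - 3 * std_dev, mean + 3 * std_dev


def flag_outliers(values, mean, std_dev):
    """Return the entries of a {name: value} dict that fall outside the limits."""
    lower, upper = three_sigma_bounds(mean, std_dev)
    return {name: v for name, v in values.items() if v < lower or v > upper}


# Summary statistics reported in Section 1.2 (percent change, 2006 to 2011).
lower, upper = three_sigma_bounds(5.0, 6.3)

# Only the values named in the report. The city holding the minimum is not
# named in the report, so its label here is a placeholder.
sample = {"Okotoks": 42.9, "Wood Buffalo": 27.1, "(reported minimum)": -4.5}
outliers = flag_outliers(sample, 5.0, 6.3)

print(round(lower, 1), round(upper, 1))
print(sorted(outliers))
```

Run on these values, the sketch flags Okotoks and Wood Buffalo, matching the two points
identified in the text.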

The spatial distribution of cities is not normal (Figure 4).


[4] Statistics Canada, Annual Demographic Estimates, 2013
[5] David Lane, Introduction To Statistics
[6] David Lane, Introduction To Statistics
Figure 4: Histograms (A: Latitude, B: Longitude)
Latitude is positively skewed, with many cities located at lower latitudes, while longitude has a
multi-modal distribution. Additionally, the map of the study area makes it clear that
the cities are not evenly distributed throughout Canada. Cities were chosen based on available
data and were required to have a population above 5,000. The size and
sparseness of Canada make it difficult to obtain spatially normal sample locations.
This concept is explored further in the Discussion (Section 4.0).

A trend analysis of all three variables, Longitude (X), Latitude (Y), and Population Change
(Z), is shown in Figure 5.



Figure 5: Trend Analysis For X, Y and Z

This graph shows the trends between XY, YZ, and XZ. The trend line for XZ (Longitude vs
Population Change) shows a minimal trend at the second-order polynomial level. At this level, YZ
(Latitude vs Population Change) shows a stronger trend due to the concentration of lower
latitude cities. These small second-order trends can be removed during the Kriging process.

A semi-variogram describes the spatial relationship between data points [7]. In this study, it
plots the distance between every pair of cities against the dissimilarity of their population
change values, measured as half the squared difference (Figure 6).


Figure 6: Semi-Variogram For Population Change
This semi-variogram has a mostly horizontal form, which means that there is minimal spatial
autocorrelation at a Canada-wide scale [7]. This is plausible: population change in Calgary
would not be expected to influence population change in Toronto. Since the sampling distance
varies greatly from coast to coast, the semi-variogram should ideally be adjusted for each
neighbourhood during spatial interpolation.
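
To make the semi-variogram construction concrete, the following Python sketch bins every pair
of points by separation distance and averages half the squared difference of their values
within each bin. It is an illustrative implementation, not the one used by ArcGIS. It assumes
longitude and latitude are treated as planar X and Y coordinates, as in this study, and the
demo points are hypothetical rather than taken from the 147-city dataset.

```python
import math
from collections import defaultdict


def empirical_semivariogram(points, lag_size, n_lags):
    """Bin pairwise semivariances by separation distance.

    points: list of (x, y, z) tuples.
    Returns {lag_index: (mean_distance, mean_semivariance, pair_count)}.
    """
    bins = defaultdict(lambda: [0.0, 0.0, 0])  # distance sum, semivariance sum, count
    for i in range(len(points)):
        xi, yi, zi = points[i]
        for j in range(i + 1, len(points)):
            xj, yj, zj = points[j]
            h = math.hypot(xi - xj, yi - yj)
            lag = int(h // lag_size)
            if lag >= n_lags:
                continue  # pair is farther apart than the last lag
            bins[lag][0] += h
            bins[lag][1] += 0.5 * (zi - zj) ** 2
            bins[lag][2] += 1
    return {lag: (d / n, g / n, n) for lag, (d, g, n) in sorted(bins.items())}


# Hypothetical (x, y, percent change) points, for illustration only.
demo_points = [(0.0, 0.0, 1.0), (1.0, 0.0, 3.0), (3.0, 0.0, 3.0)]
demo_result = empirical_semivariogram(demo_points, lag_size=1.0, n_lags=4)
for lag, (dist, gamma, count) in demo_result.items():
    print(lag, dist, gamma, count)
```

A mostly flat sequence of semivariance values across the lags would correspond to the
horizontal form described above.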
In summary, the population change data is relatively normal with a positive skew, indicating
that a large portion of values lie around the median of 4.1%. One data point, Okotoks, AB, with a
uniquely high 42.9% population change, will be kept as this area has been experiencing a high
population increase compared with the whole of Canada. It has been determined that there is
little to no spatial autocorrelation between cities on opposite sides of the country. This will
be kept in mind when setting neighbourhoods for spatial interpolation.

[7] Gregg Babish, Geostatistics Without Tears

2.0 Methodology
The two methods used for spatial interpolation were IDW and Kriging. Both were run in
ArcGIS 10.1 using the Geostatistical Wizard. All procedures and map features used the
GCS North American 1927 coordinate system. The findings of the data analysis (Section 1.2)
were used to build the best possible spatial interpolation models.

2.1 Inverse Distance Weighting
Inverse Distance Weighting is an effective way to take an initial look at a dataset. It has few
input parameters and does not make any assumptions about the data [8]. This method predicts
values at unknown points by assigning weights to the surrounding known data points. When an
unknown point is interpolated, the closest data points have more influence than those at greater
distances, so proximity has the greatest effect on the prediction. In ArcGIS, the rate at which
influence falls off with distance is set by the Power parameter. The following parameters were
used to produce the IDW result:


Figure 7: IDW Parameters

[8] ArcGIS 10.1 Geostatistical Wizard

These parameters were tested to obtain the lowest root mean square (RMS) error. The power
was set to 1 to give higher weights to closer cities, with diminishing weights given to cities
farther from the unknown point. The minimum number of neighbours to include was set to 5, with a
maximum of 15, so that the prediction process used a limited neighbourhood. This limited the
influence that more distant cities had on the prediction.
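
The weighting scheme described above can be sketched in a few lines of Python. This is a
generic inverse distance weighting routine, not the ArcGIS implementation: it assumes planar
distances on the X and Y coordinates, applies only the maximum-neighbour limit (the minimum
of 5 and the sector settings are not modelled), and uses hypothetical demo points.

```python
import math


def idw_predict(known, x, y, power=1.0, max_neighbours=15):
    """Predict z at (x, y) from known (x, y, z) points by inverse distance weighting."""
    # Keep only the nearest points, mirroring a limited search neighbourhood.
    nearest = sorted(
        (math.hypot(px - x, py - y), z) for px, py, z in known
    )[:max_neighbours]
    if nearest[0][0] == 0.0:
        return nearest[0][1]  # prediction point coincides with a known point
    weights = [1.0 / d ** power for d, _ in nearest]
    weighted_sum = sum(w * z for w, (_, z) in zip(weights, nearest))
    return weighted_sum / sum(weights)


# Hypothetical cities: (longitude, latitude, percent change).
demo_known = [(0.0, 0.0, 10.0), (2.0, 0.0, 20.0)]
demo_midpoint = idw_predict(demo_known, 1.0, 0.0)    # equally distant from both
demo_near_first = idw_predict(demo_known, 0.5, 0.0)  # closer to the first city
print(demo_midpoint, demo_near_first)
```

Raising the power makes the prediction follow the nearest city more closely, while a power
of 1 lets more distant cities retain some influence.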


2.2 Kriging
Kriging is a much more complex model for spatial interpolation. It has many more input
parameters, and ArcGIS provides an interactive process for adjusting them. This
method relies on statistical relationships between the measured points [9]. It uses the
semi-variogram, and weights obtained from it, in the interpolation process [10].
Simple Kriging was used since it was the method that accepted a dataset containing negative
values. The following parameters were used:

Figure 8: Parameters for Kriging


[9] ArcGIS 10.1 Help Files
[10] Gregg Babish, Geostatistics Without Tears

A normal score transformation was used to normalize the data and make variances more consistent
throughout the study area [11]. This was the only transformation available for data containing
negative values. Declustering was undertaken to reduce the effect of preferential sampling and
to correct the data distribution estimate [1]. Second-order trends were removed as per the trend
analysis in Section 1.2. The Power function was increased to 2 in this method to give higher
weights to those cities closer to the prediction point, on the reasoning that closer cities
should have more influence than cities farther away.
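
The idea behind a normal score transformation can be illustrated with a simple rank-based
version in Python: each value is replaced by the standard normal quantile of its rank, which
works for negative values because only the ordering is used. This is a generic sketch and
should not be read as the algorithm ArcGIS applies. The plotting position (rank - 0.5) / n is
an assumption, ties are not treated specially, and the demo values are just three figures
quoted in this report.

```python
from statistics import NormalDist


def normal_scores(values):
    """Map each value to the standard normal quantile of its rank."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])  # indices from lowest to highest
    std_normal = NormalDist()  # mean 0, standard deviation 1
    scores = [0.0] * n
    for rank, i in enumerate(order, start=1):
        scores[i] = std_normal.inv_cdf((rank - 0.5) / n)
    return scores


# Minimum, median, and maximum percent change quoted in Section 1.2.
demo_values = [42.9, -4.5, 4.1]
demo_scores = normal_scores(demo_values)
print([round(s, 3) for s in demo_scores])
```

The transformed scores are symmetric about zero regardless of how skewed the input is, which
is the property the Kriging model relies on.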
The Searching Neighbourhood used similar parameters to the IDW method. Five neighbours were set
as the minimum since some areas were isolated from the bulk of the cities. Fifteen was set as
the maximum in order to include only nearby cities. Limiting the number of influencing cities
reduced the influence of cities farther away, whose population change is most likely
uncorrelated. The number of lags and the lag size were set to include all points on the
semi-variogram. Other settings were kept at the ArcGIS defaults.


3.0 Results
The IDW result provides an effective initial display of population change across Canada (Figure 9).
The map shows that Eastern Canada is experiencing a population decrease overall, although
certain areas are experiencing an increase. The London to Montreal corridor shows low
population increases. Southern parts of New Brunswick and Nova Scotia also appear to be
attracting population increases.
The most noticeable aspect of the map is the higher population increases in Western Canada.
The highest population increase is occurring in Alberta and western Saskatchewan. Current
trends support this analysis, as Alberta has the highest growth rate of any province [11]. Small
areas within Alberta have even higher population increases, which play a role in making the
whole province an attractive place to live.
Northern Canada, north of Yellowknife, NWT, shows varying trends for population change. Since
there was no data in this area, these results cannot be considered reliable. The influence of
western Canada seems to have created positive trends in the north. This area was included for
completeness.

[11] James Wood, Calgary Herald, Alberta Population Cracks 4 Million



Figure 9: IDW Prediction Surface For Population Change In Canada

The Kriging result shows a more organized spatial interpolation surface (Figure 10). It
shows similar trends to the IDW result but differs visually. Eastern Canada is heavily coded in
blue, indicating that populations are decreasing. The Golden Horseshoe is experiencing population
increases, while an area just east of Toronto seems to be decreasing. The decreasing trend in
central Ontario seems to have affected this area east of Toronto. The southern tip of Ontario is
also experiencing a population decrease.
Western Canada again shows the highest population increases. Alberta has a similar
pattern to IDW, with a high population increase. Two distinct areas in British Columbia appear
to be experiencing a population decrease that was not apparent in IDW. The Yukon Territory
has the highest population increase, as the dark red shading shows. There were not many data
points in this area, so the prediction is most likely inaccurate.
Northern Canada, including the NWT, Nunavut, and Northern Quebec, appears to be experiencing
population decreases. Since no data points were north of 63N latitude, these results cannot be
considered accurate. Further research shows that Nunavut has in fact been experiencing population
increases, so the predicted decrease there can be disregarded [12].

[12] Statistics Canada, 2006 Aboriginal Census Profiles


Figure 10: Kriging Prediction Surface For Population Change In Canada

4.0 Discussion
This section discusses the validity of both methods and their results.
IDW is an effective method for taking a preliminary look at a dataset. Its small number of
inputs, however, limits its ability to predict surfaces for complex datasets. The predicted
surface for IDW did a fair job of predicting the real-life surface. The graphs in Figure 11 show
how well the model predicted population change percentage.

Figure 11: IDW Cross Validation Results
Both graphs show that the predicted points vary along the trend line. They are somewhat close
to the line, indicating an effective model, and there are few outliers. The model
was tuned by adjusting parameters such as neighbourhood, sector type, and power to
obtain the lowest possible root mean square (RMS) error. The resulting RMS value of 5.25 was
the minimum found after testing the various parameters. Factors such as widely dispersed
sampling, clustering, and locational correlation negatively affected the prediction surface.
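
The cross validation behind an RMS figure of this kind can be sketched as a leave-one-out
loop: each point is removed in turn, predicted from the remaining points, and the prediction
errors are combined into a root mean square value. The Python below is an assumption-laden
illustration rather than the ArcGIS procedure. It uses a plain IDW predictor with no
neighbourhood limits, planar distances, and hypothetical demo points, so it will not reproduce
the 5.25 reported here.

```python
import math


def idw(known, x, y, power=1.0):
    """Plain inverse distance weighted prediction from (x, y, z) points."""
    numerator = denominator = 0.0
    for px, py, z in known:
        d = math.hypot(px - x, py - y)
        if d == 0.0:
            return z  # exact hit on a known point
        w = 1.0 / d ** power
        numerator += w * z
        denominator += w
    return numerator / denominator


def loo_rms(points, power=1.0):
    """Leave-one-out cross validation: RMS of (predicted - measured)."""
    errors = []
    for i, (x, y, z) in enumerate(points):
        others = points[:i] + points[i + 1:]  # every point except the one held out
        errors.append(idw(others, x, y, power) - z)
    return math.sqrt(sum(e * e for e in errors) / len(errors))


# Hypothetical (x, y, percent change) points, for illustration only.
demo_points = [(0.0, 0.0, 1.0), (1.0, 0.0, 2.0), (2.0, 0.0, 3.0)]
rms_by_power = {p: loo_rms(demo_points, power=p) for p in (1.0, 2.0)}
best_power = min(rms_by_power, key=rms_by_power.get)
print(rms_by_power, best_power)
```

Repeating the loop for each candidate parameter value and keeping the one with the lowest RMS
mirrors the tuning process described above.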

Kriging uses many input parameters, with even more available to advanced users. This can be a
drawback for first-time users, but the method can be very effective for those familiar with it.
The Kriging result produced for this study had the following validation results:



Figure 12: Kriging Cross Validation Results

The Predicted graph (left) shows a large cluster of points in the center of the trend line. This
indicates that the prediction process was fairly successful. There are a few outliers which had
much lower measured values than what was predicted for them. The Error graph (right) shows
a similar but negative trend. The clustering indicates that values had similar errors. The
average standard error was quite high at about 4.50%. Considering the original data was
heavily centered around the median, an error of 4.50% could undermine the effectiveness of this
predicted surface.

Overall, Kriging provided a slightly better result than IDW. The trend of a population shift
towards Western Canada is clear and distinct in the Kriging result. The biggest challenge when
performing spatial interpolation was the data set. Having negative values greatly reduced the
choice of Kriging types and transformation methods, since many did not work
with negative values. Furthermore, the distribution of the dataset across all of Canada may have
been an impediment to the success of the model. Population change in cities on the east coast
would most likely have little effect on cities in the west. This was acknowledged and partially
mitigated by setting Search Neighbourhood values to include only nearby cities. An
improvement to this study would be to focus on an individual province. Cities could then be
chosen to represent all areas of the province. Cities in Canada are in no way normally
distributed. Most of the population lies in the southern portion of the country, with the north
being very sparsely populated.
5.0 Conclusion
Kriging is an advanced interpolation method that can outperform IDW for data sets with
normality issues, directional influences, and global trends. IDW provides a quick and easy
interpolation method for basic analysis. Both methods aided in the analysis of population
change in Canada.
The results of this study show that the Canadian population is increasing at the highest rates in
Western Canada. Both models show that Alberta is the province with the highest population
growth rates. The predictions north of 63N latitude should not be considered reliable, as the
area lacked measured data points. Also, since these northern areas are sparsely populated,
a predicted surface is not strictly necessary there; it was created for
this study merely for continuity. The predicted surfaces also show that Eastern
Canada has lower, and in places negative, population growth rates. The exceptions
include the Golden Horseshoe, the Ottawa region, and southern parts of New Brunswick and
Nova Scotia. Still, none of these eastern regions have growth rates similar to those in the west.











Bibliography

Babish, G. (2000). Geostatistics Without Tears. Saskatchewan.
CBC News. (2013, September 26). Canadian population surpasses 35 million. Retrieved Feb 24, 2014,
from CBC: http://www.cbc.ca/news/canadian-population-surpasses-35-million-1.1869011
ESRI. (2013). Geostatistical Wizard. ArcGIS 10.1.
Lane, D. (n.d.). Chapter 8 Advanced Graphs. Retrieved 2014, from Online Statistics Education: An
Interactive Multimedia Course of Study.
Smith, I. (2014). Deliverable 4. Geostatistical Analysis of Student Obtained Spatial Data. NOTL, ON,
Canada.
Smith, I. (2014). GISC 9308- Introduction To Statistics. Niagara College.
Statistics Canada. (2013, 06 19). Annual Demographic Estimates. Retrieved 2014, from Statistics Canada:
http://www.statcan.gc.ca/pub/91-215-x/2012000/part-partie1-eng.htm
Statistics Canada. (2014, January 13). Population and Dwelling Counts. Retrieved January 31, 2014, from
Statistics Canada:
http://www12.statcan.gc.ca/census-recensement/2011/dp-pd/hlt-fst/pd-pl/Table-Tableau.cfm?LANG=Eng&T=307&S=11&O=A&RPP=699
Wood, J. (2013, 09 26). Alberta Population Cracks 4 Million. Retrieved 2014, from Calgary Herald.
