
3/5/14

Accuracy Assessment
of
Spatial Data

outline
Introduction
Spatial data quality
Objective
Types of spatial data quality
Positional/Geometric Accuracy
NSSDA
RMSE
Accuracy Assessment Standard
Attribute/Thematic Accuracy
Motivation
Error Matrix


Spatial Data Quality


The quality of spatial data, as indeed of any data,
is crucial to its effective use.
Quality is a measure of the difference between
the data and the reality that they represent, and
becomes poorer as the data and the
corresponding reality diverge.
Quality is difficult to attach to individual features
in a database; instead, it must be described in
terms of the joint quality of pairs of features,
through measures of relative positional accuracy,
covariance, or correlation.

Spatial Data Quality


The concept of quality encompasses a much
larger spectrum and affects the entire process
of the acquisition, management,
communication, and use of geographic data.


Spatial Data Quality


Objectives (Congalton, Russell G.)
To review the current knowledge of accuracy
assessment methods.
To stimulate the reader to further the progression
of diagnostic techniques and information to
support the appropriate application of spatial
data.
The ultimate objective is to motivate everyone to
conduct or demand an appropriate accuracy
assessment or validation and make certain it is
included as an essential metadata element.

Type of Data Accuracy


Spatial data accuracy consists of (ISO TC-211):
- Completeness: the presence or absence of map features,
their attributes (descriptions), and the relationships among features;
- Logical consistency: the degree of adherence to the
logical rules of the data structure, attributes, and relationships among features
(the data structure may be conceptual, logical, or physical);
- Temporal accuracy: the accuracy of the time span of the attributes
conveyed in the data (temporal attributes) and of the relationships among
features (temporal relationships);
- Positional accuracy: the accuracy of the position of a feature;
- Thematic accuracy or attribute accuracy: the accuracy of
quantitative attributes and the correctness of qualitative (non-quantitative)
attributes, as well as the accuracy of the classification of features and of their relationships.




Positional Accuracy



Reference: U.S. Federal Geographic Data Committee (FGDC)
Geospatial Positioning Accuracy Standards
Part 3: National Standard for Spatial Data Accuracy (NSSDA)
FGDC-STD-007.3-1998

http://www.fgdc.gov/standards/status/sub1_3.html
http://www.fgdc.gov/standards/projects/FGDC-standards-projects/accuracy/part3/tr96
http://www.mnplan.state.mn.us/pdf/1999/lmic/nssda_o.pdf


Positional Accuracy



The NSSDA uses root-mean-square error (RMSE) to estimate
positional accuracy. RMSE is the square root of the average of the
set of squared differences between dataset coordinate values and
coordinate values from an independent source of higher accuracy for
identical points.

Accuracy is reported in ground distances at the 95% confidence level.
This means that 95% of the positions in the dataset are expected to have
errors equal to or smaller than the reported accuracy value.

A minimum of 20 check points shall be tested, distributed to reflect
the geographic area of interest and the distribution of error in the dataset.
When 20 points are tested, the 95% confidence level allows one point
to fall outside the reported accuracy value.

There are seven steps in applying the NSSDA:


1) Determine if the test involves horizontal accuracy,
vertical accuracy, or both.
2) Select a set of test points from the data set being
evaluated.
3) Select an independent data set of higher accuracy that
corresponds to the data set being tested.
4) Collect measurements from identical points from each
of those two sources.
5) Calculate a positional accuracy statistic using either the
horizontal or vertical accuracy statistic worksheet.
6) Prepare an accuracy statement in a standardized report
form.
7) Include that report in a comprehensive description of
the data set called metadata.


2. Selecting test points (TPs):


A data set's accuracy is tested by comparing the
coordinates of several points within the data set to the
coordinates of the same points from an independent
data set of greater accuracy.
Points used for this comparison must be well-defined.
They must be easy to find and measure in both the
data set being tested and in the independent data set.
Twenty or more test points are required to conduct a
statistically significant accuracy evaluation regardless
of the size of the data set or area of coverage.

If fewer than 20 test points are available:


Three alternatives for determining positional
accuracy:
1) deductive estimate,
2) internal evidence and
3) comparison to source.

See: mcmcweb.er.usgs.gov/sdts/


3. Selecting an independent data set (TPs):


The independent data set must be acquired separately
from the data set being tested.
It should be of the highest accuracy available.
In general, the independent data set should be three
times more accurate than the expected accuracy of
the test data set.
If an independent data set that meets this criterion
cannot be found, a data set of the highest accuracy
feasible should be used.
The accuracy of the independent data set should
always be reported in the metadata.

Test points (TPs) distribution


The areal extent of the
independent data set
should approximate that
of the original data set.
When the tested data set
covers a rectangular area
and is believed to be
uniformly accurate, an
ideal distribution of test
points allows for at least
20 percent to be located
in each quadrant.


Test points (TPs) distribution


Test points should be spaced
at intervals of at least 10
percent of the diagonal
distance across the
rectangular data set.
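
The two distribution guidelines (at least 20 percent of the test points in each quadrant, and spacing of at least 10 percent of the diagonal) can be checked with a short script. This is only a sketch: the function name `check_distribution`, the quadrant labels, and the sample grid of 20 points are assumptions made for illustration, not part of the NSSDA.

```python
import math
from itertools import combinations

def check_distribution(points, xmin, ymin, xmax, ymax):
    """Return (share of points per quadrant, minimum spacing / diagonal)."""
    xmid, ymid = (xmin + xmax) / 2, (ymin + ymax) / 2
    counts = {"SW": 0, "SE": 0, "NW": 0, "NE": 0}
    for x, y in points:
        # Points exactly on a midline are assigned to the north/east side.
        key = ("N" if y >= ymid else "S") + ("E" if x >= xmid else "W")
        counts[key] += 1
    shares = {k: c / len(points) for k, c in counts.items()}
    diagonal = math.hypot(xmax - xmin, ymax - ymin)
    min_spacing = min(math.dist(p, q) for p, q in combinations(points, 2))
    return shares, min_spacing / diagonal

# Hypothetical test points: a 5 x 4 grid inside a 100 x 100 rectangle.
demo_points = [(x, y) for x in (10, 30, 50, 70, 90)
               for y in (12.5, 37.5, 62.5, 87.5)]
demo_shares, demo_ratio = check_distribution(demo_points, 0, 0, 100, 100)
print("quadrant shares:", demo_shares)
print("every quadrant >= 20%:", min(demo_shares.values()) >= 0.20)
print("minimum spacing >= 10% of diagonal:", demo_ratio >= 0.10)
```

The check only applies to the case described above, a rectangular data set believed to be uniformly accurate.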

Root Mean Square (RMS) Error


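RMS error per TP, $R_i$:
$R_i = \sqrt{XR_i^2 + YR_i^2}$
where $R_i$ = RMS error for TP i
$XR_i$ = X residual for TP i ($X_{source} - X_{reference}$)
$YR_i$ = Y residual for TP i ($Y_{source} - Y_{reference}$)
[Figure: the RMS error is the hypotenuse joining the TP source and TP reference positions; the X residual and Y residual are the other two sides.]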


Total RMS Error


X RMS error, $R_x$, and Y RMS error, $R_y$:
$R_x = \sqrt{\frac{1}{n}\sum_{i=1}^{n} XR_i^2}$
$R_y = \sqrt{\frac{1}{n}\sum_{i=1}^{n} YR_i^2}$
where $XR_i$ = X residual for TP i
$YR_i$ = Y residual for TP i
n = total number of TPs
Total RMS error: $T = \sqrt{R_x^2 + R_y^2}$
Error contribution by individual TP, $E_i$:
$E_i = R_i / T$
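
The RMS quantities defined above can be computed directly from paired coordinates. The following is a minimal sketch; the function name `rms_report` and the three sample coordinate pairs are hypothetical and only illustrate the formulas.

```python
import math

def rms_report(source, reference):
    """Compute R_i, R_x, R_y, T and E_i from paired (x, y) coordinates."""
    # Residual = source coordinate minus reference coordinate.
    residuals = [(xs - xr, ys - yr)
                 for (xs, ys), (xr, yr) in zip(source, reference)]
    n = len(residuals)
    r = [math.hypot(dx, dy) for dx, dy in residuals]           # R_i per TP
    rx = math.sqrt(sum(dx * dx for dx, _ in residuals) / n)    # R_x
    ry = math.sqrt(sum(dy * dy for _, dy in residuals) / n)    # R_y
    total = math.hypot(rx, ry)                                 # T
    contribution = [ri / total for ri in r]                    # E_i = R_i / T
    return {"R": r, "Rx": rx, "Ry": ry, "T": total, "E": contribution}

# Hypothetical test points: tested data set vs. higher-accuracy reference.
demo_source = [(100.3, 200.4), (150.0, 250.1), (199.8, 300.0)]
demo_reference = [(100.0, 200.0), (150.0, 250.0), (200.0, 300.0)]
report = rms_report(demo_source, demo_reference)
print({k: v if isinstance(v, list) else round(v, 4) for k, v in report.items()})
```

A TP with $E_i$ well above 1 contributes more than its share to the total error and is a candidate for re-measurement.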

Geometric Accuracy Measures


RMS error is generally equivalent to one standard error in
statistical parlance: approximately 68% of the
residual errors are smaller in magnitude than the threshold
distance (assuming a normal distribution).
An alternative measure is CEXX%, the circular error
at XX%; for example, CE90% means that 90% of the residual
errors are smaller in magnitude than the threshold distance.
This would be a more restrictive standard if set at the same
distance threshold.


Accuracy Assessment
(standard)
National Map Accuracy Standards (NMAS),
U.S. Bureau of the Budget, 1947: uses a 90%
confidence level (Circular Map Accuracy
Standard: CMAS / Vertical Map Accuracy
Standard: VMAS)
FGDC National Standard for Spatial Data
Accuracy (NSSDA), 1998: uses a 95%
confidence level (Circular Error: CE95 / Linear
Error: LE95)

Linear Error: LE
+ LE=RMSE, LE90%, LE95%

Circular Error: CE
+ CE=RMSE, CE90%, CE95%

NSSDA (Accuracy) defines LE95% and CE95%


Confidence Level of Accuracy


For horizontal accuracy (Circular Error)
CMAS = CE90% = 1.5175 × RMSEr
NSSDA = CE95% = 1.7308 × RMSEr
For vertical accuracy (Linear Error)
VMAS = LE90% = 1.6449 × RMSEz
NSSDA = LE95% = 1.9600 × RMSEz
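
The conversion factors above turn an RMSE into an accuracy value at the 90% or 95% confidence level. The sketch below simply applies them; the function names and the sample RMSE values are hypothetical. It also assumes what the factors themselves assume: normally distributed errors and, for the circular case, roughly equal RMSE in x and y.

```python
# Multipliers taken from the list above.
CE90_FACTOR = 1.5175  # CMAS, horizontal, 90%
CE95_FACTOR = 1.7308  # NSSDA, horizontal, 95%
LE90_FACTOR = 1.6449  # VMAS, vertical, 90%
LE95_FACTOR = 1.9600  # NSSDA, vertical, 95%

def horizontal_accuracy(rmse_r):
    """Circular error at 90% and 95% from the radial RMSE (same units)."""
    return {"CMAS_CE90": CE90_FACTOR * rmse_r,
            "NSSDA_CE95": CE95_FACTOR * rmse_r}

def vertical_accuracy(rmse_z):
    """Linear error at 90% and 95% from the vertical RMSE (same units)."""
    return {"VMAS_LE90": LE90_FACTOR * rmse_z,
            "NSSDA_LE95": LE95_FACTOR * rmse_z}

# Hypothetical RMSE values in metres.
print(horizontal_accuracy(2.0))
print(vertical_accuracy(0.5))
```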

Map Accuracy Standard


ASPRS Large Scale
Mapping Accuracy
Standard


Map Accuracy Standard


United States National Map Accuracy Standards:
1. Scales larger than 1:20,000:
not more than 10 percent of the points tested
may be in error by more than 1/30 inch, measured
on the map.
(~0.85 mm, 90% confidence level)
2. Scales of 1:20,000 or smaller:
not more than 10 percent of the points tested
may be in error by more than 1/50 inch.
(~0.51 mm, 90% confidence level)
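
Because the NMAS tolerance is stated in map units, it helps to convert it to a ground distance for a given scale. The sketch below does that conversion; the function name `nmas_ground_tolerance_m` and the two example scales are assumptions for illustration.

```python
INCH_TO_M = 0.0254  # exact definition of the inch in metres

def nmas_ground_tolerance_m(scale_denominator):
    """Ground distance (m) matching the NMAS map tolerance at 1:scale_denominator."""
    # Scales larger than 1:20,000 have a denominator below 20,000.
    map_tolerance_in = 1 / 30 if scale_denominator < 20000 else 1 / 50
    return map_tolerance_in * INCH_TO_M * scale_denominator

# Hypothetical example scales.
for denom in (12000, 24000):
    print(f"1:{denom} -> {nmas_ground_tolerance_m(denom):.3f} m on the ground")
```

Under the standard, no more than 10 percent of tested points may exceed this ground distance.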



Worksheet Accuracy Assessment
(NSSDA)


Example:
Minnesota Department of Transportation map
Scale 1:1000 (from 1:3000 aerial photography)
Tests to use: vertical and horizontal accuracy
Data sets: 1. DEM (contour and grid)
2. Digital topographic map
Selecting test points:
o Vertical: 296 random control points (accuracy 10-15 mm)
o Horizontal: 40 well-defined points (accuracy 10-15 mm)
Examples of these include manholes, catch basins, and right-angle
intersections of objects such as sidewalks.
Forty points were chosen rather than the minimum of 20 because
they were fairly easy to collect and because of the long, narrow shape
of the corridor. Having the extra points opened the possibility of
comparing a test of the 20 easternmost control points with a test of
the 20 westernmost control points.




Thematic Accuracy or Attribute Accuracy


Motivation
Classified thematic maps are produced for a
wide variety of resources: soil types or
properties, land cover, land use, forest
inventory, and many more.
These maps are not very useful without
quantitative statements about their accuracy.

Motivation
Map users must know the quality of the map
for their intended uses, and
map producers must evaluate the success of
their mapping efforts.
Both users and producers may want to
compare several maps to see
which is best, or to see how well they agree.


Motivation
(Congalton, Russell G.)
The need for assessing the accuracy of a map
generated from any remotely sensed data has
become universally recognized as an integral
project component.
In the last few years, most projects have required
that a certain level of accuracy be achieved for
the project and map to be deemed a success.
With the widespread application of geographic
information systems (GIS) employing remotely
sensed data as layers, the need for such an
assessment has become even more critical.

Motivation
There are a number of reasons why this
assessment is so important, including
(Congalton, Russell G.):
The need to perform a self-evaluation and to
learn from your mistakes
The ability to compare methods/algorithms/
analysts quantitatively
The desire to use the resulting maps/spatial
information in some decision-making process


Motivation
It is absolutely necessary to take some steps
toward assessing the accuracy or validity of that
map.
There are a number of ways to investigate the
accuracy/error in spatial data including, but not
limited to:
1. visual inspection (the map "looks good"),
2. non-site-specific analysis,
3. generating difference images,
4. error budget analysis, and
5. quantitative accuracy assessment.

Map Accuracy Measurement


- The simplest comparison is the total area of each
  class.
  This is called non-site-specific accuracy.
  It is imperfect because underestimation in one
  area can be compensated by overestimation
  in another.
- This is called inventory error.


Map Accuracy Measurement


- Site-specific accuracy is based on a
  detailed assessment between the two
  maps.
  In most cases pixels are the unit of
  comparison.
  This is known as classification error.
- This is the misidentification of pixels.
- There may also be boundary errors.

Error Matrix
- In the evaluation of classification errors, a
  classification error matrix is typically
  formed.
  This matrix is sometimes called a confusion
  matrix or contingency table.
- In this table, classification is given as rows
  and verification (ground truth) is given as
  columns for each sample point.


Error Matrix
- The diagonal elements in this matrix indicate the numbers of
  samples for which the classification results agree with the
  reference data.
- The off-diagonal elements in each row are the numbers
  of samples that have been misclassified by the classifier,
  i.e., the classifier commits a label to samples that
  actually belong to other classes. This misclassification error is
  called commission error.
- The off-diagonal elements in each column are the
  samples omitted by the classifier from their true class.
  This misclassification error is therefore called omission
  error.

Error Matrix


Error Matrix
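- The most common error estimate is the overall
  accuracy:
  $p_o = \frac{1}{n}\sum_{i=1}^{k} n_{ii}$
  where $n_{ii}$ = number of samples on the diagonal for class i,
  k = number of classes, and n = total number of TPs.
- From the example confusion matrix, we
  obtain $p_o$ = (28 + 15 + 20)/100 = 63%.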

Error Matrix
- More specific measures are needed
  because the overall accuracy does not
  indicate how the accuracy is distributed
  across the individual categories.
  The categories could, and frequently do,
  exhibit drastically differing accuracies, but
  the overall accuracy treats these
  categories as having equivalent or similar
  accuracies.


Error Matrix
- From the confusion matrix, it can be seen that at
  least two methods can be used to determine
  individual category accuracies.
- (1) The ratio between the number of correctly
  classified samples and the row total is called
  the user's accuracy, because users are concerned
  about what percentage of each mapped class has been
  correctly classified.
- (2) The ratio between the number of correctly
  classified samples and the column total
  is called the producer's accuracy.

Error Matrix
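- A more appropriate way of presenting the
  individual classification accuracies:
  Commission error = 1 - user's accuracy
  Omission error = 1 - producer's accuracy

Overall accuracy: $p_o = \frac{1}{n}\sum_{i=1}^{k} n_{ii}$
User's accuracy for class i: $n_{ii} / n_{i+}$
Producer's accuracy for class i: $n_{ii} / n_{+i}$
Row total: $n_{i+} = \sum_{j=1}^{k} n_{ij}$
Column total: $n_{+i} = \sum_{j=1}^{k} n_{ji}$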


Error Matrix


Accuracy, e.g., Forest class

Overall accuracy = sum of diagonal / grand total = (28 + 15 + 20) / 100 = 0.63
Commission error = off-diagonal row elements / row total = (14 + 15) / 57 = 0.51
Omission error = off-diagonal column elements / column total = (1 + 1) / 30 = 0.067
Mapping accuracy = diagonal for class / (diagonal + off-diagonal row elements + off-diagonal column elements)
= 28 / (28 + (14 + 15) + (1 + 1)) = 0.475

- Producer's accuracy = 1 - 0.067 = 0.933, or 93.3%
- Consumer's (user's) accuracy = 1 - 0.51 = 0.49, or 49%
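
The same measures can be computed for every class at once. The sketch below is illustrative only: the 3 x 3 matrix is reconstructed from the Forest-class figures and the row and column totals quoted in this deck (rows = classification, columns = reference), and the function name `accuracy_measures` is hypothetical. Only the first class (Forest) is named in the source.

```python
# Rows = classified as, columns = reference (ground truth).
# Row 0 / column 0 is the Forest class; the other two classes are unnamed.
MATRIX = [
    [28, 14, 15],
    [1, 15, 5],
    [1, 1, 20],
]

def accuracy_measures(m):
    """Overall accuracy plus per-class user's/producer's accuracy and errors."""
    k = len(m)
    n = sum(sum(row) for row in m)
    row_total = [sum(m[i]) for i in range(k)]
    col_total = [sum(m[r][i] for r in range(k)) for i in range(k)]
    users = [m[i][i] / row_total[i] for i in range(k)]
    producers = [m[i][i] / col_total[i] for i in range(k)]
    return {
        "overall": sum(m[i][i] for i in range(k)) / n,
        "users": users,
        "producers": producers,
        "commission": [1 - u for u in users],   # 1 - user's accuracy
        "omission": [1 - p for p in producers], # 1 - producer's accuracy
    }

measures = accuracy_measures(MATRIX)
print("overall accuracy:", round(measures["overall"], 3))
print("user's accuracy:", [round(u, 3) for u in measures["users"]])
print("producer's accuracy:", [round(p, 3) for p in measures["producers"]])
```

For the Forest class this gives the same commission and omission errors as the hand calculation above, up to rounding.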


The Kappa coefficient


- The Kappa coefficient (K) measures the
  relationship between beyond-chance
  agreement and expected disagreement.
  This measure uses all elements in the matrix
  and not just the diagonal ones.
  The estimate of Kappa is the proportion of
  agreement after chance agreement is
  removed from consideration:

The Kappa coefficient


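- $K = \frac{p_o - p_c}{1 - p_c}$ = (overall - chance) / (1 - chance)
  $p_o$ = proportion of units which agree = overall accuracy
  $p_o = \sum_{i=1}^{k} p_{ii}$
  $p_c$ = proportion of units for expected chance agreement
  $p_c = \sum_{i=1}^{k} p_{i+}\,p_{+i}$
  $p_{i+}$ = row subtotal of $p_{ij}$ for row i
  $p_{+i}$ = column subtotal of $p_{ij}$ for column i
  $p_{ij} = n_{ij}/n$, $p_{i+} = \sum_{j=1}^{k} p_{ij}$,
  $p_{+i} = \sum_{j=1}^{k} p_{ji}$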


Error Matrix

Grand total = 100, total correct = 63, observed correct = 63/100 = 0.63

p+i × pi+ = 0.3 × 0.57 = 0.171, 0.3 × 0.21 = 0.063, 0.4 × 0.22 = 0.088

Pc = expected correct = 0.171 + 0.063 + 0.088 = 0.322

Po = observed correct = 0.28 + 0.15 + 0.2 = 0.63 (overall accuracy)

K = (Po - Pc) / (1 - Pc) = (0.63 - 0.322) / (1 - 0.322) ≈ 0.454
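
The Kappa calculation can be reproduced from the error matrix itself. This is a sketch under stated assumptions: the matrix is reconstructed from the diagonal values and the row and column totals quoted in this deck, and the function name `kappa` is hypothetical.

```python
# Rows = classification, columns = reference; reconstructed from the
# totals in the worked example (row totals 57, 21, 22; columns 30, 30, 40).
ERROR_MATRIX = [
    [28, 14, 15],
    [1, 15, 5],
    [1, 1, 20],
]

def kappa(matrix):
    """Return (p_o, p_c, K) for a square error matrix of counts."""
    k = len(matrix)
    n = sum(sum(row) for row in matrix)
    p_o = sum(matrix[i][i] for i in range(k)) / n                     # observed agreement
    row_p = [sum(matrix[i]) / n for i in range(k)]                    # p_{i+}
    col_p = [sum(matrix[r][i] for r in range(k)) / n for i in range(k)]  # p_{+i}
    p_c = sum(row_p[i] * col_p[i] for i in range(k))                  # chance agreement
    return p_o, p_c, (p_o - p_c) / (1 - p_c)

p_o, p_c, k_hat = kappa(ERROR_MATRIX)
print(f"p_o = {p_o:.3f}, p_c = {p_c:.3f}, K = {k_hat:.3f}")
```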


Kappa Coefficient
- One of the advantages of using this method is
  that we can statistically compare two
  classification products.
  For example, two classification maps can be made
  using different algorithms, and we can use the same
  reference data to verify them.

Two K values can be derived, K1 and K2. For each K, the
variance can also be calculated.
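
The slides do not say how K1 and K2 are compared. One commonly used approach, assumed here, is a Z statistic: the difference between the two Kappa estimates divided by the square root of the sum of their variances, judged against 1.96 for a two-sided test at the 95% level. The function name `kappa_z` and every input value below are hypothetical, and the variances are taken as given rather than computed.

```python
import math

def kappa_z(k1, var1, k2, var2):
    """Z statistic for the difference between two independent Kappa estimates."""
    return abs(k1 - k2) / math.sqrt(var1 + var2)

# Hypothetical Kappa values and variances for two classification maps.
z = kappa_z(0.50, 0.0016, 0.40, 0.0009)
print(f"Z = {z:.2f}; significantly different at 95%: {z > 1.96}")
```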

Another Way
- The following shows an alternative way to
  do the error matrix.
  Errors of omission and commission are both
  calculated from the row totals in this
  technique.


Mid-Semester Examination
(UTS)

Date: see the UTS schedule
Time: see the UTS schedule
Format: open/closed book

THE END

HAPPY STUDYING
