Chapters 9 and 10, Longley et al.

Databases: Population and Maintenance
Geog 176B Lecture 8
Data Collection
One of the most expensive GIS activities
Many diverse sources (source integration,
data fusion, interoperability)
Two broad types of collection
Data capture (direct collection)
Data transfer
Two broad capture methods
Primary (direct measurement)
Secondary (indirect derivation)
Stages in Data Collection Projects
Planning
Preparation
Digitizing / Transfer
Editing / Improvement
Evaluation
Data Collection Techniques
            Raster                           Vector
Primary     Digital remote sensing images    GPS measurements
            Digital aerial photographs       Survey measurements
Secondary   Scanned maps                     Topographic surveys
            DEMs from maps                   Toponymy data sets from atlases
Primary Data Capture
Capture specifically for GIS use
Raster remote sensing
e.g. SPOT and IKONOS satellites and aerial
photography
Passive and active sensors
Resolution is key consideration
Spatial
Spectral
Temporal
www.spot.ucsb.edu
Imagery for GIS
Vector Primary Data Capture
Surveying
Locations of objects determined by angle and
distance measurements from known locations
Uses expensive field equipment and crews
Most accurate method for large scale, small areas
GPS
Collection of satellites used to fix locations on
Earth's surface
Differential GPS used to improve accuracy
Total Station
Pen/Portable PC and GPS
Secondary Geographic Data Capture
Data collected for other purposes can
be converted for use in GIS
Raster conversion
Scanning of maps, aerial photographs,
documents, etc.
Important scanning parameters are spatial
and spectral (bit depth) resolution
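
To make the two scanning parameters concrete, the sketch below estimates the uncompressed size of a scan and the ground size of one pixel. The sheet size, 300 dpi setting and function names are illustrative assumptions, not values from the lecture.

```python
import math

def scan_size_bytes(width_in, height_in, dpi, bit_depth):
    """Uncompressed size of a scan: pixel count times bits per pixel."""
    cols = round(width_in * dpi)
    rows = round(height_in * dpi)
    return math.ceil(cols * rows * bit_depth / 8)

def pixel_ground_size_m(dpi, scale_denominator):
    """Ground distance covered by one scanned pixel of a map at the given scale."""
    pixel_on_map_m = 0.0254 / dpi          # one inch is 0.0254 m
    return pixel_on_map_m * scale_denominator

# Hypothetical 24 x 36 inch sheet scanned at 300 dpi
grey_bytes = scan_size_bytes(24, 36, 300, 8)     # 8-bit greyscale
color_bytes = scan_size_bytes(24, 36, 300, 24)   # 24-bit color
pixel_m = pixel_ground_size_m(300, 24000)        # 1:24,000 map
print(grey_bytes, color_bytes, round(pixel_m, 3))
```

Doubling the spatial resolution quadruples the pixel count, while bit depth scales the size linearly.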

Scanner
Raster to vector conversion
Vector Secondary Data Capture
Collection of vector objects from maps,
photographs, plans, etc.
Digitizing
Manual (table)
Heads-up and vectorization
Photogrammetry - the science and
technology of making measurements from
photographs, etc.


Digitizer
Data Transfer
Buy vs. build is an important question
Many widely distributed sources of GI
Includes geocoding
Key catalogs include
Geodata.gov
Geography Network
Access technologies
Translation
Direct read
Managing Data Capture Projects
Key principles
Clear plan, adequate resources, appropriate
funding, and sufficient time
Fundamental tradeoff among
Quality, accuracy, speed and price
Two strategies
Incremental
Blitzkrieg
Alternative resource options
In house
Specialist external agency
Map scale       Ground distance corresponding to 0.5 mm map distance
1:1250          62.5 cm
1:2500          1.25 m
1:5000          2.5 m
1:10,000        5 m
1:24,000        12 m
1:50,000        25 m
1:100,000       50 m
1:250,000       125 m
1:1,000,000     500 m
1:10,000,000    5 km


A useful rule of thumb is that positions measured from maps are
accurate to about 0.5 mm on the map. Multiplying this by the
scale of the map gives the corresponding distance on the
ground.
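
The rule of thumb is a single multiplication. A minimal sketch, with a hypothetical function name and the 0.5 mm figure as the default:

```python
def ground_error_m(scale_denominator, map_error_mm=0.5):
    """Ground distance (metres) corresponding to a positional error on the map."""
    return scale_denominator * map_error_mm / 1000.0

# Scale denominators taken from the map scale table
for denominator in (1250, 24000, 1000000):
    print(f"1:{denominator} -> {ground_error_m(denominator)} m")
```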

Positional Accuracy (cont.)
Within a database, a typical UTM coordinate
pair might be:
Easting 579124.349 m
Northing 5194732.247 m
If the database was digitized from a 1:24,000
map sheet, the last four digits in each
coordinate (units, tenths, hundredths,
thousandths) would be questionable
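
One way to act on this is to report coordinates only to the precision the source supports. The sketch below rounds to the power of ten at or below the expected error. The helper name and the rounding convention are assumptions for illustration; the 12 m figure is the 0.5 mm rule applied to a 1:24,000 map.

```python
import math

def round_to_accuracy(value_m, accuracy_m):
    """Round a coordinate to the power of ten at or below the expected error."""
    digits = -int(math.floor(math.log10(accuracy_m)))
    return round(value_m, digits)

accuracy_m = 12.0  # 0.5 mm at 1:24,000
easting = round_to_accuracy(579124.349, accuracy_m)
northing = round_to_accuracy(5194732.247, accuracy_m)
print(easting, northing)
```

With a 12 m error the coordinates are rounded to the nearest 10 m, which drops exactly the four questionable digits.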
Testing Positional Accuracy
Use an independent source of higher accuracy:
find a larger scale map
use precision GPS
Use internal evidence:
digitized polygons that are unclosed, lines
that overshoot or undershoot nodes, etc. are
indications of error
sizes of gaps, overshoots, etc. may be a
measure of positional accuracy
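
As a rough illustration of the internal-evidence check, the sketch below measures the gap between the first and last vertex of each digitized ring and reports rings that fail to close. The ring coordinates and function names are made up for the example.

```python
import math

def closure_gap(ring):
    """Distance between the first and last vertex of a digitized ring."""
    (x0, y0), (x1, y1) = ring[0], ring[-1]
    return math.hypot(x1 - x0, y1 - y0)

def unclosed_rings(rings, tolerance=0.0):
    """Return (index, gap) for every ring whose gap exceeds the tolerance."""
    gaps = [(i, closure_gap(r)) for i, r in enumerate(rings)]
    return [(i, gap) for i, gap in gaps if gap > tolerance]

rings = [
    [(0, 0), (4, 0), (4, 3), (0, 0)],       # closed
    [(0, 0), (4, 0), (4, 3), (0.3, 0.4)],   # digitizer stopped short of the start
]
problems = unclosed_rings(rings)
print(problems)
```

The sizes of the reported gaps give a crude indication of the positional accuracy of the digitizing.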
Testing Accuracy (cont.)
Compute accuracy from knowledge of the
errors introduced by different sources
e.g., 1 mm in source document
0.5 mm in map registration for digitizing
0.2 mm in digitizing
if sources combine independently, we can get
an estimate of overall accuracy...
$$(1^2 + 0.5^2 + 0.2^2)^{0.5} = 1.14\ \text{mm}$$
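
The same root-sum-of-squares calculation as a short sketch. The function name is hypothetical, and converting to ground units at 1:24,000 is an added illustration rather than part of the slide.

```python
import math

def combined_error(*components_mm):
    """Combine independent error components by root sum of squares."""
    return math.sqrt(sum(c * c for c in components_mm))

# Source document, map registration and digitizing errors, in map millimetres
total_mm = combined_error(1.0, 0.5, 0.2)
ground_m = total_mm * 24000 / 1000.0   # assuming a 1:24,000 source map
print(round(total_mm, 2), round(ground_m, 1))
```

Because the components are squared, the largest one (here the source document) dominates the total.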

Definitions
Database - an integrated set of data
(attributes) on a particular subject
Geographic (=spatial) database -
database containing geographic data of
a particular subject for a particular area
Database Management System (DBMS) -
software to create, maintain and
access databases
A GIS links attribute and
spatial data
Attribute Data
Flat File
Relations
Map Data
Point File
Line File
Area File
Topology
Theme
Advantages of Databases over Files
Avoids redundancy and duplication
Reduces data maintenance costs
Faster for large datasets
Applications are separated from the data
Applications persist over time
Support multiple concurrent applications
Better data sharing
Security and standards can be defined and
enforced

Disadvantages of Databases Compared with Files
Expense
Complexity
Performance - especially for complex data
types
Integration with other systems can be
difficult
Types of DBMS Model
Hierarchical
Network
Relational - RDBMS
Object-oriented - OODBMS
Object-relational - ORDBMS
Relational Databases rule now
Characteristics of DBMS (1)
Data model support for multiple data
types
e.g. MS Access: Text, Memo, Number,
Date/Time, Currency, AutoNumber,
Yes/No, OLE Object (MS Object linking and
embedding), Hyperlink, Lookup Wizard
Load data from files, databases and
other applications
Index for rapid retrieval
Characteristics of DBMS (2)
Query language - SQL
Security - controlled access to data
Multi-level groups (e.g. census, NGA)
Controlled update using a transaction
manager
Versioning
Backup and recovery
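
A controlled update means a group of changes either all succeed or are all undone. The sketch below shows the idea with Python's built-in sqlite3 module. The parcel table and owner names are invented, and SQLite stands in for whichever DBMS is actually in use.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE parcel (id INTEGER PRIMARY KEY, owner TEXT NOT NULL)")
conn.execute("INSERT INTO parcel VALUES (1, 'Smith')")
conn.commit()

try:
    # The connection as a context manager commits on success, rolls back on error
    with conn:
        conn.execute("UPDATE parcel SET owner = 'Jones' WHERE id = 1")
        conn.execute("INSERT INTO parcel VALUES (1, 'Duplicate')")  # violates the key
except sqlite3.IntegrityError:
    pass  # the whole transaction, including the UPDATE, has been rolled back

owner = conn.execute("SELECT owner FROM parcel WHERE id = 1").fetchone()[0]
print(owner)
```

The failed insert undoes the earlier update as well, so the table is left in its last consistent state.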

Characteristics of DBMS (3)
Applications
Forms builder
Report writer
Internet Application Server
CASE tools
Programmable API (application
programming interface)
Role of DBMS
System                          Task
Geographic Information System   Data load, editing, visualization, mapping, analysis
Database Management System      Storage, indexing, security, query
Data (the layer beneath both systems)
Relational DBMS (1)
Data stored as tuples (tup-el),
conceptualized as tables
Table - data about a class of objects
Two-dimensional list (array)
Rows = objects
Columns = object states (properties,
attributes)

Table
Row = object
Vector feature
Column = attribute
Relational DBMS (2)
Most popular type of DBMS
Over 95% of data in DBMS is in RDBMS
Commercial systems
IBM DB2
Informix
Microsoft Access
Microsoft SQL Server
Oracle
Sybase
SQL
Structured (Standard) Query Language
(pronounced SEQUEL)
Developed by IBM in the 1970s
Now de facto and de jure standard for
accessing relational databases
Three types of usage
Stand alone queries
High level programming
Embedded in other applications
Types of SQL Statements
Data Definition Language (DDL)
Create, alter and delete data structures
CREATE TABLE, CREATE INDEX
Data Manipulation Language (DML)
Retrieve and manipulate data
SELECT, UPDATE, DELETE, INSERT
Data Control Language (DCL)
Control security of data
GRANT, CREATE USER, DROP USER
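
The first two statement types can be tried with Python's built-in sqlite3 module, as sketched below. The table, column names and values are placeholders. SQLite has no user accounts, so DCL statements such as GRANT are not shown and would need a server DBMS.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: define the structures that will hold the data
conn.execute("CREATE TABLE city (name TEXT, population INTEGER)")
conn.execute("CREATE INDEX idx_city_name ON city (name)")

# DML: insert, change, delete and retrieve rows (placeholder values)
conn.executemany("INSERT INTO city VALUES (?, ?)",
                 [("Alpha", 30000), ("Beta", 42000)])
conn.execute("UPDATE city SET population = 31000 WHERE name = 'Alpha'")
conn.execute("DELETE FROM city WHERE name = 'Beta'")
rows = conn.execute("SELECT name, population FROM city").fetchall()
print(rows)
```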
Relational Join
Fundamental query operation
Occurs because
Data created/maintained by different users, but
integration needed for queries
Table joins use common keys (column
values)
Table (attribute) join concept has been
extended to geographic case
Join
Record ID   Address           #cars
1241        123 State St.     3
1242        1801 Main St.     1
1243        2106 Elm St.      2
1244        7262 Pine Drive   1

Record ID   Make      Year
1241        Ford      2003
1241        Subaru    2000
1241        Honda     1999

Joined result
Record ID   Address           Make
1241        123 State St.     Ford
1241        123 State St.     Subaru
1241        123 State St.     Honda
1242        1801 Main St.     Kia
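
The join can be reproduced with Python's built-in sqlite3 module. In this sketch the table and column names (household, vehicle, record_id) are assumptions. Only the three vehicle rows listed for record 1241 are loaded, so the result covers that household alone.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE household (record_id INTEGER, address TEXT, cars INTEGER)")
conn.execute("CREATE TABLE vehicle (record_id INTEGER, make TEXT, year INTEGER)")
conn.executemany("INSERT INTO household VALUES (?, ?, ?)", [
    (1241, "123 State St.", 3),
    (1242, "1801 Main St.", 1),
    (1243, "2106 Elm St.", 2),
    (1244, "7262 Pine Drive", 1),
])
conn.executemany("INSERT INTO vehicle VALUES (?, ?, ?)", [
    (1241, "Ford", 2003),
    (1241, "Subaru", 2000),
    (1241, "Honda", 1999),
])

# Join the two tables on the common key column
rows = conn.execute("""
    SELECT h.record_id, h.address, v.make
    FROM household AS h
    JOIN vehicle AS v ON v.record_id = h.record_id
    ORDER BY v.year DESC
""").fetchall()
for row in rows:
    print(row)
```

Each household row is repeated once per matching vehicle row, which is why record 1241 appears three times in the result.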
Spatial indexing
Many maps tiled
B-tree (Balanced)
Grid indexing
Quad tree: Points/regions
R-tree (Based on MBR)
New global/spatial grids: QTM
Go2 Grids
38:53:22.08N 077:02:06.86W
US.DC.WAS.54.18.28.83.11
US.CA.SBA.UCSB.UCEN
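
Of the schemes listed, grid indexing is the simplest to sketch: features are bucketed by cell, and a window query only inspects the cells the window touches. The class below is a minimal illustration for points, with invented names and coordinates. It is not how any particular GIS implements its index.

```python
from collections import defaultdict

class GridIndex:
    """Fixed-cell grid index for point features."""

    def __init__(self, cell_size):
        self.cell_size = cell_size
        self.cells = defaultdict(list)

    def _cell(self, x, y):
        return (int(x // self.cell_size), int(y // self.cell_size))

    def insert(self, key, x, y):
        self.cells[self._cell(x, y)].append((key, x, y))

    def query(self, xmin, ymin, xmax, ymax):
        """Return keys of points inside the window, visiting only overlapping cells."""
        cx0, cy0 = self._cell(xmin, ymin)
        cx1, cy1 = self._cell(xmax, ymax)
        hits = []
        for cx in range(cx0, cx1 + 1):
            for cy in range(cy0, cy1 + 1):
                for key, x, y in self.cells.get((cx, cy), ()):
                    if xmin <= x <= xmax and ymin <= y <= ymax:
                        hits.append(key)
        return hits

index = GridIndex(cell_size=10)
index.insert("a", 5, 5)
index.insert("b", 15, 5)
index.insert("c", 95, 95)
found = index.query(0, 0, 20, 10)
print(found)
```

Quadtrees and R-trees follow the same principle but adapt the partitioning to the density and extent of the data.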
Spatial Search:
Gateway to Spatial Analysis
Overlay is a spatial retrieval operation
that is equivalent to an attribute join.
Buffering is a spatial retrieval around
points, lines, or areas based on
distance.
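
A buffer around a point reduces to a distance test. The sketch below selects the features that fall within a given distance of a centre point. The feature names and coordinates are made up, and planar (projected) coordinates are assumed.

```python
import math

def within_buffer(features, center, distance):
    """Names of point features lying within `distance` of `center`."""
    cx, cy = center
    return [name for name, (x, y) in features.items()
            if math.hypot(x - cx, y - cy) <= distance]

features = {"well": (3, 4), "school": (10, 0), "park": (0, 6)}
selected = within_buffer(features, center=(0, 0), distance=5)
print(selected)
```

Buffers around lines and areas need distance-to-segment calculations, which are usually left to a geometry library.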