
Susan T. Williams
GEOG 484
Lesson 4: Design/Build GIS (Week 1)
Specifications and Justifications
Table: VoterReg
Field Name   Data Type      Length   Allow NULL?   Default Values?   Key   Domain
Voter_ID     Long integer   n/a      no            none              *PK   none
PIN_Map      Text           6        no            none              no    none
PIN_Block    Text           6        no            none              no    none
PIN_Lot      Text           6        no            0 (numeric)       no    none
Ethnicity    Text           1        no            Not Specified     *FK   Coded: Ethnicity
Party        Text           1        no            Not Specified     *FK   Coded: Party

*While these could potentially serve as Primary or Foreign Keys that link to corresponding lookup
tables, I opted to create Coded Domains instead, thereby negating any need for Keys.
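
As a rough illustration only, the specification in the VoterReg table can be written down as data and used to check a record before it is entered. This is a minimal Python sketch, not ArcMap functionality: the names FieldSpec, VOTER_REG and apply_spec are hypothetical, and it assumes the "Not Specified" default is stored as its domain code N.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FieldSpec:
    name: str
    data_type: str             # "long" or "text"
    length: Optional[int]      # None where a length does not apply
    default: Optional[object]  # None means the field has no default


# Values transcribed from the VoterReg table; no field allows NULL.
# Assumption: the "Not Specified" default is stored as the code "N".
VOTER_REG = [
    FieldSpec("Voter_ID", "long", None, None),
    FieldSpec("PIN_Map", "text", 6, None),
    FieldSpec("PIN_Block", "text", 6, None),
    FieldSpec("PIN_Lot", "text", 6, "0"),
    FieldSpec("Ethnicity", "text", 1, "N"),
    FieldSpec("Party", "text", 1, "N"),
]


def apply_spec(record: dict) -> dict:
    """Fill in defaults, then reject NULLs, wrong types and over-long text."""
    row = {}
    for spec in VOTER_REG:
        value = record.get(spec.name, spec.default)
        if value is None:
            raise ValueError(f"{spec.name} may not be NULL")
        if spec.data_type == "long":
            if isinstance(value, bool) or not isinstance(value, int):
                raise ValueError(f"{spec.name} must be an integer")
        elif not isinstance(value, str) or len(value) > spec.length:
            raise ValueError(
                f"{spec.name} must be text of at most {spec.length} characters"
            )
        row[spec.name] = value
    return row
```

For example, apply_spec({"Voter_ID": 1001, "PIN_Map": "14", "PIN_Block": "B2"}) fills in the three defaults, while a record with no Voter_ID raises a ValueError. The sample values are made up.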

Coded Domain: Ethnicity

Code   Description
I      American Indian
A      Asian
B      Black
H      Hispanic
W      White
O      Other
N      Not Specified

Coded Domain: Party

Code   Description
E      Extremist
F      Federalist
M      Mugwumps
P      Purple
W      Whigs
O      Other
N      Not Specified
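
To show how a coded domain behaves, the two code lists can be sketched as plain Python dictionaries. This is only an analogy for the geodatabase domains; the describe function is a hypothetical helper.

```python
# The two coded domains as code -> description mappings.
ETHNICITY = {
    "I": "American Indian",
    "A": "Asian",
    "B": "Black",
    "H": "Hispanic",
    "W": "White",
    "O": "Other",
    "N": "Not Specified",
}

PARTY = {
    "E": "Extremist",
    "F": "Federalist",
    "M": "Mugwumps",
    "P": "Purple",
    "W": "Whigs",
    "O": "Other",
    "N": "Not Specified",
}


def describe(domain: dict, code: str) -> str:
    """Return the description for a code, rejecting codes outside the domain."""
    try:
        return domain[code]
    except KeyError:
        raise ValueError(f"{code!r} is not a valid code for this domain") from None
```

The same stored letter means different things in each domain: describe(ETHNICITY, "W") gives "White", while describe(PARTY, "W") gives "Whigs". A code outside the list, such as "X", is rejected.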

i. Voter ID

ii. I opted to use Long Integer for the Voter_ID field type. Short integer was certainly a possible
choice, but the field would then have been limited to integer values between -32,768 and 32,767, and I
wanted to allow more room for expansion than that in case the voting district grows or larger Voter_ID
numbers are assigned. This field could also have been a Text field, since the numbers are not going to
be used in calculations, but that would ultimately require more storage space, so I felt that Long
Integer was the best fit for this situation.

iii. Field length specification is not applicable in this instance.

iv. I chose not to allow Null values. Since the Voter_ID is the main field that identifies each unique
record, it would be counterproductive to allow null values in such a field.

v. A default value would be useless here as well, since no two IDs will be the same, so I did not set one.

vi. The Voter_ID field could potentially serve as the Primary Key, seeing as there should be just one
unique value in this field for each individual voter (unlike the tax parcel ID, where multiple voters
might reside on one parcel). However, a Primary Key is unnecessary in this instance because I created
Coded Domains instead, and thus there are no other tables with which to link.
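
The integer limits behind the Short versus Long choice can be worked out directly. This is a small sketch, assuming short integers are 16-bit signed and long integers are 32-bit signed; signed_range and fits are hypothetical helpers and 40000 is a made-up ID.

```python
def signed_range(bits: int) -> tuple:
    """Smallest and largest values of a signed integer with this many bits."""
    return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1


SHORT = signed_range(16)  # (-32768, 32767)
LONG = signed_range(32)   # (-2147483648, 2147483647)


def fits(value: int, bounds: tuple) -> bool:
    """True if the value lies inside the inclusive bounds."""
    low, high = bounds
    return low <= value <= high


# A hypothetical Voter_ID of 40000 overflows a short integer but not a long one.
print(fits(40000, SHORT), fits(40000, LONG))  # False True
```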

i. PIN (tax parcel ID), now PIN_Map, PIN_Block and PIN_Lot

ii. It seemed to me that the three portions of the PIN value (Map, Block, and Lot) made it a multi-part
field that would complicate querying. Therefore I replaced the single PIN field with three separate
fields: PIN_Map, PIN_Block, and PIN_Lot. I selected Text type for all three fields to accommodate the
combination of letters and numbers (although there is no need for the hyphen character now that the
value has been split into three separate fields).

iii. As with the Voter_ID field, I wanted to give the database some room for expansion in case future
tax parcel ID numbers are lengthier or formatted differently, so I set the length of each of these three
fields to 6 (which seems to be approximately double the needed length of each current field) rather than
accepting the default of 50.

iv. My choice not to allow Null values for any of these three fields is based on the fact that one of the
goals of the Whig party is to find areas of the city with the lowest percentages of registered voters.
Since the tax parcel ID helps identify the part of the city in which each voter is located, these fields
are necessary for spatial analysis.

v. I chose not to set Default values for two of these three fields, as very few of these values will be
duplicated except in the case of the Lot number. My reason for setting a default in the PIN_Lot field
was mainly to reduce data entry error and lessen the effort required of the summer intern. The
predominant Lot value (in our given sample) is a numeric zero, which could very easily be confused with
the letter O. We cannot avoid this potential mistake by setting the field type to hold only numeric
values, because some of the Lot IDs contain alpha characters, so our best chance of reducing effort and
error here is to set a numeric zero as the default.
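
The split of the single PIN into three fields can be sketched as follows. This assumes the original value is written as Map-Block-Lot with hyphens as separators; split_pin is a hypothetical helper, and the sample value "14-B2-0" is made up rather than taken from the assignment data.

```python
MAX_LENGTH = 6  # length chosen for PIN_Map, PIN_Block and PIN_Lot


def split_pin(pin: str) -> dict:
    """Split a hyphenated Map-Block-Lot string into the three PIN fields."""
    parts = [part.strip() for part in pin.split("-")]
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"expected Map-Block-Lot, got {pin!r}")
    if any(len(part) > MAX_LENGTH for part in parts):
        raise ValueError(f"each part may hold at most {MAX_LENGTH} characters")
    return dict(zip(("PIN_Map", "PIN_Block", "PIN_Lot"), parts))


print(split_pin("14-B2-0"))
# {'PIN_Map': '14', 'PIN_Block': 'B2', 'PIN_Lot': '0'}
```

The parts stay as text, so a Lot made of letters is handled the same way as the common numeric zero.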

i. Ethnicity

ii. Since there are a small number of options for this field, it is a good candidate for a Coded Domain,
which will reduce database storage and help reduce the chance of data entry error. I selected Text for
the field type so that the summer intern (or whoever) can simply type the first letter of each ethnicity
and automatically select the appropriate response. (Two of our choices, however, begin with the same
first letter: American Indian and Asian. I found out the hard way that, when the Table Options are set
to display Descriptions, ArcMap matches keystrokes against the Description rather than the single-letter
code I specified. If I turn this option off, the keystroke coding works perfectly, but the whole-word
descriptions are no longer displayed in the field and all we see are single-letter codes. This seems
like it could lead to more effort and a greater chance of error, particularly where we have a W showing
in two columns (one for White and one for Whigs), so I opted to turn the Description display back on and
use the dropdown menu rather than keystrokes. I wish ESRI would program the software to key on the
actual code the user specifies without having to turn off the Descriptions to do so, but that's just my
humble opinion.)

iii. I used a field length of 1, which is all that is currently needed for our existing list of options
and will likely be enough to handle any future expansion of the database as well. If other categories
need to be added at some point, perhaps a "Two or more" group, for example, there should still be plenty
of choices among the remaining alpha characters.

iv. While I did not allow Null values for this field (seeing as all possible options are already coded
for), I did add a "Not Specified" option to the Coded Domain for those voters who wish to keep that
information private, or for instances where the voter registration information is not legible (as did
occur in our given sample).

v. I decided the "Not Specified" option would be a suitable default value also. While the most common
ethnicity listed in our particular sample is White, that may not hold true over the rest of the city, so
I did not feel that using White as the default would be very useful in the grand scheme of things.

vi. The Ethnicity field could serve as a foreign key for linking with a corresponding lookup table if I
opted to use lookup tables rather than Coded Domains.
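
The keystroke clash described for the Ethnicity field comes down to which list is matched on the first letter. The sketch below only illustrates that point and makes no claim about how ArcMap is implemented; first_letter_collisions is a hypothetical helper.

```python
from collections import defaultdict


def first_letter_collisions(labels: list) -> dict:
    """Group labels by first letter and keep the letters shared by several."""
    groups = defaultdict(list)
    for label in labels:
        groups[label[0].upper()].append(label)
    return {letter: names for letter, names in groups.items() if len(names) > 1}


descriptions = ["American Indian", "Asian", "Black", "Hispanic",
                "White", "Other", "Not Specified"]
codes = ["I", "A", "B", "H", "W", "O", "N"]

print(first_letter_collisions(descriptions))  # {'A': ['American Indian', 'Asian']}
print(first_letter_collisions(codes))         # {}
```

Matching on the descriptions leaves the letter A ambiguous, whereas matching on the codes would not.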

i. Party

ii. Like the Ethnicity field, this field consists of a few limited options and is therefore well suited
to a Coded Domain. I selected Text for the field type so that the summer intern (or whoever) can simply
type the first letter of each party and automatically select the appropriate response, or use the
dropdown menu.

iii. I used a field length of 1, which is all that is currently needed for our existing list of options
and will likely be enough to handle any future expansion of the database as well. If other categories
need to be added at some point, there should still be plenty of choices among the remaining alpha
characters.

iv. While I did not allow Null values for this field (seeing as all possible options are already coded
for), I did add a "Not Specified" option to the Coded Domain for those voters who wish to keep that
information private, or for instances where the voter registration information is not legible (as did
occur in our given sample).

v. I decided the "Not Specified" option would be a suitable default value also. While the most common
party affiliation listed in our particular sample is Mugwumps, that may not hold true over the rest of
the city, so I did not feel that using any particular party as the default would be very useful in the
grand scheme of things.

vi. The Party field could serve as a foreign key for linking with a corresponding lookup table if I opted
to use lookup tables rather than Coded Domains.
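
For comparison, the lookup-table alternative mentioned for the Party field might look like the sketch below. It uses Python's built-in sqlite3 module purely as a stand-in for a geodatabase, so the SQL and the sample voter IDs are assumptions, not part of the assignment.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only on request
conn.executescript("""
CREATE TABLE Party (
    Code        TEXT PRIMARY KEY,
    Description TEXT NOT NULL
);
CREATE TABLE VoterReg (
    Voter_ID INTEGER PRIMARY KEY,
    Party    TEXT NOT NULL DEFAULT 'N' REFERENCES Party(Code)
);
""")
conn.executemany(
    "INSERT INTO Party VALUES (?, ?)",
    [("E", "Extremist"), ("F", "Federalist"), ("M", "Mugwumps"), ("P", "Purple"),
     ("W", "Whigs"), ("O", "Other"), ("N", "Not Specified")],
)

# Two made-up voters: one with a party code, one relying on the default.
conn.execute("INSERT INTO VoterReg (Voter_ID, Party) VALUES (1001, 'M')")
conn.execute("INSERT INTO VoterReg (Voter_ID) VALUES (1002)")

rows = conn.execute("""
    SELECT v.Voter_ID, p.Description
    FROM VoterReg AS v
    JOIN Party AS p ON p.Code = v.Party
    ORDER BY v.Voter_ID
""").fetchall()
print(rows)  # [(1001, 'Mugwumps'), (1002, 'Not Specified')]
```

Here Party.Code is the primary key and VoterReg.Party the foreign key, so the descriptions are reached through a join, and a code that is not in the lookup table is refused with sqlite3.IntegrityError. A Coded Domain gives the same restriction without the second table.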

With regard to good database design:

i. Minimize use of storage space

It's rather like walking a tightrope: on the one hand, you want to save as much storage space as
possible; on the other hand, you want to allow sufficient room for the database to grow without
necessitating a complete overhaul. By keeping tables to a minimum and using Coded Domains, I was able to
avoid redundant data. Also, using Long Integer rather than Text for the Voter_ID field type saves some
storage space, as did limiting the length of the Text fields rather than accepting the default of 50.

ii. Lessen the effort of the summer intern to enter the data

Because the Ethnicity and Party fields use Coded Domains, the summer intern merely has to type one
letter or choose from a dropdown menu for each. Setting "Not Specified" as the default for these fields
should also lessen some effort, and the numeric zero default of the PIN_Lot field should save a great
deal more.

iii. Decrease the chance of data entry mistakes

Again, the Coded Domains reduce data entry mistakes, as the options are already coded for and
standardized. Offering a numeric zero default for the PIN_Lot field should greatly reduce the errors that
come from mistaking the number zero for the letter O. Also, by not offering Default values on every
single field, we are not encouraging laziness but are rather forcing the summer intern to pay attention
to what he or she is entering. I would also suggest turning off the display of the ObjectID field to
eliminate the chance of the intern entering information into the incorrect field.

iv. Allow for easy and efficient querying within ArcMap

By splitting the PIN (tax parcel ID) into three separate fields, the complications of multi-part fields
can be avoided and we gain the ability to query on just a portion of the ID (if we wanted to run a search
for everyone on a specific Block, for example). The standardized choices that result from using Coded
Domains will also enhance querying by avoiding differences in spelling or capitalization, for example.
Disallowing null values should also allow for more efficient queries.
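
The querying benefit of the split PIN can be illustrated with a few made-up records. This is a plain Python sketch rather than an ArcMap query; the combined PIN key is included only for comparison, and all of the values are hypothetical.

```python
# Three hypothetical voters; PIN holds the original hyphenated value.
voters = [
    {"Voter_ID": 1, "PIN": "14-B2-0", "PIN_Map": "14", "PIN_Block": "B2", "PIN_Lot": "0"},
    {"Voter_ID": 2, "PIN": "B2-07-0", "PIN_Map": "B2", "PIN_Block": "07", "PIN_Lot": "0"},
    {"Voter_ID": 3, "PIN": "14-B2-A", "PIN_Map": "14", "PIN_Block": "B2", "PIN_Lot": "A"},
]

# Searching the combined value for "B2" also matches a Map called B2.
by_substring = [v["Voter_ID"] for v in voters if "B2" in v["PIN"]]

# Searching the dedicated field returns only voters on Block B2.
by_block = [v["Voter_ID"] for v in voters if v["PIN_Block"] == "B2"]

print(by_substring)  # [1, 2, 3]
print(by_block)      # [1, 3]
```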

GENERAL REFERENCES
Sloan, J. (1999-2012). GIS Database Development, Lesson 4. The Pennsylvania State University World
Campus Certificate Program in GIS. Accessed February 2012 from
https://www.e-education.psu.edu/geog484/l4.html
ArcMap v10.0 GIS software by Environmental Systems Research Institute (ESRI).

This document is published in fulfillment of an assignment by a student enrolled in an educational
offering of the Pennsylvania State University. The student, named above, retains all rights to the
document and responsibility for its accuracy and originality.
