To each customer there corresponds a repeating group of transactions. The automated evaluation of any query relating to customers' transactions would therefore broadly involve two stages:

1. Unpacking one or more customers' groups of transactions, allowing the individual transactions in a group to be examined, and
2. Deriving a query result based on the results of the first stage.

For example, in order to find out the monetary sum of all transactions that occurred in October 2003 for all customers, the system would have to know that it must first unpack the Transactions group of each customer, then sum the Amounts of all transactions thus obtained where the Date of the transaction falls in October 2003.

One of Codd's important insights was that this structural complexity could always be removed completely, leading to much greater power and flexibility in the way queries could be formulated (by users and applications) and evaluated (by the DBMS). The normalized equivalent of the structure above would look like this:

[Table: one row per transaction, with columns Customer, Tr. ID, Date, and Amount. Customer column: Jones, Jones, Wilkins, Stevens, Stevens, Stevens. Tr. ID column: 12890, 12904, 12898, 12907. The remaining values are not preserved.]

Now each row represents an individual credit card transaction, and the DBMS can obtain the answer of interest simply by finding all rows with a Date falling in October and summing their Amounts. All of the values in the data structure are on an equal footing: they are all exposed to the DBMS directly, and can directly participate in queries, whereas in the previous situation some values were embedded in lower-level structures that had to be handled specially. Accordingly, the normalized design lends itself to general-purpose query processing, whereas the unnormalized design does not.

The objectives of normalization beyond 1NF were stated as follows by Codd:

1. To free the collection of relations from undesirable insertion, update and deletion dependencies;
2. To reduce the need for restructuring the collection of relations as new types of data are introduced, and thus increase the life span of application programs;
3. To make the relational model more informative to users;
4. To make the collection of relations neutral to the query statistics, where these statistics are liable to change as time goes by.

(E.F. Codd, "Further Normalization of the Data Base Relational Model" [11])

The sections below give details of each of these objectives.
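Returning to the October 2003 query discussed above, the contrast between the two structures can be sketched in Python. This is only an illustration: the dates and amounts are invented sample data, and the names `nested`, `flat`, `october_total_nested`, and `october_total_flat` are hypothetical.

```python
from datetime import date

# Hypothetical sample data; every date and amount here is invented.
# Unnormalized form: each customer carries a repeating group of transactions.
nested = {
    "Jones": [(date(2003, 10, 14), -87), (date(2003, 10, 15), -50)],
    "Wilkins": [(date(2003, 10, 14), -21)],
    "Stevens": [(date(2003, 10, 15), -18), (date(2003, 11, 20), -70)],
}

# Normalized form: one row per transaction, every value directly visible.
flat = [
    {"customer": "Jones", "date": date(2003, 10, 14), "amount": -87},
    {"customer": "Jones", "date": date(2003, 10, 15), "amount": -50},
    {"customer": "Wilkins", "date": date(2003, 10, 14), "amount": -21},
    {"customer": "Stevens", "date": date(2003, 10, 15), "amount": -18},
    {"customer": "Stevens", "date": date(2003, 11, 20), "amount": -70},
]


def in_october_2003(d):
    return d.year == 2003 and d.month == 10


def october_total_nested(customers):
    """Two stages: unpack each customer's group, then derive the result."""
    total = 0
    for transactions in customers.values():   # stage 1: unpack the group
        for d, amount in transactions:
            if in_october_2003(d):             # stage 2: derive the result
                total += amount
    return total


def october_total_flat(rows):
    """One stage: filter rows by Date and sum their Amounts."""
    return sum(r["amount"] for r in rows if in_october_2003(r["date"]))
```

Both functions return the same total for this sample, but only the nested version has to know how the data is packed before it can evaluate the query.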
Figure: An update anomaly. Employee 519 is shown as having different addresses on different records.

Figure: An insertion anomaly. Until the new faculty member, Dr. Newsome, is assigned to teach at least one course, his details cannot be recorded.

Figure: A deletion anomaly. All information about Dr. Giddens is lost when he temporarily ceases to be assigned to any courses.

When an attempt is made to modify (update, insert into, or delete from) a table, undesired side-effects may follow. Not all tables can suffer from these side-effects; rather, the side-effects can only arise in tables that have not been sufficiently normalized. An insufficiently normalized table might have one or more of the following characteristics:
The same information can be expressed on multiple rows; therefore updates to the table may result in logical inconsistencies. For example, each record in an "Employees' Skills" table might contain an Employee ID, Employee Address, and Skill; thus a change of address for a particular employee will potentially need to be applied to multiple records (one for each of his skills). If the update is not carried through successfully (if, that is, the employee's address is updated on some records but not others) then the table is left in an inconsistent state. Specifically, the table provides conflicting answers to the question of what this particular employee's address is. This phenomenon is known as an update anomaly.

There are circumstances in which certain facts cannot be recorded at all. For example, each record in a "Faculty and Their Courses" table might contain a Faculty ID, Faculty Name, Faculty Hire Date, and Course Code; thus we can record the details of any faculty member who teaches at least one course, but we cannot record the details of a newly-hired faculty member who has not yet been assigned to teach any courses. This phenomenon is known as an insertion anomaly.

There are circumstances in which the deletion of data representing certain facts necessitates the deletion of data representing completely different facts. The "Faculty and Their Courses" table described in the previous example suffers from this type of anomaly, for if a faculty member temporarily ceases to be assigned to any courses, we must delete the last of the records on which that faculty member appears, effectively also deleting the faculty member. This phenomenon is known as a deletion anomaly.
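The update anomaly described above can be reproduced with a few lines of Python. The sketch below is illustrative only: the rows, addresses, and the helper names `addresses_on_record` and `update_address` are hypothetical, and the `max_rows` parameter exists purely to simulate an update that is not carried through to every record.

```python
# Hypothetical rows for an "Employees' Skills" table; all values are invented.
employees_skills = [
    {"employee_id": 426, "address": "87 Sycamore Grove", "skill": "Typing"},
    {"employee_id": 426, "address": "87 Sycamore Grove", "skill": "Shorthand"},
    {"employee_id": 519, "address": "94 Chestnut Street", "skill": "Carpentry"},
    {"employee_id": 519, "address": "94 Chestnut Street", "skill": "Typing"},
]


def addresses_on_record(rows, employee_id):
    """Return every distinct address the table gives for one employee."""
    return {r["address"] for r in rows if r["employee_id"] == employee_id}


def update_address(rows, employee_id, new_address, max_rows=None):
    """Rewrite the address on matching rows, stopping after max_rows rows.

    max_rows simulates an update that fails to reach every record.
    """
    changed = 0
    for r in rows:
        if r["employee_id"] == employee_id:
            if max_rows is not None and changed >= max_rows:
                break
            r["address"] = new_address
            changed += 1
    return changed


# Update anomaly: only one of employee 519's two rows is changed.
update_address(employees_skills, 519, "96 Walnut Avenue", max_rows=1)
conflicting = addresses_on_record(employees_skills, 519)

# A more normalized layout stores each address once, so there is a single
# place to change and no way for two rows to disagree.
employee_address = {426: "87 Sycamore Grove", 519: "94 Chestnut Street"}
employee_skill = [(426, "Typing"), (426, "Shorthand"),
                  (519, "Carpentry"), (519, "Typing")]
employee_address[519] = "96 Walnut Avenue"
```

After the partial update, `conflicting` holds two different addresses for the same employee, which is exactly the inconsistent state the text describes.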
Functional dependency: In a given table, an attribute Y is said to have a functional dependency on a set of attributes X (written X → Y) if and only if each X value is associated with precisely one Y value. For example, in an "Employee" table that includes the attributes "Employee ID" and "Employee Date of Birth", the functional dependency {Employee ID} → {Employee Date of Birth} would hold.

Trivial functional dependency: A trivial functional dependency is a functional dependency of an attribute on a superset of itself. {Employee ID, Employee Address} → {Employee Address} is trivial, as is {Employee Address} → {Employee Address}.

Full functional dependency: An attribute is fully functionally dependent on a set of attributes X if it is functionally dependent on X, and not functionally dependent on any proper subset of X. {Employee Address} has a functional dependency on {Employee ID, Skill}, but not a full functional dependency, because it is also dependent on {Employee ID}.

Transitive dependency: A transitive dependency is an indirect functional dependency, one in which X → Z only by virtue of X → Y and Y → Z.

Multivalued dependency: A multivalued dependency is a constraint according to which the presence of certain rows in a table implies the presence of certain other rows.

Join dependency: A table T is subject to a join dependency if T can always be recreated by joining multiple tables each having a subset of the attributes of T.

Superkey: A superkey is an attribute or set of attributes that uniquely identifies rows within a table; in other words, two distinct rows are always guaranteed to have distinct superkeys. {Employee ID, Employee Address, Skill} would be a superkey for the "Employees' Skills" table; {Employee ID, Skill} would also be a superkey.

Candidate key: A candidate key is a minimal superkey, that is, a superkey for which we can say that no proper subset of it is also a superkey. {Employee ID, Skill} would be a candidate key for the "Employees' Skills" table.

Non-prime attribute: A non-prime attribute is an attribute that does not occur in any candidate key. Employee Address would be a non-prime attribute in the "Employees' Skills" table.

Primary key: Most DBMSs require a table to be defined as having a single unique key, rather than a number of possible unique keys. A primary key is a key which the database designer has designated for this purpose.
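Several of the definitions above can be checked mechanically against a sample of rows. The following Python sketch assumes a table represented as a list of dictionaries with no duplicate rows; the function names and the sample data are hypothetical. Note that a dependency is a constraint on every possible state of a table, so a check against sample rows can refute a dependency but cannot prove that it holds in general.

```python
from itertools import combinations

# Hypothetical "Employees' Skills" rows; all values are invented.
rows = [
    {"employee_id": 426, "address": "87 Sycamore Grove", "skill": "Typing"},
    {"employee_id": 426, "address": "87 Sycamore Grove", "skill": "Shorthand"},
    {"employee_id": 519, "address": "94 Chestnut Street", "skill": "Typing"},
    {"employee_id": 519, "address": "94 Chestnut Street", "skill": "Carpentry"},
]


def holds(rows, lhs, rhs):
    """True if lhs -> rhs holds in these rows (each lhs value has one rhs value)."""
    seen = {}
    for r in rows:
        key = tuple(r[a] for a in lhs)
        value = tuple(r[a] for a in rhs)
        if seen.setdefault(key, value) != value:
            return False
    return True


def is_superkey(rows, attrs):
    """True if no two rows share the same values for attrs."""
    keys = [tuple(r[a] for a in attrs) for r in rows]
    return len(set(keys)) == len(keys)


def is_candidate_key(rows, attrs):
    """True if attrs is a superkey and no non-empty proper subset is one."""
    if not is_superkey(rows, attrs):
        return False
    return not any(
        is_superkey(rows, subset)
        for size in range(1, len(attrs))
        for subset in combinations(attrs, size)
    )
```

For the sample rows, `holds(rows, ["employee_id"], ["address"])` is true while `holds(rows, ["employee_id"], ["skill"])` is false, and `["employee_id", "skill"]` passes `is_candidate_key` whereas the three-attribute superkey does not, because it is not minimal.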
The higher the normal form applicable to a table, the less vulnerable it is to inconsistencies and anomalies. Each table has a "highest normal form" (HNF): by definition, a table always meets the requirements of its HNF and of all normal forms lower than its HNF; also by definition, a table fails to meet the requirements of any normal form higher than its HNF. The normal forms are applicable to individual tables; to say that an entire database is in normal form n is to say that all of its tables are in normal form n.

Newcomers to database design sometimes suppose that normalization proceeds in an iterative fashion, i.e. a 1NF design is first normalized to 2NF, then to 3NF, and so on. This is not an accurate description of how normalization typically works. A sensibly designed table is likely to be in 3NF on the first attempt; furthermore, if it is 3NF, it is overwhelmingly likely to have an HNF of 5NF. Achieving the "higher" normal forms (above 3NF) does not usually require an extra expenditure of effort on the part of the designer, because 3NF tables usually need no modification to meet the requirements of these higher normal forms.

The main normal forms are summarized below.

Normal form | Defined by | Brief definition
First normal form (1NF) | Two versions: E.F. Codd (1970), C.J. Date (2003) [12] | Table faithfully represents a relation and has no repeating groups
Second normal form (2NF) | E.F. Codd (1971) [13] | No non-prime attribute in the table is functionally dependent on a part (proper subset) of a candidate key
Third normal form (3NF) | E.F. Codd (1971) [14]; see also Carlo Zaniolo's equivalent but differently-expressed definition (1982) [15] | Every non-prime attribute is non-transitively dependent on every key of the table
Boyce-Codd normal form (BCNF) | Raymond F. Boyce and E.F. Codd (1974) [16] | Every non-trivial functional dependency in the table is a dependency on a superkey
Fourth normal form (4NF) | Ronald Fagin (1977) [17] | Every non-trivial multivalued dependency in the table is a dependency on a superkey
Fifth normal form (5NF) | Ronald Fagin (1979) [18] | Every non-trivial join dependency in the table is implied by the superkeys of the table
Domain/key normal form (DKNF) | Ronald Fagin (1981) [19] | Every constraint on the table is a logical consequence of the table's domain constraints and key constraints
Sixth normal form (6NF) | C.J. Date, Hugh Darwen, and Nikos Lorentzos (2002) [5] | Table features no non-trivial join dependencies at all (with reference to generalized join operator)
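As one concrete reading of the 2NF row in the summary above, the Python sketch below searches a sample of rows for non-prime attributes that depend on a proper subset of a candidate key. It is an assumption-laden illustration: the candidate key and the non-prime attributes are supplied by the caller rather than derived, the data is invented, and the names `determines` and `partial_dependencies` are hypothetical. As with any check on sample rows, it can expose a violation but cannot certify that a design is in 2NF.

```python
from itertools import combinations


def determines(rows, lhs, attr):
    """True if each combination of lhs values maps to one attr value in rows."""
    seen = {}
    for r in rows:
        key = tuple(r[a] for a in lhs)
        if seen.setdefault(key, r[attr]) != r[attr]:
            return False
    return True


def partial_dependencies(rows, candidate_key, non_prime):
    """List (key_subset, attribute) pairs that break the 2NF condition."""
    found = []
    for size in range(1, len(candidate_key)):
        for subset in combinations(candidate_key, size):
            for attr in non_prime:
                if determines(rows, subset, attr):
                    found.append((subset, attr))
    return found


# Hypothetical "Employees' Skills" rows; all values are invented.
employees_skills = [
    {"employee_id": 426, "address": "87 Sycamore Grove", "skill": "Typing"},
    {"employee_id": 426, "address": "87 Sycamore Grove", "skill": "Shorthand"},
    {"employee_id": 519, "address": "94 Chestnut Street", "skill": "Typing"},
    {"employee_id": 519, "address": "94 Chestnut Street", "skill": "Carpentry"},
]

# A separate, hypothetical "Employees" table keyed on employee_id alone.
employees = [
    {"employee_id": 426, "address": "87 Sycamore Grove"},
    {"employee_id": 519, "address": "94 Chestnut Street"},
]

violations = partial_dependencies(
    employees_skills, ("employee_id", "skill"), ["address"]
)
```

Here `violations` reports that `address` depends on `employee_id` alone, a part of the candidate key, whereas the single-attribute key of `employees` has no proper non-empty subset and so yields no violations.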
Denormalization
Databases intended for online transaction processing (OLTP) are typically more normalized than databases intended for online analytical processing (OLAP). OLTP applications are characterized by a high volume of small transactions such as updating a sales record at a supermarket checkout counter. The expectation is that each transaction will leave the database in a consistent state. By contrast, databases intended for OLAP operations are primarily "read mostly" databases. OLAP applications tend to extract historical data that has accumulated over a long period of time. For such databases, redundant or "denormalized" data may facilitate business intelligence applications. Specifically, dimensional tables in a star schema often contain denormalized data. The denormalized or redundant data must be carefully controlled during extract, transform, load (ETL) processing, and users should not be permitted to see the data until it is in a consistent state. The normalized alternative to the star schema is the snowflake schema.

In many cases, the need for denormalization has waned as computers and RDBMS software have become more powerful, but since data volumes have generally increased along with hardware and software performance, OLAP databases often still use denormalized schemas.

Denormalization is also used to improve performance on smaller computers, as in computerized cash registers and mobile devices, since these may use the data for look-up only (e.g. price lookups). Denormalization may also be used when no RDBMS exists for a platform (such as Palm), or when no changes are to be made to the data and a swift response is crucial.
Assume a person has several favorite colors. Obviously, favorite colors consist of a set of colors modeled by the given table.

To transform a 1NF table into an NF² table, a "nest" operator is required, which extends the relational algebra of the higher normal forms. Applying the "nest" operator to the 1NF table yields the following NF² table:

Non-First Normal Form
Person | Favorite Colors
Bob | blue, red
Jane | green, yellow, red

To transform this NF² table back into a 1NF table, an "unnest" operator is required, which extends the relational algebra of the higher normal forms (one would allow "colors" to be its own table).

Although "unnest" is the mathematical inverse of "nest", the operator "nest" is not always the mathematical inverse of "unnest". Another constraint required is for the operators to be bijective, which is covered by the Partitioned Normal Form (PNF).
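The relationship between "nest" and "unnest" can be sketched in Python using the favorite-colors example. This is a minimal illustration under stated assumptions, not a definition of the operators in any formal algebra: a 1NF table is modeled as a set of (person, color) pairs, an NF² table as a set of (person, frozenset of colors) pairs, and the function names `nest` and `unnest` are simply borrowed from the text.

```python
def nest(flat):
    """Group (person, color) pairs into one (person, set-of-colors) row each."""
    groups = {}
    for person, color in flat:
        groups.setdefault(person, set()).add(color)
    return {(person, frozenset(colors)) for person, colors in groups.items()}


def unnest(nested):
    """Expand each (person, set-of-colors) row into (person, color) pairs."""
    return {(person, color) for person, colors in nested for color in colors}


# The 1NF table from the example: one row per person and favorite color.
flat = {
    ("Bob", "blue"), ("Bob", "red"),
    ("Jane", "green"), ("Jane", "yellow"), ("Jane", "red"),
}

# unnest undoes nest: the original 1NF rows come back.
round_trip = unnest(nest(flat))

# nest does not always undo unnest: this hypothetical NF² table lists Bob
# twice, so nesting its unnested rows merges the two rows into one.
split = {("Bob", frozenset({"blue"})), ("Bob", frozenset({"red"}))}
merged = nest(unnest(split))
```

In this sketch `round_trip` equals `flat`, while `merged` differs from `split`; a nested table in which each person appears only once, as in the partitioned form mentioned above, does survive the round trip.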
Further reading

Litt's Tips: Normalization
Date, C. J. (1999), An Introduction to Database Systems (8th ed.). Addison-Wesley Longman. ISBN 0-321-19784-4.
Kent, W. (1983) A Simple Guide to Five Normal Forms in Relational Database Theory, Communications of the ACM, vol. 26, pp. 120-125
Date, C.J., & Darwen, H., & Pascal, F. Database Debunkings
H.-J. Schek, P. Pistor. Data Structures for an Integrated Data Base Management and Information Retrieval System
Notes and references

3. Codd, E.F. "Further Normalization of the Data Base Relational Model." (Presented at Courant Computer Science Symposia Series 6, "Data Base Systems," New York City, May 24th-25th, 1971.) IBM Research Report RJ909 (August 31st, 1971). Republished in Randall J. Rustin (ed.), Data Base Systems: Courant Computer Science Symposia Series 6. Prentice-Hall, 1972.
4. Codd, E. F. "Recent Investigations into Relational Data Base Systems." IBM Research Report RJ1385 (April 23rd, 1974). Republished in Proc. 1974 Congress (Stockholm, Sweden, 1974). New York, N.Y.: North-Holland (1974).
5. C.J. Date, Hugh Darwen, Nikos Lorentzos. Temporal Data and the Relational Model. Morgan Kaufmann (2002), p. 176
6. C.J. Date. An Introduction to Database Systems. Addison-Wesley (1999), p. 290
7. Chris Date, for example, writes: "I believe firmly that anything less than a fully normalized design is strongly contraindicated ... [Y]ou should denormalize only as a last resort. That is, you should back off from a fully normalized design only if all other strategies for improving performance have somehow failed to meet requirements." Date, C.J. Database in Depth: Relational Theory for Practitioners. O'Reilly (2005), p. 152.
8. Ralph Kimball, for example, writes: "The use of normalized modeling in the data warehouse presentation area defeats the whole purpose of data warehousing, namely, intuitive and high-performance retrieval of data." Kimball, Ralph. The Data Warehouse Toolkit, 2nd Ed. Wiley Computer Publishing (2002), p. 11.
9. "The adoption of a relational model of data ... permits the development of a universal data sub-language based on an applied predicate calculus. A first-order predicate calculus suffices if the collection of relations is in first normal form. Such a language would provide a yardstick of linguistic power for all other proposed data languages, and would itself be a strong candidate for embedding (with appropriate syntactic modification) in a variety of host languages (programming, command- or problem-oriented)." Codd, "A Relational Model of Data for Large Shared Data Banks", p. 381
10. Codd, E.F. Chapter 23, "Serious Flaws in SQL", in The Relational Model for Database Management: Version 2. Addison-Wesley (1990), pp. 371-389
11. Codd, E.F. "Further Normalization of the Data Base Relational Model", p. 34
12. Date, C. J. "What First Normal Form Really Means" in Date on Database: Writings 2000-2006 (Springer-Verlag, 2006), pp. 127-128.
13. Codd, E.F. "Further Normalization of the Data Base Relational Model." (Presented at Courant Computer Science Symposia Series 6, "Data Base Systems," New York City, May 24-25, 1971.) IBM Research Report RJ909 (August 31st, 1971). Republished in Randall J. Rustin (ed.), Data Base Systems: Courant Computer Science Symposia Series 6. Prentice-Hall, 1972.
14. Codd, E.F. "Further Normalization of the Data Base Relational Model." (Presented at Courant Computer Science Symposia Series 6, "Data Base Systems," New York City, May 24-25, 1971.) IBM Research Report RJ909 (August 31, 1971). Republished in Randall J. Rustin (ed.), Data Base Systems: Courant Computer Science Symposia Series 6. Prentice-Hall, 1972.
15. Zaniolo, Carlo. "A New Normal Form for the Design of Relational Database Schemata." ACM Transactions on Database Systems 7(3), September 1982.
16. Codd, E. F. "Recent Investigations into Relational Data Base Systems." IBM Research Report RJ1385 (April 23, 1974). Republished in Proc. 1974 Congress (Stockholm, Sweden, 1974). New York, N.Y.: North-Holland (1974).
17. Fagin, Ronald (September 1977). "Multivalued Dependencies and a New Normal Form for Relational Databases". ACM Transactions on Database Systems 2 (1): 267. doi:10.1145/320557.320571. http://www.almaden.ibm.com/cs/people/fagin/tods77.pdf.
18. Ronald Fagin. "Normal Forms and Relational Database Operators". ACM SIGMOD International Conference on Management of Data, May 31-June 1, 1979, Boston, Mass. Also IBM Research Report RJ2471, Feb. 1979.
19. Ronald Fagin (1981) A Normal Form for Relational Databases That Is Based on
Paper: "Non First Normal Form Relations" by G. Jaeschke, H.-J. Schek; IBM Heidelberg Scientific Center. A paper studying the normalization and denormalization operators nest and unnest, as briefly described at the end of this article.
See also

Aspect (computer science)
Business rule
Canonical form
Cross-cutting concern
Optimization (computer science)
Refactoring
External links

Database Normalization Basics by Mike Chapple (About.com)
Database Normalization Intro, Part 1
An Introduction to Database Normalization by Mike Hillyer
Normalization by ITS, University of Texas
A tutorial on the first 3 normal forms by Fred Coulson
DB Normalization Examples
Description of the database normalization basics by Microsoft
Database Normalization and Design Techniques by Barry Wise, recommended reading for the Harvard MIS
This page was last modified on 6 June 2011, at 09:18. Text is available under the Creative Commons Attribution-ShareAlike License; additional terms may apply. See Terms of Use for details.