M. Tamer Özsu
David R. Cheriton School of Computer Science, University of Waterloo
CS 348
Fall 2012
1 / 29
What is Data?
ANSI definition of data:
  1. A representation of facts, concepts, or instructions in a formalized manner suitable for communication, interpretation, or processing by humans or by automatic means.
  2. Any representation such as characters or analog quantities to which meaning is or might be assigned.
Generally, we perform operations on data or data items to supply some information about an entity.
Volatile vs. persistent data
  Our concern is primarily with persistent data
[Figure: programs (PROGRAM 2, PROGRAM 3), each with its own Data Management component, and separate data sets (DATA SET 1, DATA SET 2, DATA SET 3).]
[Figure: programs (PROGRAM 2, PROGRAM 3), each with its own Data Management component; label: Redundant Data.]
Database Approach
[Figure: PROGRAM 1 and PROGRAM 2 access an Integrated Database through the DBMS (Query Processor, Transaction Mgr).]
What is a Database?
Definition (Database)
A large and persistent collection of (more-or-less similar) pieces of information organized in a way that facilitates efficient retrieval and modification.
The structure of the database is determined by the abstract data model that is used.
Examples:
  a file cabinet
  a library system
  a personnel management system
Data Model
schema
⇒
File systems can at best specify data organization within one file
Alternatives for business data
  Hierarchical; network
  Relational
  Object-oriented (more recently object-relational)
Definition (Schema)
A schema is a description of the data interface to the database (i.e., how the data is organized).
Definition (Instance)
A database instance is a database (real data) that conforms to a given schema. A schema can have many instances.
Example Relational
Schema
  EMP(ENO, ENAME, TITLE)
  PROJ(PNO, PNAME, BUDGET)
  WORKS(ENO, PNO, RESP, DUR)

Instance

EMP
  ENO  ENAME      TITLE
  E1   J. Doe     Elect. Eng.
  E2   M. Smith   Syst. Anal.
  E3   A. Lee     Mech. Eng.
  E4   J. Miller  Programmer
  E5   B. Casey   Syst. Anal.
  E6   L. Chu     Elect. Eng.
  E7   R. Davis   Mech. Eng.
  E8   J. Jones   Syst. Anal.

PROJ
  PNO  PNAME              BUDGET
  P1   Instrumentation    150000
  P2   Database Develop.  135000
  P3   CAD/CAM            250000
  P4   Maintenance        310000

WORKS
  ENO  PNO  RESP        DUR
  E1   P1   Manager     12
  E2   P1   Analyst     24
  E2   P2   Analyst     6
  E3   P3   Consultant  10
  E3   P4   Engineer    48
  E4   P2   Programmer  18
  E5   P2   Manager     24
  E6   P4   Manager     48
  E7   P3   Engineer    36
  E8   P3   Manager     40
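The schema/instance distinction above can be sketched with Python's built-in sqlite3 module. This is only an illustration: the column types and key declarations are assumptions (the slide gives attribute names only), and just a few rows of the instance are loaded. The helper names build_example_db and project_staff are hypothetical.

```python
import sqlite3


def build_example_db():
    """Create the EMP/PROJ/WORKS schema and load part of the example instance."""
    conn = sqlite3.connect(":memory:")
    # The schema: types and keys are assumed, not given on the slide.
    conn.executescript("""
        CREATE TABLE EMP  (ENO TEXT PRIMARY KEY, ENAME TEXT, TITLE TEXT);
        CREATE TABLE PROJ (PNO TEXT PRIMARY KEY, PNAME TEXT, BUDGET INTEGER);
        CREATE TABLE WORKS (ENO TEXT REFERENCES EMP(ENO),
                            PNO TEXT REFERENCES PROJ(PNO),
                            RESP TEXT, DUR INTEGER,
                            PRIMARY KEY (ENO, PNO));
    """)
    # One instance of that schema (a subset of the rows shown above).
    conn.executemany("INSERT INTO EMP VALUES (?, ?, ?)", [
        ("E1", "J. Doe", "Elect. Eng."),
        ("E2", "M. Smith", "Syst. Anal."),
    ])
    conn.executemany("INSERT INTO PROJ VALUES (?, ?, ?)", [
        ("P1", "Instrumentation", 150000),
        ("P2", "Database Develop.", 135000),
    ])
    conn.executemany("INSERT INTO WORKS VALUES (?, ?, ?, ?)", [
        ("E1", "P1", "Manager", 12),
        ("E2", "P1", "Analyst", 24),
        ("E2", "P2", "Analyst", 6),
    ])
    conn.commit()
    return conn


def project_staff(conn, pno):
    """Return (ENAME, RESP) for everyone working on project pno, ordered by ENO."""
    rows = conn.execute(
        "SELECT E.ENAME, W.RESP FROM EMP E JOIN WORKS W ON E.ENO = W.ENO "
        "WHERE W.PNO = ? ORDER BY E.ENO",
        (pno,),
    )
    return rows.fetchall()
```

The same CREATE TABLE statements could be loaded with entirely different rows; each such load would be another instance of the one schema.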
Application of Databases
Original
  inventory control
  payroll
  banking and financial systems
  reservation systems
More recent
  computer aided design (CAD)
  software development (CASE, SDE/SSE)
  telecommunication systems
  e-commerce
  dynamic/personalized web content
Common Circumstances:
There is lots of data (mass storage)
Data is formatted
Requirements:
  persistence and reliability
  efficient and concurrent access
Issues:
  many files with different structure
  shared files or replicated data
  need to exchange data (translation programs)
Data constitute an organizational asset ⇒ Integrated control
  Reduction of redundancy
  Avoidance of inconsistency
  Data integrity
  Sharability
  Improved security
Programmer productivity ⇒ Data Independence
  Programmers do not have to deal with data organization
Programmer Productivity
[Figure: labels: High data independence; Maintenance Time.]
2000 BC: Sumerian Records
350 BC: Syllogisms (Aristotle)
296 BC: Library of Alexandria
1879: Modern Logic (Frege)
1884: U.S. Census (Hollerith)
1941: Model Theory (Tarski)
Data Model: all data stored in a well-defined way
Access control: only authorized people get to see/modify it
Concurrency control: multiple concurrent applications access data
Database recovery: nothing gets accidentally lost
Database maintenance
CS 348 Overview of Database Management Fall 2012 18 / 29
Edgar Codd proposes relational data model (1970)
  firm mathematical foundation
  declarative queries
Charles Bachman wins ACM Turing award (1973)
  "The Programmer as Navigator"
Peter Chen proposes E-R model (1976)
Transaction concepts (Jim Gray and others)
IBM's System R and UC Berkeley's Ingres systems demonstrate
Definition (Schema)
A schema is a description of the data interface to the database (i.e., how the data is organized).
External schema (view): what the application programs and user see. May differ for different users of the same database.
Conceptual schema: description of the logical structure of all data in the database.
Physical schema: description of physical aspects (selection of files, devices, storage algorithms, etc.)
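As a rough illustration of the external vs. conceptual levels, the sketch below uses Python's sqlite3 module: the EMP table plays the role of the conceptual schema and a SQL view plays the role of one external schema. The view name EMP_DIRECTORY, the column types, and the helper name directory_rows are assumptions made for this example only.

```python
import sqlite3


def directory_rows():
    """Query an external schema (a view) defined over the conceptual EMP table."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        -- Conceptual level: the full logical structure (types are assumed).
        CREATE TABLE EMP (ENO TEXT PRIMARY KEY, ENAME TEXT, TITLE TEXT);
        INSERT INTO EMP VALUES ('E1', 'J. Doe', 'Elect. Eng.');
        INSERT INTO EMP VALUES ('E2', 'M. Smith', 'Syst. Anal.');

        -- External level: a hypothetical view that hides TITLE from its users.
        CREATE VIEW EMP_DIRECTORY AS SELECT ENO, ENAME FROM EMP;
    """)
    # The application only names the view, never the underlying table.
    return conn.execute("SELECT * FROM EMP_DIRECTORY ORDER BY ENO").fetchall()
```

A second group of users could be given a different view over the same table, which is the sense in which external schemas may differ between users of one database.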
[Figure: labels: DBMS; Database.]
Data Independence
Idea
Applications do not access data directly but rather through an abstract data model provided by the DBMS.
Two kinds of data independence:
  Physical: applications immune to changes in storage structures
  Logical: applications immune to changes in data organization
Note
One of the most important reasons to use a DBMS!
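Physical data independence can be sketched with Python's sqlite3 module: the application's query text stays the same while a storage structure (here an index) is added underneath it. The index name EMP_TITLE_IDX, the column types, and the function name are assumptions for illustration; the rows come from the EMP example.

```python
import sqlite3

# The application's query; it never mentions how EMP is stored.
QUERY = "SELECT ENAME FROM EMP WHERE TITLE = ? ORDER BY ENO"


def run_before_and_after_index():
    """Run the same query before and after a change to the storage structures."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE EMP (ENO TEXT PRIMARY KEY, ENAME TEXT, TITLE TEXT)")
    conn.executemany("INSERT INTO EMP VALUES (?, ?, ?)", [
        ("E1", "J. Doe", "Elect. Eng."),
        ("E2", "M. Smith", "Syst. Anal."),
        ("E5", "B. Casey", "Syst. Anal."),
    ])
    before = conn.execute(QUERY, ("Syst. Anal.",)).fetchall()
    # Physical change only: a new access path, no change to the logical schema.
    conn.execute("CREATE INDEX EMP_TITLE_IDX ON EMP (TITLE)")
    after = conn.execute(QUERY, ("Syst. Anal.",)).fetchall()
    return before, after
```

Both runs return the same rows; only the way the DBMS finds them may differ.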
[Figure: label: internal schema.]
query-generating applications, or
Application developer:
Designs and implements applications that access the database.
Transactions
When multiple applications access the same data, undesirable results may occur. Example:

  withdraw(AC,1000)                 withdraw(AC,500)
    Bal := getbal(AC)                 Bal := getbal(AC)
    if (Bal>1000)                     if (Bal>500)
      <give-money>                      <give-money>
      setbal(AC,Bal-1000)               setbal(AC,Bal-500)
Idea
Every application may think it is the sole application accessing the data. The DBMS should guarantee correct execution.
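To make the problem concrete, the sketch below replays one possible bad interleaving of the two withdrawals in plain Python and compares it with running them one after the other. The dictionary standing in for the database, the function names, and the starting balance used in the comments are all assumptions for illustration.

```python
def interleaved_withdrawals(initial_balance):
    """Replay one bad interleaving of withdraw(AC,1000) and withdraw(AC,500)."""
    account = {"AC": initial_balance}
    paid_out = 0

    bal_a = account["AC"]  # first program:  Bal := getbal(AC)
    bal_b = account["AC"]  # second program: Bal := getbal(AC), same stale value

    if bal_a > 1000:
        paid_out += 1000                 # <give-money>
        account["AC"] = bal_a - 1000     # setbal(AC, Bal-1000)
    if bal_b > 500:
        paid_out += 500                  # <give-money>
        account["AC"] = bal_b - 500      # setbal overwrites the first update
    return account["AC"], paid_out


def serial_withdrawals(initial_balance):
    """Run the same two withdrawals strictly one after the other."""
    account = {"AC": initial_balance}
    paid_out = 0
    for amount in (1000, 500):
        bal = account["AC"]              # each withdrawal sees the latest balance
        if bal > amount:
            paid_out += amount
            account["AC"] = bal - amount
    return account["AC"], paid_out
```

Starting from a balance of 1200, the interleaved run pays out 1500 and leaves 700 recorded, while the serial run pays out 1000 and leaves 200. The first update is lost in the interleaved case.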
Transactions (cont'd)
Definition (Transaction)
An application-specified atomic and durable unit of work.
Properties of transactions ensured by the DBMS:
  Atomic: a transaction occurs entirely, or not at all
  Consistency: each transaction preserves the consistency of the database
  Isolated: concurrent transactions do not interfere with each other
  Durable: once completed, a transaction's changes are permanent
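Atomicity can be sketched with Python's sqlite3 module, whose connection object can be used as a context manager that commits on success and rolls back if an exception escapes. The ACCOUNT table, its columns, and the helper names below are hypothetical and exist only for this example.

```python
import sqlite3


def make_accounts():
    """Create a hypothetical ACCOUNT table with two accounts."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE ACCOUNT (AC TEXT PRIMARY KEY, BAL INTEGER)")
    conn.executemany("INSERT INTO ACCOUNT VALUES (?, ?)", [("A", 1200), ("B", 0)])
    conn.commit()
    return conn


def balances(conn):
    """Return the current balances as a dict keyed by account id."""
    return dict(conn.execute("SELECT AC, BAL FROM ACCOUNT"))


def transfer(conn, src, dst, amount):
    """Move amount from src to dst as one unit of work: all of it or none of it."""
    try:
        with conn:  # commit if the block finishes, roll back if it raises
            conn.execute("UPDATE ACCOUNT SET BAL = BAL - ? WHERE AC = ?", (amount, src))
            conn.execute("UPDATE ACCOUNT SET BAL = BAL + ? WHERE AC = ?", (amount, dst))
            (bal,) = conn.execute(
                "SELECT BAL FROM ACCOUNT WHERE AC = ?", (src,)
            ).fetchone()
            if bal < 0:
                raise ValueError("insufficient funds")
        return True
    except ValueError:
        return False  # both updates were undone by the rollback
```

A transfer that would overdraw the source account leaves both balances exactly as they were, even though both UPDATE statements had already run inside the transaction.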
Development of commercial relational technology
  IBM DB2, Oracle, Informix, Sybase
Edgar Codd wins ACM Turing award (1981)
SQL standardization efforts through ANSI and ISO
Object-oriented DBMSs (late 1980s to middle of 1990s)
  persistent objects
  object ids, methods, inheritance
  navigational interface reminiscent of hierarchical model
Continued expansion of SQL and system capabilities
New application areas:
  the Internet
  On-Line Analytic Processing (OLAP)
  data warehousing
  embedded systems
  multimedia
  XML
  data streams
Jim Gray wins ACM Turing award (1998)
Relational DBMSs incorporate objects (late 1990s)
Summary