Sei sulla pagina 1di 21

Information Management Ecosystem Partnerships

IBM Canada Lab


Summer/Fall 2010

The Relational Data Model

Information Management

2010 IBM Corporation

Information Management

Agenda

Data and Databases

Database Management Systems (DBMS)

Information Model & Data Model

Relational Data Model & Relational Databases

Normalization

2010 IBM Corporation

Information Management

What is Data?

Collection of facts or numbers

Can be quantitative or qualitative

Describes a variable or set of variables

Essentially data can be thought of as the result of


observations based on things like:
measurements
statistics

2010 IBM Corporation

Information Management

Data and Information

Data is simply facts

Why is data important?


By relating different pieces of data
we are able to extract valuable
information by presenting it in
meaningful context

For that we need to be able to:

Store data so it can be persisted


Structure data so it is easier to manipulate
Organize data in a meaningful way
Process data to derive data value from it

2010 IBM Corporation

Information Management

Why Databases?

Data can be stored using multiple methods such as:


Text files
Comma delimited data files
Spreadsheets
Databases

Why database?
The way data is accessed
The way data is handled

2010 IBM Corporation

Information Management

Managing Data

Using a database provides:


a standard interface for accessing data
multiple users with simultaneous ability to insert, update and
delete data
changes to the data without risk of losing data and its
consistency
the capability to handle huge volumes of data and users
tools for data backup, restore and recovery
security
reduce redundancy
data independence

2010 IBM Corporation

Information Management

Database Management Systems

Database Management System (DBMS)


It is the software system that manages databases
Provides an interface of access to the databases
Provides data services to applications
Efficient data querying and update mechanisms
Data integrity guarantees data is always right even in case
of software and hardware errors
Others: backup, compression, security, replication, etc.

DB2 is a Database Management System


2010 IBM Corporation

Information Management

Information Model

Information Model
Abstract management of objects at a conceptual level
Independent of specific implementations and protocols
Hides all protocol and implementation details
Defines relationships between managed objects.

Multiple implementations of an information model exists


Data models

2010 IBM Corporation

Information Management

Data Model

A model is a representation of an object or concept of the


real world
3D model: a graphical representation of an object
Scale model: a replica or copy of an object in a smaller size
Business model: describes how a company operates

Data Model
Define how data is to be represented and structured
It can be used to map how data from the real world is to be
represented in a software system

Characteristics of a Data Model


Lower level of abstraction
Intended for the software developer
Includes specific implementation and protocol details

2010 IBM Corporation

Information Management

Types of Data Models

10

Extended Relational

Entity-Relationship

Hierarchical

Network

Object-oriented

Object-relational

Relational

Semantic

Semi-structured (XML)

2010 IBM Corporation

Information Management

The Relational Data Model

11

Proposed by E.F. Codd in 1970.

It is mathematical model that describes data as a collection


of Relations (sets) and the values of the data is defined by
Domains.

Focuses on providing better data independence

Data are operated upon by means of a relational calculus or


relational algebra

Advantages
Based on a formal theoretical model and proven in practice
Provides logical view of the data

It is implemented by most DBMS in the market, such as DB2.


There are called Relational Database Management
Systems
2010 IBM Corporation

Information Management

Components of the Relational Data Model

The relational data model has its own unique terms used to
define its concepts.
Domain
Attribute

Relation

(ID, int)

(NAME, text)

(EXT, int)

(Active, boolean)

John S

54213

Michael B

52137

Jeremy W

50603

Leah E

58963

Tuple

Domain (or data type) defines the set of possible values that data can assume

Relation is composed by a heading and a body


Heading: a set of attributes
Body: a set of tuples

12

Attribute is composed by a name and a domain (type)

A tuple is a set of attribute values


2010 IBM Corporation

Information Management

Components of a Relational Database

Concepts from Relational Data Model can be mapped to their


implementation found in a Relational Database

Relational databases store data using tables


A table consists of columns and rows
Each column has a specific data type
Each row features a certain value for each column
Column

Table

13

ID

NAME

EXTENSION

MANAGER

John S

54213

Susan P

59867

Andrew J

55935

Michael B

52137

Jeremy W

50603

Leah E

58963

Row
2010 IBM Corporation

Information Management

Tables

A Table is the counterpart of a Relation from the Relational


Data Model

A table stores data in rows and columns


Rows are the same as Tuples
Columns are the same as Attributes

There can be multiple tables for different types of data to


reduce redundant information
Normalization (more on this later)

For example:
You want to store data about a company
Data about branch offices will be stored in a table
Employee data for specific branches will be stored in its own
table
Product data will be stored in another table

14

2010 IBM Corporation

Information Management

Columns

Columns are also known as


fields

Each field contains a


specific type of information
such as name, extension,
position and so on

15

Columns must be
designated a specific
data type such as
DATE, VARCHAR, INTEGER
and so on

Specific
Information

Data type:
Char

2010 IBM Corporation

Information Management

Domains and Data Types

Data Type
This is counterpart of Domains in the Relational Model,
which defines the smallest unit of data that can be stored
Columns always have a data type

Attribute/Colum

A domain is the set of all


possible values for the
specific attribute

Smallest unit of
Data (atomic value)

Manager Domain = (Y, N)

16

2010 IBM Corporation

Information Management

Components of a Relational Database

17

Primary Keys
Uniquely identifies each tuple (row) of the relation (table)
Relations must always have a primary key
Although it is recommended, tables in a relational database
are not required to have a Primary Key

Examples
Driver's license of a person
ISBN of a book
Serial number of a product

Primary Key

2010 IBM Corporation

Information Management

Components of a Relational Database

18

Foreign Keys
Attribute in one relation whose values match a primary key
of another relation
Defines the relationship between two tables

ID
(PK)

NAME

DEPT_ID
(FK)

ID
(PK)

DEPT_NAME

John S

Finance

Michael B

Design

Jeremy W

Business

Leah E

Development

2010 IBM Corporation

Information Management

Why Normalization?

Why do we need normalization?

Consider the following table:


Lists of task an employee is involved in:

19

ID

Name

Office City

Extension

Task

John S

Toronto

54213

Planning

John S

Toronto

54213

Marketing

John S

Toronto

54213

Testing

Susan P

New York

59867

Marketing

Jennifer L

Chicago

59415

Planning

Jennifer L

Chicago

59415

Testing

Example operation: if John moves to a new city, all entries


related to John must be updated
redundancy
anomalies
2010 IBM Corporation

Information Management

Normalization

No anomalies, no redundancy

No loss of information

Employee Table

Task Table

ID

Name

Office City

Extension

John S

Toronto

54213

Susan P

New York

Jennifer L

Chicago

ID

Task

Planning

59867

Marketing

59415

Testing

Employee Tasks Table

20

Employee ID

Task ID

3
2010 IBM Corporation

Information Management Ecosystem Partnerships


IBM Canada Lab

Questions?

Summer/Fall 2010

E-mail: imschool@us.ibm.com
Subject: DB2 Academic Workshop

Information Management

2010 IBM Corporation

Potrebbero piacerti anche