Sei sulla pagina 1di 23

Unit – I Database System Concepts & Architecture

A Database Management System allows a person or user to organize, store, and retrieve data from a
computer. A database, as a collection of information, can be organized so a Database Management System
can access and pull specific information.
A DBMS is a software that allows creation, definition and manipulation of database, allowing users to store,
process and analyse data easily. DBMS provides us with an interface or a tool, to perform various operations
like creating database, storing data in it, updating data, creating tables in the database and a lot more. DBMS
also provides protection and security to the databases. It also maintains data consistency in case of multiple
users. Some examples of popular DBMS used these days:
 MySql
 Oracle
 SQL Server
 IBM DB2
 PostgreSQL
 Amazon SimpleDB (cloud based) etc.
Applications where we use Database Management Systems are:

 Telecom: There is a database to keeps track of the information regarding calls made, network
usage, customer details etc. Without the database systems it is hard to maintain that huge amount
of data that keeps updating every millisecond.
 Industry: Where it is a manufacturing unit, warehouse or distribution centre, each one needs a
database to keep the records of ins and outs. For example distribution centre should keep a track of
the product units that supplied into the centre as well as the products that got delivered out from the
distribution centre on each day; this is where DBMS comes into picture.
 Banking System: For storing customer info, tracking day to day credit and debit transactions,
generating bank statements etc. All this work has been done with the help of Database
management systems.
 Education sector: Database systems are frequently used in schools and colleges to store and
retrieve the data regarding student details, staff details, course details, exam details, payroll data,
attendance details, fees details etc. There is a hell lot amount of inter-related data that needs to be
stored and retrieved in an efficient manner.
 Online shopping: You must be aware of the online shopping websites such as Amazon, Flipkart
etc. These sites store the product information, your addresses and preferences, credit details and
provide you the relevant list of products based on your query. All this involves a Database
management system.

Characteristics of Database Management System


A database management system has following characteristics:
1. Data stored into Tables: Data is never directly stored into the database. Data is stored into
tables, created inside the database. DBMS also allows to have relationships between tables which
makes the data more meaningful and connected. You can easily understand what type of data is
stored where by looking at all the tables created in a database.
2. Reduced Redundancy: In the modern world hard drives are very cheap, but earlier when hard
drives were too expensive, unnecessary repetition of data in database was a big problem. But
DBMS follows Normalisation which divides the data in such a way that repetition is minimum.
3. Data Consistency: On Live data, i.e. data that is being continuosly updated and added,
maintaining the consistency of data can become a challenge. But DBMS handles it all by itself.
4. Support Multiple user and Concurrent Access: DBMS allows multiple users to work on
it(update, insert, delete data) at the same time and still manages to maintain the data consistency.
5. Query Language: DBMS provides users with a simple Query language, using which data can be
easily fetched, inserted, deleted and updated in a database.
6. Security: The DBMS also takes care of the security of data, protecting the data from un-
authorised access. In a typical DBMS, we can create user accounts with different access
permissions, using which we can easily secure our data by restricting user access.
7. DBMS supports transactions, which allows us to better handle and manage data integrity in real
world applications where multi-threading is extensively used.

Advantages of Database Management System (DBMS)


1. Improved data sharing
An advantage of the database management approach is, the DBMS helps to create an
environment in which end users have better access to more and better-managed data.
Such access makes it possible for end users to respond quickly to changes in their
environment.
2. Improved data security
The more users access the data, the greater the risks of data security breaches.
Corporations invest considerable amounts of time, effort, and money to ensure that
corporate data are used properly. A DBMS provides a framework for better enforcement of
data privacy and security policies.
3. Better data integration
Wider access to well-managed data promotes an integrated view of the organization’s
operations and a clearer view of the big picture. It becomes much easier to see how actions
in one segment of the company affect other segments.
4. Minimized data inconsistency
Data inconsistency exists when different versions of the same data appear in different
places. For example, data inconsistency exists when a company’s sales department stores
a sales representative’s name as “Bill Brown” and the company’s personnel department
stores that same person’s name as “William G. Brown,” or when the company’s regional
sales office shows the price of a product as $45.95 and its national sales office shows the
same product’s price as $43.95. The probability of data inconsistency is greatly reduced in a
properly designed database.
5. Improved data access
The DBMS makes it possible to produce quick answers to ad hoc queries. From a database
perspective, a query is a specific request issued to the DBMS for data manipulation—for
example, to read or update the data. Simply put, a query is a question, and an ad hoc query
is a spur-of-the-moment question. The DBMS sends back an answer (called the query
result set) to the application. For example, end users, when dealing with large amounts of
sales data, might want quick answers to questions (ad hoc queries) such as:
- What was the dollar volume of sales by product during the past six months?
- What is the sales bonus figure for each of our salespeople during the past three
months?
- How many of our customers have credit balances of 3,000 or more?
6. Improved decision making
Better-managed data and improved data access make it possible to generate better-quality
information, on which better decisions are based. The quality of the information generated
depends on the quality of the underlying data. Data quality is a comprehensive approach to
promoting the accuracy, validity, and timeliness of the data. While the DBMS does not
guarantee data quality, it provides a framework to facilitate data quality initiatives.
7. Increased end-user productivity
The availability of data, combined with the tools that transform data into usable information,
empowers end users to make quick, informed decisions that can make the difference
between success and failure in the global economy.
Disadvantages of Database Management System (DBMS):
1. Increased costs
one of the disadvantages of dbms is Database systems require sophisticated hardware and
software and highly skilled personnel. The cost of maintaining the hardware, software, and
personnel required to operate and manage a database system can be substantial. Training,
licensing, and regulation compliance costs are often overlooked when database systems
are implemented.
2. Management complexity
Database systems interface with many different technologies and have a significant impact
on a company’s resources and culture. The changes introduced by the adoption of a
database system must be properly managed to ensure that they help advance the
company’s objectives. Given the fact that database systems hold crucial company data that
are accessed from multiple sources, security issues must be assessed constantly.
3. Maintaining currency
To maximize the efficiency of the database system, you must keep your system current.
Therefore, you must perform frequent updates and apply the latest patches and security
measures to all components.Because database technology advances rapidly, personnel
training costs tend to be significant. Vendor dependence. Given the heavy investment in
technology and personnel training, companies might be reluctant to change database
vendors. As a consequence, vendors are less likely to offer pricing point advantages to
existing customers, and those customers might be limited in their choice of database
system components.

4. Frequent upgrade/replacement cycles


DBMS vendors frequently upgrade their products by adding new functionality. Such new
features often come bundled in new upgrade versions of the software. Some of these
versions require hardware upgrades. Not only do the upgrades themselves cost money, but
it also costs money to train database users and administrators to properly use and manage
the new features.

Components of DBMS:
The database management system can be divided into five major components, they are:
1. Hardware
2. Software
3. Data
4. Procedures
5. Database Access Language
Let's have a simple diagram to see how they all fit together to form a database management system.
 DBMS Components: Hardware
When we say Hardware, we mean computer, hard disks, I/O channels for data, and
any other physical component involved before any data is successfully stored into
the memory.
When we run Oracle or MySQL on our personal computer, then our computer's Hard
Disk, our Keyboard using which we type in all the commands, our computer's RAM,
ROM all become a part of the DBMS hardware.
 DBMS Components: Software
This is the main component, as this is the program which controls everything. The
DBMS software is more like a wrapper around the physical database, which provides
us with an easy-to-use interface to store, access and update data.
The DBMS software is capable of understanding the Database Access Language
and intrepret it into actual database commands to execute them on the DB.
 DBMS Components: Data
Data is that resource, for which DBMS was designed. The motive behind the
creation of DBMS was to store and utilise data.
In a typical Database, the user saved Data is present and meta data is stored.
Metadata is data about the data. This is information stored by the DBMS to better
understand the data stored in it.
For example: When I store my Name in a database, the DBMS will store when the
name was stored in the database, what is the size of the name, is it stored as related
data to some other data, or is it independent, all this information is metadata.
 DBMS Components: Procedures
Procedures refer to general instructions to use a database management system.
This includes procedures to setup and install a DBMS, To login and logout of DBMS
software, to manage databases, to take backups, generating reports etc.
 DBMS Components: Database Access Language
Database Access Language is a simple language designed to write commands to
access, insert, update and delete data stored in any database.
A user can write commands in the Database Access Language and submit it to the
DBMS for execution, which is then translated and executed by the DBMS.
User can create new databases, tables, insert data, fetch stored data, update data
and delete the data using the access language.

Codd's Rule for Relational DBMS


E.F Codd was a Computer Scientist who invented the Relational model for Database
management. Based on relational model, theRelational database was created. Codd proposed
some rules popularly known as Codd's 12 rules to test DBMS's concept against his relational
model. Codd's rule actualy define what quality a DBMS requires in order to become a Relational
Database Management System(RDBMS).

Rule 1: Information rule


All information(including metadata) is to be represented as stored data in cells of tables.
The rows and columns have to be strictly unordered.

Rule 2: Guaranted Access


Each unique piece of data(atomic value) should be accesible by : Table Name + Primary
Key(Row) + Attribute(column).
NOTE: Ability to directly access via POINTER is a violation of this rule.
Rule 3: Systematic treatment of NULL
Null has several meanings; it can mean missing data, not applicable or no value. It should
be handled consistently. Also, Primary key must not be null, ever. Expression
on NULL must give null.

Rule 4: Active Online Catalog


Database dictionary (catalog) is the structure description of the complete Database and it
must be stored online. The Catalog must be governed by same rules as rest of the
database. The same query language should be used on catalog as used to query database.

Rule 5: Powerful and Well-Structured Language


One well structured language must be there to provide all manners of access to the data
stored in the database. Example: SQL, etc. If the database allows access to the data
without the use of this language, then that is a violation.

Rule 6: View Updation Rule


All the view that are theoretically updatable should be updatable by the system as well.

Rule 7: Relational Level Operation


There must be Insert, Delete, Update operations at each level of relations. Set operation
like Union, Intersection and minus should also be supported.

Rule 8: Physical Data Independence


The physical storage of data should not matter to the system. If say, some file supporting
table is renamed or moved from one disk to another, it should not effect the application.

Rule 9: Logical Data Independence


If there is change in the logical structure(table structures) of the database the user view of
data should not change. Say, if a table is split into two tables, a new view should give result
as the join of the two tables. This rule is most difficult to satisfy.

Rule 10: Integrity Independence


The database should be able to enforce its own integrity rather than using other programs.
Key and Check constraints, trigger etc, should be stored in Data Dictionary. This also
make RDBMS independent of front-end.

Rule 11: Distribution Independence


A database should work properly regardless of its distribution across a network. Even if a
database is geographically distributed, with data stored in pieces, the end user should get
an impression that it is stored at the same place. This lays the foundation of distributed
database.

Rule 12: Nonsubversion Rule


If low level access is allowed to a system it should not be able to subvert or bypass integrity
rules to change the data. This can be achieved by some sort of looking or encryption.
File System v/s DBMS
Parameters DBMS File System
Meaning DBMS is a collection of data and set of programs to access File system is an abstraction to store, retrieve and
those data. update a set of files.
Data Repetition of data can be reduced. Repetition of data cannot be reduced.
Redundancy
Data Inconsistency of data can be avoided. Inconsistency of data cannot be avoided.
Inconsistency
Data Inconsistency of data can be avoided. Inconsistency of data cannot be avoided.
Inconsistency
Sharing of Data can be shared easily. Data cannot be shared easily. There is difficulty in
Data accessing the data.
Data Integrity The data can be accessed from the database through Data independence cannot be provided to access the
physically as well as logically. data.
Response to It gives fast response to information request because the It gives slow response to information request.
Request data are integrated into a single database.
Security It restricts unauthorized access. Lack of security.
Concurrency Data can be accessed by multiple users at same point of Problem in concurrency control.
Control time.
Data Recovery It provides back up of data and restores the database to It does not provide back up and data recovery. The
its original state after database failure. data may loss if the operation is failed.

Abstraction is one of the main features of database systems. Hiding irrelevant details from user and
providing abstract view of data to users, helps in easy and efficient user-database interaction.
To understand the view of data, you must have a basic knowledge of data abstraction and instance &
schema. Refer these two tutorials to learn them in detail.
1. Instance and schema
2. Data abstraction

DBMS Architecture contains Schemas, Sub-schemas and Instances


 Schemas: - The overall logical design of the database.
 Sub-Schemas:-A sub schema is a subset of the schema and inherits the same property
that a schema has. The plan (or scheme) for a view is often called sub-
schemas.
 Instances: - The collection of information stored in the database at a particular moment.

The difference between Schema and Instance are given in below table:
Schema Instance

Definition The overall logical design of the database. The collection of information stored in the
database at a particular moment.
Includes Table names,column names, datatypes, and size of Actual data or information stored in tables
columns, various constraint at logical level. in form of different records.
Change changes infrequently. changes frequently.
Cause of insertion of tables or columns and change in insert,delete or update operation on data
change datatype,size or constraints on any column. stored in database.
Analogy variable declaration value of the variable
A database schema can be divided broadly into two categories –

 Physical Database Schema − this schema pertains to the actual storage of data and its form of
storage like files, indices, etc. It defines how the data will be stored in a secondary storage.
 Logical Database Schema − this schema defines all the logical constraints that need to be
applied on the data stored. It defines tables, views, and integrity constraints.

Three-level Database Architecture


The Architecture of most of commercial dbms are available today is mostly based on this ANSI-
SPARC database architecture.
THREE-TIER architecture has main three levels:
1. Internal Level
2. Conceptual Level
3. External Level

These three levels provide data abstraction which means hide the low level complexities from end
users. A database system should be efficient in performance and convenient in use. Using these
three levels, it is possible to use complex structures at internal level for efficient operations and to
provide simpler convenient interface at external level.

1. Internal level:
 This is the lowest level of data abstraction.
 It describes how the data are actually stored on storage devices.
 It is also known as physical level.
 It provides internal view of physical storage of data.
 It deals with complex low level data structures, file structures and access methods in detail.
 It also deals with Data Compression and Encryption techniques, if used.
2. Conceptual level:
 This is the next higher level than internal level of data abstraction.
 It describes what data are stored in the database and what relationships exist among those
data.
 It is also known as logical level.
 It hides low level complexities of physical storage.
 Database administrator and designers work at this level to determine what data to keep in
database.
 Application developers also work on this level.
3. External Level:
 This is the highest level of data abstraction.
 It describes only part of the entire database that a end user concern.
 It is also known as a view level.
 End users need to access only part of the database rather than entire database.
 Different user needs different views of database and so, there can be many view level
abstractions of the same database.

We have namely two levels of data independence arising from these levels of abstraction:
1. Physical level data independence:
It refers to the characteristic of being able to modify the physical schema without any
alterations to the conceptual or logical schema, done for optimization purposes, e.g.,
Conceptual structure of the database would not be affected by any change in storage size of
the database system server. Changing from sequential to random access files is one such
example. These alterations or modifications to the physical structure may include:
 Utilizing new storage devices.
 Modifying data structures used for storage.
 Altering indexes or using alternative file organization techniques etc.
2. Logical level data independence:
It refers characteristic of being able to modify the logical schema without affecting the
external schema or application program. The user view of the data would not be affected by
any changes to the conceptual view of the data. These changes may include insertion or
deletion of attributes, altering table structures entities or relationships to the logical schema
etc.

Mappings
Process of transforming request and results between three levels it's called Mapping. There are the two
types of mappings:
 Conceptual/Internal Mapping
 External/Conceptual Mapping
1. Conceptual/Internal Mapping:

 The conceptual/internal mapping defines the correspondence between the conceptual view and
the store database.
 It specifies how conceptual record and fields are represented at the internal level.
 It relates conceptual schema with internal schema.
 If structure of the store database is changed.
 If changed is made to the storage structure definition-then the conceptual/internal mapping must
be changed accordingly, so that the conceptual schema can remain invariant.
 There could be one mapping between conceptual and internal levels.

2. External/Conceptual Mapping:

 The external/conceptual mapping defines the correspondence between a particular external view
and conceptual view.
 It relates each external schema with conceptual schema.
 The differences that can exist between these two levels are analogous to those that can exist
between the conceptual view and the stored database.
 Example: fields can have different data types; fields and record name can be changed; several
conceptual fields can be combined into a single external field.
 Any number of external views can exist at the same time; any number of users can share a given
external view: different external views can overlap.
 There could be several mapping between external and conceptual levels.

Structure of DBMS:
 DBMS (Database Management System) acts as an interface between the user and the database.
The user requests the DBMS to perform various operations such as insert, delete, update and
retrieval on the database.
 The components of DBMS perform these requested operations on the database and provide
necessary data to the users.

Components of a DBMS
The components of DBMS can be divided into two parts:
 DDL Compiler:

Data Description Language compiler processes schema definitions specified in the DDL.

It includes metadata information such as the name of the files, data items, storage
details of each file, mapping information and constraints etc.
 DML Compiler and Query optimizer:
 The DML commands such as insert, update, delete, retrieve from the application
program are sent to the DML compiler for compilation into object code for database
access.
 The object code is then optimized in the best way to execute a query by the query
optimizer and then send to the data manager.
 Data Manager:
 The Data Manager is the central software component of the DBMS also knows as
Database Control System.
 The Main Functions Of Data Manager Are:
o Convert operations in user's Queries coming from the application programs or
combination of DML Compiler and Query optimizer which is known as Query
Processor from user's logical view to physical file system.
o Controls DBMS information access that is stored on disk.
o It also controls handling buffers in main memory.
o It also enforces constraints to maintain consistency and integrity of the data.
o It also synchronizes the simultaneous operations performed by the concurrent
users.
o It also controls the backup and recovery operations.
 Data Dictionary:
 Data Dictionary, which stores metadata about the database, in particular the schema of
the database.
 Names of the tables, names of attributes of each table, length of attributes, and number
of rows in each table.
 Detailed information on physical database design such as storage structure, access
paths, files and record sizes.
 Usage statistics such as frequency of query and transactions.
 Data dictionary is used to actually control the data integrity, database operation and
accuracy. It may be used as a important part of the DBMS
 Data Files:
 Which store the database itself.
 Indices:
 A database index is a data structure that improves the speed of data retrieval
operations on a database table at the cost of additional writes and storage space to
maintain the index data structure.
 Indexes are used to quickly locate data without having to search every row in a
database table every time a database table is accessed.
 Indexes can be created using one or more columns of a database table, providing the
basis for both rapid random lookups and efficient access of ordered records.`
 Compiled DML:-
 The DML complier converts the high level Queries into low level file access commands
known as compiled DML.
 Database Users: - Database users are the one who really use and take the benefits of
database. There will be different types of users depending on their need and way of accessing
the database.
 Sophisticated Users - They are database developers, who write SQL queries to
select/insert/delete/update data. They do not use any application or programs to request
the database. They directly interact with the database by means of query language like
SQL. These users will be scientists, engineers, analysts who thoroughly study SQL and
DBMS to apply the concepts in their requirement. In short, we can say this category
includes designers and developers of DBMS and SQL.
 Specialized Users - These are also sophisticated users, but they write special database
application programs. They are the developers who develop the complex programs to
the requirement.
 Naive Users - these are the users who use the existing application to interact with the
database. For example, online library system, ticket booking systems, ATMs etc which
has existing application and users use them to interact with the database to fulfill their
requests.
 Database Administration - Person in the organization who controls the design and the
use of the database refers as DBA.
o Schema Definition:
 The DBA definition the logical Schema of the database.A Schema refers
to the overall logical structure of the database.
 According to this schema, database will be developed to store required
data for an organization.
o Storage Structure and Access Method Definition:
 The DBA decides how the data is to be represented in the stored
database.
o Assisting Application Programmers:
 The DBA provides assistance to application programmers to develop
application programs.
o Physical Organization Modification:
 The DBA modifies the physical organization of the database to reflect the
changing needs of the organization or to improve performance.
o Approving Data Access:
 The DBA determines which user needs access to which part of the
database.
 According to this, various types of authorizations are granted to different
users.
o Monitoring Performance:
 The DBA monitors performance of the system. The DBA ensures that
better performance is maintained by making changes in physical or logical
schema if required.
o Backup and Recovery:
 Database should not be lost or damaged.
 The DBA ensures this periodically backing up the database on magnetic
tapes or remote servers.
 In case of failure, such as virus attack database is recovered from this
backup.

 Query Processor Units: - Executes low level instructions generated by DML compiler.
o DDL Interpreter
o DML Compiler
o Embedded DML Pre-compiler
o Query Evaluation Engine
 Storage Manager Units
o Checks the authority of users to access data.
o Checks for the satisfaction of the integrity constraints.
o Preserves atomicity and controls concurrency.
o Manages allocation of space on disk.
o Fetches data from disk storage to memory for being used:
 Authorization Manager
 Integrity Manager
 Transaction Manager
 File manager
 Buffer Manager
Functions of DBMS:
o DBMS free the programmers from the need to worry about the organization and
location of the data i.e. it shields the users from complex hardware level details.
o DBMS can organize process and present data elements from the database. This
capability enables decision makers to search and query database contents in order
to extract answers that are not available in regular Reports.
o Programming is speeded up because programmer can concentrate on logic of the
application.
o It includes special user friendly query languages which are easy to understand by
non programming users of the system.
The service provided by the DBMS includes :-
o Authorization services like log on to the DBMS start the database stop the Database
etc.
o Transaction supports like Recovery, Rollback etc,
o Import and Export of Data.
o Maintaining data dictionary
o User's Monitoring.

Database Languages:-
Database Languages are used to create and maintain database on computer. Database languages
are used for read, update and store data in a database. There are several such languages that can
be used for this purpose; one of them is SQL (Structured Query Language).The four main
categories of SQL statements are as follows:
1. DML (Data Manipulation Language)
2. DDL (Data Definition Language)
3. DCL (Data Control Language)
4. TCL (Transaction Control Language)
 Data Definition Language (DDL): DDL is used for specifying the database schema. Let’s take
SQL for instance to categorize the statements that comes under DDL.
 To create the database instance – CREATE
 To alter the structure of database – ALTER
 To drop database instances – DROP
 To delete tables in a database instance – TRUNCATE
 To rename database instances – RENAME

 Data Manipulation Language (DML): DML is used for accessing and manipulating data in a
database.
 To read records from table(s) – SELECT
 To insert record(s) into the table(s) – INSERT
 Update the data in table(s) – UPDATE
 Delete all the records from the table – DELETE
 Data Control language (DCL): DCL is used for granting and revoking user access on a database

 To grant access to user – GRANT
 To revoke access from user – REVOKE
 TCL (Transaction Control Language): TCL statements allow you to control and manage
transactions to maintain the integrity of data within SQL statements.
 To opens a transaction – BEGIN
 To commits a transaction – COMMIT
 To ROLLBACK a transaction in case of any error - ROLLBACK
Data Administrator and Database administrator:-
DA (Data Administrator) and DBA (Database Administrator) both are responsible for managing
database for an organization. They differ from each other in their required skills and
responsibilities.

Data Administrator (DA):


 "Person in the organization who controls the data of the database refers data administrator."
 DA determines what data to be stored in database based on requirement of the organization.
 DA works on such as requirements gathering, analysis, and design phases.
 DA does not to be a technical person, any kind of knowledge about database technology can be
more beneficiary
 DA is some senior level person in the organization. in short, DA is a business focused person but
should understand about the database technology.

Database Administrator (DBA):


 "Person in the organization who controls the design and the use of the database refers database
administrator."
 DBA provides necessary technical support for implementing a database.
 DBA works on such as design, development, testing, and operational phases.
 DBA is a technical person having knowledge of database technology.
 DBA does not need to be a business person. in short, DBA is a technically focused person but
should understand about the business to administrator the database effectively.

DBA Responsibilities:
The following sections examine the responsibilities of the database administrator and how they
translate to various Microsoft SQL Server tasks.
 Installing and Upgrading an SQL Server
The DBA is responsible for installing SQL Server or upgrading an existing SQL Server. In
the case of upgrading SQL Server, the DBA is responsible for ensuring that if the upgrade is
not successful, the SQL Server can be rolled back to an earlier release until the upgrade
issues can be resolved.
The DBA is also responsible for applying SQL Server service packs. A service pack is not a
true upgrade, but an installation of the current version of software with various bug fixes and
patches that have been resolved since the product's release.
 Monitoring the Database Server's Health and Tuning Accordingly
Monitoring the health of the database server means making sure that the following is done:
 The server is running with optimal performance.
 The error log or event log is monitored for database errors.
 Databases have routine maintenance performed on them, and the overall system has
periodic maintenance performed by the system administrator.

 Using Storage Properly


SQL Server 2000 enables you to automatically grow the size of your databases and
transaction logs, or you can choose to select a fixed size for the database and transaction
log. Either way, maintaining the proper use of storage means monitoring space
requirements and adding new storage space (disk drives) when required.
 Performing Backup and Recovery Duties
Backup and recovery are the DBA's most critical tasks; they include the following aspects:
 Establishing standards and schedules for database backups
 Developing recovery procedures for each database
 Making sure that the backup schedules meet the recovery requirements
 Managing Database Users and Security
With SQL Server 2000, the DBA works tightly with the Windows NT administrator to add
user NT logins to the database. In non-NT domains, the DBA adds user logins. The DBA is
also responsible for assigning users to databases and determining the proper security level
for each user. Within each database, the DBA is responsible for assigning permissions to
the various database objects such as tables, views, and stored procedures.
 Working with Developers
It is important for the DBA to work closely with development teams to assist in overall
database design, such as creating normalized databases, helping developers tune queries,
assigning proper indexes, and aiding developers in the creation of triggers and stored
procedures.
In the SQL Server 2000 environment, a good DBA will show the developers how to use and
take advantage of the SQL Server Index Tuning Wizard and the SQL Server profiler.
 Establishing and Enforcing Standards
The DBA should establish naming conventions and standards for the SQL Server and
databases and make sure that everyone sticks to them.
 Transferring Data
The DBA is responsible for importing and exporting data to and from the SQL Server. In the
current trend to downsize and combine client/server systems with mainframesystems and
Web technologies to create Enterprise systems, importing data from the mainframe to SQL
Server is a common occurrence that is about to become more common with the SQL Server
2000 Data Transformation Services. Good DTS DBAs will be in hot demand as companies
struggle to move and translate legacy system to Enterprise systems.
 Replicating Data
SQL Server version 2000 has many different replication capabilities such as Merge
replication (2-way disconnected replication) and queued replication. Managing and setting
up replication topologies is a big undertaking for a DBA because of the complexities
involved with properly setting up and maintaining replication.
 Data Warehousing
SQL Server 2000 has substantial data warehousing capabilities that require the DBA to
learn an additional product (the Microsoft OLAP Server) and architecture. Data warehousing
provides new and interesting challenges to the DBA and in some companies a new career
as a warehouse specialist.
 Scheduling Events
The database administrator is responsible for setting up and scheduling various events
using Windows NT and SQL Server to aid in performing many tasks such as backups and
replication.
 Providing 24-Hour Access
The database server must stay up, and the databases must always be protected and online.
Be prepared to perform some maintenance and upgrades after hours. Also be prepared to
carry that dreaded beeper. If the database server should go down, be ready to get the
server up and running. After all, that's your job.
 Learning Constantly
To be a good DBA, you must continue to study and practice your mission-critical
procedures, such as testing your backups by recovering to a test database. In this business,
technology changes very fast, so you must continue learning about SQL Server, available
client/servers, and database design tools. It is a never-ending process.

DBA should posses the following skills


(1) A good knowledge of the operating system(s)
(2) A good knowledge of physical database design
(3) Ability to perform both Oracle and also operating system performance monitoring and the
necessary adjustments.
(4) Be able to provide a strategic database direction for the organization.
(5) Excellent knowledge of Oracle backup and recovery scenarios.
(6) Good skills in all Oracle tools.
(7) A good knowledge of Oracle security management.
(8) A good knowledge of how Oracle acquires and manages resources.
(9) Sound knowledge of the applications at your site.
(10) A DBA should have sound communication skills with management, development teams,
vendors, systems administrators and other related service providers.

Data Models
 “A data model is a way of finding the tools for both business and IT professionals, which uses a set
of symbols and text to precisely explain a subset of real information to improve communication
within the organization and thereby lead to a more flexible and stable application environment”.
 A data model is an idea which describes how the data can be represented and accessed from
software system after its complete implementation.
 It is a simple abstraction of complex real world data gathering environment.
 It defines data elements and relationships among various data elements for a specified system.
 The main purpose of data model is to give an idea that how final system or software will look like
after development is completed.
 Depending on the levels of data we are modeling, we have divided data models into 3 categories –
Object Based, Physical and Record based Data models.

Object based Data Models


 It is designed using the entities in the real world, attributes of each entity and their
relationship. It picks up each thing/object in the real world which is involved in the
requirement.
 Example - Imagine we have to design database for college. What is the real world entities
involved with college? They are college, Students, Lecturer, Course, Subject, Marks etc.
Once all the entities are listed, we find out the relationship between them and try to map all
of them. Also we list what are the attributes related to each entity like student id, name,
lecturer name, course that he is teaching, different subjects, pass mark, grade levels etc.
Here we are not bothered about what data value is stored, what is the size of each data etc.
We know only entities involved, their attributes and mapping at this stage.
 There are two types of object based data Models – Entity Relationship Model and Object
oriented data model.

 Entity Relationship Data Models


 ER model represents the all these entities, attributes and their relationship in the form of picture
to make the developer understand the system better. It also list attributes of each objects. In
the below diagram, Entities or real world objects are represented in a rectangular box. Their
attributes are represented in ovals. Primary keys of entities are underlined. All the entities are
mapped using diamonds. This is one of the methods of representing ER model. There are
many different forms of representation. More details of this model are described in ER data
model article.

 ER model is a graphical representation of real world objects with their attributes and
relationship. It makes the system easily understandable. This model is considered as a top
down approach of designing a requirement.
 Advantages
 It makes the requirement simple and easily understandable by representing simple diagrams.
 One can covert ER diagrams into record based data model easily.
 Easy to understand ER diagrams
 Disadvantages
 No standard notations are available for ER diagram. There is great flexibility in the notation. It’s all
depends upon the designer, how he draws it.
 It is meant for high level designs. We cannot simplify for low level design like coding.

 Object Oriented Data Models


 This data model is another method of representing real world objects. It considers each object
in the world as objects and isolates it from each other. It groups its related functionalities
together and allows inheriting its functionality to other related sub-groups.
 Example - Let us consider an Employee database to understand this model better. In this
database we have different types of employees – Engineer, Accountant, Manager, Clark. But
all these employees belong to Person group. Person can have different attributes like name,
address, age and phone. What do we do if we want to get a person’s address and phone
number? We write two separate procedure sp_getAddress and sp_getPhone.

What about all the employees above? They too have all the attributes what a person has. In
addition, they have their EMPLOYEE_ID, EMPLOYEE_TYPE and DEPARTMENT_ID attributes
to identify them in the organization and their department. We have to retrieve their department
details, and hence we sp_getDeptDetails procedure. Currently, say we need to have only these
attributes and functionality.

Since all employees inherit the attributes and functionalities of Person, we can re-use those
features in Employee. But do we do that? We group the features of person together into class.
Hence a class has all the attributes and functionalities. For example, we would create a person
class and it will have name, address, age and phone as its attribute, and sp_getAddress and
sp_getPhone as procedures in it. The values for these attributes at any instance of time are
object. i.e. ; {John, Troy, 25, 2453545 : sp_getAddress (John), sp_getPhone (John)} forms on
person object. {Mathew, Fraser Town, 28, 5645677: sp_getAddress (Mathew), sp_getPhone
(Mathew} forms another person object.

Now, we will create another class called Employee which will inherit all the functionalities of
Person class. In addition it will have attributes EMPLOYEE_ID, EMPLOYEE_TYPE and
DEPARTMENT_ID, and sp_getDeptDetails procedure. Different objects of Employee class are
Engineer, Accountant, Manager and Clerk.

 Here we can observe that the features of Person are available only if other class is inherited
from it. It would be a black box to any other classes. This feature of this model is called
encapsulation. It binds the features in one class and hides it from other classes. It is only visible
to its objects and any inherited classes.
 Advantages
 Because of its inheritance property, we can re-use the attributes and functionalities. It reduces the
cost of maintaining the same data multiple times. Also, these informations are encapsulated and,
there is no fear being misused by other objects. If we need any new feature we can easily add
new class inherited from parent class and adds new features. Hence it reduces the overhead
and maintenance costs.
 Because of the above feature, it becomes more flexible in the case of any changes.
 Codes are re-used because of inheritance.
 Since each class binds its attributes and its functionality, it is same as representing the real world
object. We can see each object as a real entity. Hence it is more understandable.
 Disadvantages
 It is not widely developed and complete to use it in the database systems. Hence it is not accepted
by the users.
 It is an approach for solving the requirement. It is not a technology. Hence it fails to put it in the
database management systems.
Physical based Data Models
 Physical data model represent the model where it describes how data are stored in
computer memory, how they are scattered and ordered in the memory, and how they would
be retrieved from memory. Basically physical data model represents the data at data layer
or internal layer. It represents each table, their columns and specifications, constraints like
primary key, foreign key etc. It basically represents how each tables are built and related to
each other in DB.

 Above diagram shows how physical data model is designed. It is represented as UML
diagram along with table and its columns. Primary key is represented at the top. The
relationship between the tables is represented by interconnected arrows from table to table.
Above STUDENT table is related to CLASS and SUBJECT is related to CLASS. The above
diagram depicts CLASS as the parent table and it has 2 child tables – STUDENT and
SUBJECT.
 In short we can say a physical data model has
o Tables and its specifications – table names and their columns. Columns are
represented along with their datatypes and size. In addition primary key of each
table is shown at the top of the column list.
o Foreign keys are used to represent the relationship between the tables. Mapping
between the tables are represented using arrows between them.
o Physical data model can have demoralized structure based on the user requirement.
The tables might not be in normalized forms.
 Physical data model is dependent on the RDBMS i.e.; it varies based on the RDBMS used.
This means data type notation varies depending on the RDBMS. For example, we have
different datatypes in SQL server and oracle server. In addition, the representation of
physical data model diagram may be different, though it contains same information as
described above – some may represent primary key and foreign keys separately at the end
of the column list. This data model depends on the user / designer how he specifies the
diagram and the RDBMS servers. Below diagram shows different ways of representing a
table.
 Hence object based data model is based on the real requirement from the user, whereas
record based data model is based on the actual relationships and data in DB. The Physical
data model is based on the table structure in the DB.

Record based Data Models


 These data models are based on application and user levels of data. They are modeled
considering the logical structure of the objects in the database. This data models defines the
actual relationship between the data in the entities.
 For example, employee and department entities are related to each other by means of
department. This minute level of relationship is defined in the record based data models. We
will not be able get the mapping at this level in the object based data models. There it simply
defines two entities are related. But the actual relationship between any two entities can be
observed in record based data models.
 There are 3 types of record based data models defined so far- Hierarchical, Network and
Relational data models.
 Most widely used record based data model is relational data model. Other two are not widely
used.

 Hierarchical Data Models


 Imagine we have to create a database for a company. What are the entities involved in it?
Company, its department, its supplier, its employees, different projects of the company etc are
the different entities we need to take care of. If we observe each of the entity they have parent
–child relationship. We can design them like we do ancestral hierarchy. In our case, Company
is the parent and rests of them are its children. Department has employees and project as its
children and so on. This type of data modeling is called hierarchical data model.
 In this data model, the entities are represented in a hierarchical fashion. Here we identify a
parent entity, and its child entity. Again we drill down to identify next level of child entity and so
on. This model can be imagined as folders inside a folder!
 In our example above, it is diagrammatically represented as below:

 It can also be imagined as root like structure. This model will have only one main root. It then
branches into sub-roots, each of which will branch again. This type of relationship is best
defined for 1:N type of relationships. E.g.; One company has multiple departments (1:N), one
company has multiple suppliers (1:N),one department has multiple employees (1:N), each
department has multiple projects(1:N) . If we have M:N relationships, then we have to duplicate
the entities and show it in the diagram. For example, if a project in the company involves
multiple departments, then our hierarchical representation changes as below:
 Advantages
 It helps to address the issues of flat file data storage. In flat files, data will be scattered and
there will not be any proper structuring of the data. This model groups the related data into
tables and defines the relationship between the tables, which is not addressed in flat files.
 Disadvantages
 Redundancy: - When data is stored in a flat file, there might be repetition of same data multiple
times and any changes required for the data will need to change in all the places in the flat file.
Missing to update at any one place will cause incorrect data. This kind redundancy is solved by
hierarchical model to some extent. Since records are grouped under related table, it solves the
flat file redundancy issue. But look at the many to many relationship examples given above. In
such case, we have to store same project information for more than one department. This is
duplication of data and hence a redundancy. So, this model does not reduce the redundancy
issue to a significant level.
 As we have seen above, it fails to handle many to many relationships efficiently. It results in
redundancy and confusion. It can handle only parent-child kind of relationship.
 If we need to fetch any data in this model, we have to start from the root of the model and
traverse through its child till we get the result. In order to perform the traversing, either we
should know well in advance the layout of model or we should be very good programmer.
Hence fetching through this model becomes bit difficult.
 Imagine company has got some new project details, but it did not assign it to any department
yet. In this case, we cannot store project information in the PROJECT table, till company
assigns it to some department. That means, in order to enter any child information, its parent
information should be already known / entered.

 Network Data Models


 This is the enhanced version of hierarchical data model. It is designed to address the
drawbacks of the hierarchical model. It helps to address M:N relationship. This data model is
also represented as hierarchical, but this model will not have single parent concept. Any child in
the tree can have multiple parents here.
 Let us revisit our company example. A company has different projects and departments in the
company own those projects. Even suppliers of the company give input for the project. Here
Project has multiple parents and each department and supplier have multiple projects. This is
represented as shown below. Basically, it forms a network like structure between the entities,
hence the name.
 Advantages
 Accessing the records in the tables is easy since it addresses many to many relationships.
Because of this kind of relationship, any records can be easily pulled by using any tables. For
example, if we want to know the department of project X and if we know SUPPLIER table, we
can pull this information. i.e.; SUPPLIER has the information about project X which includes the
departments involved in the projects too. Hence makes the accessibility to any data easier, and
even any complex data can be retrieved easily and quickly.
 Because of the same feature above, one can easily navigate among the tables and get any data.
 It is designed based on database standards – ANSI/SP ARC.
 Disadvantages
 If there is any requirement for the changes to the entities, it requires entire changes to the
database. There is no independence between any objects. Hence any changes to the any of the
object will need changes to the whole model. Hence difficult to manage.
 It would be little difficult to design the relationship between the entities, since all the entities are
related in some way. It requires thorough practice and knowledge about the designing.

 Relational Data Models


 This model is designed to overcome the drawbacks of hierarchical and network models. It is
designed completely different from those two models. Those models define how they are
structured in the database physically and how they are inter-related. But in the relational model,
we are least bothered about how they are structured. It purely based on how the records in
each table are related. It purely isolates physical structure from the logical structure. Logical
structure is defines records are grouped and distributed.
 Let us try to understand it by an example. Let us consider department and employee from our
previous examples above. In this model we look at employee with its data. When we say an
employee what all comes into our mind? His employee id, name, address, age, salary,
department that he is working etc. are attributes of employee. That means these details about
the employee forms columns in employee table and value set of each employee for these
attribute forms a row/record for an employee. Similarly, department has its id, name.
 Now, in the employee table, we have column which uniquely identifies each employee – that is
employee Id column. This column has unique value and we are able to differentiate each
employee from each other by using this column. Such column is called as primary key of the
table. Similarly department table has DEPT_ID as primary key. In the employee table, instead
of storing whole information about his department, we have DEPT_ID from department table
stored. i.e.; by using the data from the department table, we have established the relation
between employee and department tables.

 Observe the table structures above. They are very simple to understand. There is no redundant
data as well. It addressed major drawback of earlier data models. This type of data model is
called relational data model.
 This model is based on the mathematical concepts of set theory. It considers the tables as a
two dimensional table with rows and columns. It is least bothered about the physical storage of
structure and data in the memory. It considers only the data and how it can be represented in
the form of rows and columns, and the way it can establish the relation between other tables.
 A relational data model revolves around 5 important rules.
o Order of rows / records in the table is not important. For example, displaying the records
for Joseph is independent of displaying the records for Rose or Mathew in Employee
table. It does not change the meaning or level of them. Each record in the table is
independent of other. Similarly, order of columns in the table is not important. That
means, the value in each column for a record is independent of other. For example,
representing DEPT_ID at the end or at the beginning in the employee table does not
have any affect.
o Each record in the table is unique. That is there is no duplicate record exists in the table.
This is achieved by the use of primary key or unique constraint.
o Each column/attribute will have single value in a row. For example, in Department table,
DEPT_NAME column cannot have ‘Accounting’ and ‘Quality’ together in a single cell.
Both has to be in two different rows as shown above.
o All attributes should be from same domain. That means each column should have
meaningful value. For example, Age column cannot have dates in it. It should contain
only valid numbers to represent individual’s age. Similarly, name columns should have
valid names, Date columns should have proper dates.
o Table names in the database should be unique. In the database, same schema cannot
contain two or more tables with same name. But two tables with different names can
have same column names. But same column name is not allowed in the same table.
 Examine below table structure for Employee, Department and Project and see if it satifies
relational data model rules.

 Advantages
 Structural independence:- Any changes to the database structure, does not the way we are
accessing the data. For example, Age is added to Employee table. But it does not change the
relationship between the other tables nor changes the existing data. Hence it provides the total
independence from its structure.
 Simplicity:- This model is designed based on the logical data. It does not consider how data are
stored physically in the memory. Hence when the designer designs the database, he
concentrates on how he sees the data. This reduces the burden on the designer.
 Because of simplicity and data independence, this kind of data model is easy to maintain and
access.
 This model supports structured query language – SQL. Hence it helps the user to retrieve and
modify the data in the database. By the use of SQL, user can get any specific information from
the database.
 Disadvantages
Compared to the advantages above, the disadvantages of this model can be ignored.
 High hardware cost:- In order to separate the physical data information from the logical data,
more powerful system hardwares – memory is required. This makes the cost of database high.
 Sometimes, design will be designed till the minute level, which will lead to complexity in the
database.
Comparison among the three Record Based Data Models

Hierarchical Data Model Network Data Models Relational Data Models


Supports One-Many Supports both one to many Supports both one to many
Relationship and Many to Many and Many to Many
relationship relationship
Because of single parent- It establishes the relationship It provides SQL, which
child relationship, difficult to between most of the objects, makes the access to the
navigate through the child hence easy to access data simpler and quicker.
compared to hierarchical
model
Flexibility among the Because of the mapping Primary and foreign key
different object is restricted among the sub level tables, constraint makes the
to the child. flexibility is more flexibility much simpler than
other models.
Based on the physical Based on the physical Based on the logical data
storage details storage details view

Potrebbero piacerti anche