Rdbms (Unit 3)

Unit 3: Database System Architectures
Centralized Systems
■ Run on a single computer system and do not interact with other
computer systems.
■ General-purpose computer system: one to a few CPUs and a number
of device controllers that are connected through a common bus that
provides access to shared memory.
■ Single-user system (e.g., personal computer or workstation): desktop
unit, single user, usually has only one CPU and one or two hard disks;
the OS may support only one user.
■ Multi-user system: more disks, more memory, multiple CPUs, and a
multi-user OS. Serves a large number of users who are connected to
the system via terminals.
A Centralized Computer System
Centralized Systems
■ Coarse-granularity parallelism
● Computer with only a few processors, all sharing the main memory.
● Databases running on such machines do not partition a single query
among the processors.
● Each query runs on a single processor.
■ Fine-granularity parallelism
● Computer with a large number of processors.
● Databases running on such machines attempt to parallelize single tasks.
● A single query is partitioned among the processors.
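The difference between running a query on one processor and partitioning it among several can be sketched as follows. This is only an illustration: the table, the predicate, and the function names are hypothetical, and Python threads are used to show the structure of the partitioning rather than to demonstrate a real performance gain.

```python
from concurrent.futures import ThreadPoolExecutor

def count_matching(rows, predicate):
    """Evaluate the whole query on a single worker (coarse-granularity style)."""
    return sum(1 for row in rows if predicate(row))

def count_matching_partitioned(rows, predicate, workers):
    """Partition one query among several workers and combine the partial counts."""
    # Round-robin partitioning: worker i gets rows i, i + workers, i + 2 * workers, ...
    partitions = [rows[i::workers] for i in range(workers)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial_counts = pool.map(
            lambda part: count_matching(part, predicate), partitions
        )
        return sum(partial_counts)

rows = list(range(1000))               # stand-in for a table of 1000 tuples
is_even = lambda value: value % 2 == 0

single = count_matching(rows, is_even)
split = count_matching_partitioned(rows, is_even, workers=4)
print(single, split)
```

Both calls return the same answer; only the way the work is divided differs.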
Client-Server Systems
■ Server systems satisfy requests generated at m client systems, whose general
structure is shown below:
Client-Server Systems (Cont.)
■ Database functionality can be divided into:
● Back end: manages access structures, query evaluation and
optimization, concurrency control, and recovery.
● Front end: consists of tools such as forms, report writers, and
graphical user interface facilities.
■ The interface between the front end and the back end is through SQL or
through an application program interface.
Server System Architecture
■ Server systems can be broadly categorized into two kinds:
● Transaction servers, which are widely used in relational
database systems.
● Data servers, used in object-oriented database systems.
Transaction Servers
■ Also called query server systems or SQL server systems
● Clients send requests to the server
● Transactions are executed at the server
● Results are shipped back to the client.
■ Open Database Connectivity (ODBC) is a C language application
program interface standard from Microsoft for connecting to a server,
sending SQL requests, and receiving results.
■ The JDBC standard is similar to ODBC, for Java.
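The request/execute/ship-back cycle of a transaction server can be sketched as below. This is a toy model under stated assumptions: the class `TransactionServer`, its `handle_request` method, and the `account` table are hypothetical, and Python's embedded `sqlite3` module stands in for the server's database engine only so that the example runs on its own. It is not ODBC or JDBC, and no network is involved.

```python
import sqlite3

class TransactionServer:
    """Toy stand-in for a transaction (SQL) server; not a real network server."""

    def __init__(self):
        self._db = sqlite3.connect(":memory:")
        self._db.execute(
            "CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER)"
        )
        self._db.executemany(
            "INSERT INTO account VALUES (?, ?)", [(1, 100), (2, 250)]
        )
        self._db.commit()

    def handle_request(self, sql, params=()):
        """Execute the client's SQL at the server and ship the result rows back."""
        with self._db:  # commits on success, rolls back if an exception is raised
            return self._db.execute(sql, params).fetchall()

server = TransactionServer()

# The "client" only sends SQL text with parameters and receives rows.
server.handle_request(
    "UPDATE account SET balance = balance - ? WHERE id = ?", (30, 1)
)
rows = server.handle_request("SELECT id, balance FROM account ORDER BY id")
print(rows)
```

The client never touches the stored data directly; all execution happens on the server side of `handle_request`.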
Transaction Server Process Structure
■ Client (user) processes: send user queries (transactions) to the server.
■ Server processes: receive user queries (transactions), execute them,
and send results back.
■ Lock manager process: provides lock manager functionality such as lock
grant, lock release, and deadlock detection.
■ Log writer process: outputs log records from the log record buffer to
stable storage.
■ Database writer process: outputs modified buffer blocks back to disk on
a continuous basis.
■ Checkpoint process: performs periodic checkpoints.
■ Process monitor process: monitors other processes and takes recovery
actions if any of the other processes fail.
Data Servers
■ Used in high-speed LANs, in cases where
● There is a high-speed connection between the clients and the server.
● Client machines are comparable in processing power to the server
machine.
■ Data are shipped to clients, where processing is performed, and the
data are then shipped back to the server.
■ This architecture requires full back-end functionality at the clients.
■ Used in many object-oriented database systems.
■ Issues:
● Page shipping versus item shipping
● Locking
● Data Caching
● Lock Caching
Data Servers (Cont.)
■ Page shipping versus item shipping
● Unit of communication for data may be coarse granularity (page)
or fine granularity (tuple or item).
● Item shipping ⇒ overhead of message passing is high.
● Page shipping ⇒ prefetching
○ Fetching items even before they are requested is called
prefetching.
○ All the items in the page are shipped when a process wants
to access a single item in the page.
■ Locking
● Locks are granted by the server for the data items that it ships to the
client machines.
● Disadvantage of page shipping:
○ A lock on a page implicitly locks all items contained in the page,
even if the client is not accessing some of those items.
○ Other client machines that require locks on those items may be
blocked unnecessarily.
● Techniques for lock de-escalation have been proposed, where the
server can request its clients to transfer back locks on prefetched
items.
■ Data Caching
● Data can be cached at the client even in between transactions.
● Even if a transaction finds cached data, it must make sure that those
data are up to date (cache coherency).
■ Lock Caching
● Locks can be retained by the client system even in between transactions.
● The server calls back locks from clients when it receives a conflicting lock
request. The client returns a lock once no local transaction is using it.
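The trade-off between page shipping and item shipping can be made concrete with a small count of messages and implicitly locked items. The page size, the item numbering, and all function names below are assumptions chosen for illustration; a real data server would be considerably more involved.

```python
PAGE_SIZE = 4  # items per page (hypothetical value)

def page_of(item_id):
    """Map an item number to the page that holds it."""
    return item_id // PAGE_SIZE

def messages_item_shipping(requested_items):
    """Item shipping: one request/reply round trip per distinct item."""
    return len(set(requested_items))

def messages_page_shipping(requested_items):
    """Page shipping: one round trip per distinct page; the rest is prefetched."""
    return len({page_of(item) for item in requested_items})

def implicitly_locked_items(requested_items):
    """Items covered by page-level locks that the client never asked for."""
    pages = {page_of(item) for item in requested_items}
    covered = {p * PAGE_SIZE + offset for p in pages for offset in range(PAGE_SIZE)}
    return sorted(covered - set(requested_items))

requested = [0, 1, 2, 5]
print(messages_item_shipping(requested))   # round trips with item shipping
print(messages_page_shipping(requested))   # round trips with page shipping
print(implicitly_locked_items(requested))  # candidates for lock de-escalation
```

Page shipping needs fewer round trips here, but it also locks items the client does not use, which is the situation that lock de-escalation is meant to address.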
Parallel Systems
■ Parallel systems improve processing and I/O speeds by using multiple
CPUs and disks in parallel.
■ Useful for applications that have to query extremely large databases or
that have to process an extremely large number of transactions per
second.
■ A coarse-grain parallel machine consists of a small number of
powerful processors.
■ A massively parallel or fine-grain parallel machine utilizes
thousands of smaller processors.
■ Two main measures of performance of a database system:
● Throughput: the number of tasks that can be completed in a
given time interval.
● Response time: the amount of time it takes to complete a single
task from the time it is submitted.
Speedup
■ Running a given task in less time by increasing the degree of
parallelism is called speedup.
■ Suppose,
● execution time of the task on the larger machine is TL
● execution time of the same task on the smaller machine is TS
■ A fixed-size problem executing on a small system is given to a
system which is N times larger.
● Measured by:
speedup = small system elapsed time / large system elapsed time = TS / TL
● Speedup is linear if the ratio equals N.
● Speedup is sublinear if the ratio is less than N.
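A short worked example of the speedup ratio, with made-up timings: the values of TS, TL, and N below are hypothetical and only illustrate how the ratio is compared with N.

```python
import math

def speedup(t_small, t_large):
    """Speedup = TS / TL for the same fixed-size task."""
    return t_small / t_large

def classify_speedup(t_small, t_large, n):
    """Compare the speedup with N, the factor by which the system grew."""
    ratio = speedup(t_small, t_large)
    if math.isclose(ratio, n):
        return "linear"
    if ratio < n:
        return "sublinear"
    return "superlinear"  # ratio above N; this case is not discussed in these notes

ts = 120.0  # hypothetical elapsed time on the small system, in seconds
n = 4       # the large system has 4 times the resources

print(speedup(ts, 30.0), classify_speedup(ts, 30.0, n))
print(speedup(ts, 40.0), classify_speedup(ts, 40.0, n))
```

With TL = 30 s the ratio equals N, so the speedup is linear; with TL = 40 s the ratio is 3, which is below N = 4 and therefore sublinear.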
Speedup
Speedup with increasing resources

Scaleup
■ Handling larger tasks by increasing the degree of parallelism is called
scaleup.
■ Relates to the ability to process larger tasks in the same amount of
time by providing more resources.
■ Increase the size of both the problem and the system
● An N-times larger system is used to perform an N-times larger task.
● Measured by:
scaleup = small system small problem elapsed time / big system big problem elapsed time = TS / TL
● Scaleup is linear if the ratio equals 1 (TS = TL).
● Scaleup is sublinear if the ratio is less than 1 (TL > TS).
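The scaleup ratio can be checked the same way. The timings below are hypothetical: TS is the elapsed time of the small problem on the small system, and TL is the elapsed time of the N-times larger problem on the N-times larger system.

```python
import math

def scaleup(t_small, t_large):
    """Scaleup = TS / TL, where both the problem and the system grew N times."""
    return t_small / t_large

def classify_scaleup(t_small, t_large):
    """Linear when TS equals TL, sublinear when the larger run takes longer."""
    ratio = scaleup(t_small, t_large)
    if math.isclose(ratio, 1.0):
        return "linear"
    if ratio < 1.0:
        return "sublinear"
    return "superlinear"  # TL below TS; this case is not discussed in these notes

ts_small_problem = 100.0  # hypothetical seconds, small problem on small system

print(scaleup(ts_small_problem, 100.0), classify_scaleup(ts_small_problem, 100.0))
print(scaleup(ts_small_problem, 125.0), classify_scaleup(ts_small_problem, 125.0))
```

When the larger run also takes 100 s the scaleup is 1 (linear); when it takes 125 s the scaleup drops to 0.8 (sublinear).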
Scaleup
Scaleup with increasing problem size and resources

Factors Limiting Speedup and Scaleup
Speedup and scaleup are often sublinear due to:
■ Startup costs: Cost of starting up multiple processes may dominate
computation time, if the degree of parallelism is high.
■ Interference: Processes accessing shared resources (e.g., system
bus, disks, or locks) compete with each other, thus spending time
waiting on other processes, rather than performing useful work.
■ Skew: By breaking down a single task into a number of parallel steps,
we reduce the size of the average step.
● It is often difficult to divide a task into exactly equal-sized parts.
The task finishes only when its slowest step finishes, so unequal
parts decrease the speedup.
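The effect of startup costs and skew on speedup can be illustrated with a deliberately simple cost model. The model is an assumption made for this sketch: processes are started one after another, each at a fixed cost, and the parallel steps then run at the same time, so the elapsed time is the total startup cost plus the time of the slowest step. Interference is not modeled, and all timings are made up.

```python
def parallel_elapsed(partition_times, startup_per_process=0.0):
    """Elapsed time = sequential startup costs + time of the slowest partition."""
    startup = startup_per_process * len(partition_times)
    return startup + max(partition_times)

serial_time = 100.0                    # hypothetical time on one processor
balanced = [25.0, 25.0, 25.0, 25.0]    # perfectly equal parts
skewed = [40.0, 20.0, 20.0, 20.0]      # same total work, unevenly divided

speedup_balanced = serial_time / parallel_elapsed(balanced)
speedup_skewed = serial_time / parallel_elapsed(skewed)
speedup_with_startup = serial_time / parallel_elapsed(balanced, startup_per_process=1.0)

print(speedup_balanced, speedup_skewed, speedup_with_startup)
```

With four processors, the balanced split reaches the linear speedup of 4, the skewed split only 2.5, and adding one second of startup per process pulls even the balanced case below 4.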
Interconnection Architectures
Parallel Database Architectures
Distributed Systems
■ Data spread over multiple machines (also referred to as sites or
nodes).
■ Computers in a distributed system communicate with one another
through various communication media, such as high-speed networks or
telephone lines.
Distributed Databases
■ Homogeneous distributed databases
● Same software/schema on all sites, data may be partitioned
among sites
■ Heterogeneous distributed databases
● Different software/schema on different sites
■ Differentiate between local and global transactions
● A local transaction accesses data in the single site at which the
transaction was initiated.
● A global transaction either accesses data in a site different from
the one at which the transaction was initiated or accesses data in
several different sites.
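The distinction between local and global transactions reduces to comparing the set of sites a transaction touches with the site where it started. The function name and the site labels below are hypothetical and serve only to illustrate the definition.

```python
def classify_transaction(origin_site, accessed_sites):
    """Return "local" if only the originating site is accessed, else "global"."""
    accessed = set(accessed_sites)
    if accessed <= {origin_site}:
        return "local"
    return "global"

print(classify_transaction("A", ["A"]))        # data only at the originating site
print(classify_transaction("A", ["B"]))        # data at a different site
print(classify_transaction("A", ["A", "B"]))   # data at several sites
```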
Tradeoffs in Distributed Systems
■ Sharing data – users at one site able to access the data residing at
some other sites.
■ Autonomy – each site is able to retain a degree of control over data
stored locally.
■ Higher system availability through redundancy – data can be
replicated at remote sites, and the system can function even if a site fails.
■ Disadvantage: added complexity required to ensure proper
coordination among sites.
● Software development cost.
● Greater potential for bugs.
● Increased processing overhead.
Network Types
■ Local-area networks (LANs) – composed of processors that are
distributed over small geographical areas, such as a single building or
a few adjacent buildings.
■ Wide-area networks (WANs) – composed of processors distributed
over a large geographical area.
■ WANs with continuous connection (e.g., the Internet) are needed for
implementing distributed database systems.
End of Unit
