Sei sulla pagina 1di 20

Data Mining Strategy

for
Database Buffer Management




Presented By:
Ajay Kumar Priyadarshi
B.Tech Final Yr,CSE 0604210004
Madan Mohan Malviya Engineering College
Gorakhpur (U.P.)


Introduction to Buffer management
Buffer management Approaches
Clock and G-Clock.
LRD.
FBRD.
LRU-K
2Q.
Proposal for DATA MINING
Conclusion
References



Efficient main memory allocation and replacement for database queries.

Reduce disk operation,enhance system throughput by utlilising dedicated
buffer pool for caching databases pages.

DBMS uses main-memory as a buffer to reduce disk access.

Buffer is divided into frames of same size and each frame contains the page
of secondary storage file.

When page request occurs that is not in the buffer pool,the buffer pool
manager selects the appropriate frame from list of frames.If frame is not free
than Buffer Manager determines the page currently in buffer pool should be
replaced.
Least Recently Used(LRU).

Clock And G-Clock Strategy.

Least Reference Density Strategy.

Frequency based Replacement Strategy.

LRU-K Strategy.

2Q Strategy.
Least Recently Used -When a new frame Buffer is needed,the page
in the buffer that has not been accessed for the longest time is replaced.

Clock and G-Clock Strategy -

Clock Strategy
Adds a Use-bit to every buffer page.
Use-bit indicates whether page referenced or not.
0-bit page is victim for replacement and reset to 1.
Encounting 1-bit page is reset to 0 and causes move
to next page.
Contd
G-Clock Strategy
Replaces Use-bit of a buffer page Pi with a
reference counter RCi(initialized to a weight equal to
1 or larger than 1).
References to Pi increments the corresponding RCi.
When a Buffer Fault occurs Victim page Pi is found by
decrementing RCi by 1 until 0 value is found


Similar to G-Clock,except that it relates actual no. of Pi
referenced in RCi to Page Age(total no. of elasped
references to all pages since Pis first reference) .
Least Reference Density Strategy -
Contd
Least Reference Density Strategy --
Age Of Page = GRC FC(i). Where GRC is the total
no.of references and FC(i) is the reference at its first
references.
RD(i)=RC(i)/GRC(i)/FC(i),where RD(i) is the
reference density of a page.
LRD Assumes equidistant arrival of page references.
LRD algorithm always chooses the buffer page with
lowest RD(i).

Problem : High Reference Density at the beginning of interval keeps
pages in the buffer much longer the desired.

Contd
Frequency Based Replacement Strategy -
Similar to LRD,except that it considers factoring out
locality from Reference Counts(RCis) through
whole page age.
FBR frame contains three refinements :
New Section : controls increments of reference counts.
Old Section : confine the replacement choices that have not been
too recently referenced.
Adjust section: periodically adjusting reference counts so as too
approximate references frequencies.
FBR sets portion of frame stack as new section
When page hits,its RCis not increment,,otherwise
incremented
Contd
It sets portion of bottom frame stack as old section.
All replacement candidates come from this
section.

The remaining part sets as middle section.
Allows frequently occurred pages to build up their
references counts.

Problem : count overflow.
Certain pages build up their high reference counts and
never being replaced.

Contd
LRU-K Strategy -
Problem with LRU is that it gaurantees a long buffer
life,even if the page will never been refrenced again.
LRU-K solves this problem by considering access
history for each page.
Keep tracks of time of last K-refreances to all pages
in presence of all accessing patterns.
When a buffer frame is needed for a new page being
read,the page P to be dropped is one with Backward
K-distance bt(p,k) is maximum for all pages in buffer.

Strength : LRU-K has significant cost/performance advantages over
conventional algorithms like LRU,sometimes as high
as 40%.

Contd
2Q Strategy -
Based on LRU-K,except ,It test pages for their last
second refrence time.
2Q divides buffer B into three queues :
Am Queue : LRU queue, It keeps frequently accessed pages.
Alin Queue :FIFO queue,It retains newly correlated reference pages.
Alout Queue :FIFO queue,used to detect pages that have high long
term access rates.

On the first reference to a page, 2Q places it into Alin.
If a page in Alin is accessed, it isnt promoted to Am queue because the
second access may simply be correlated refrence.


Contd
When a page fault occurs, if the length of Alin exceeds a certain threshold,
the older first accessed page is thrown out of Alin but remembered in Alout.

At this time, if Alout also exceeds its length threshold, then discard the
oldest page from alout.

If Alin queue has not yet achieved its predetermined length, then replace
the least recently used page in Am queue.
Proposal for Data Mining for Buffer
Management
The proposed approach discovers knowledge from database access
sequences and uses it to guide buffer management .

Figure shows the components
Page
Request


Access Information
Collector
Monitor & Controller
Access Knowledge
Miner
Buffer Manager
Database
buffer
Access
Knowledge
database
Access
Information
Database
Contd
Buffer manager : Manages all operations on the buffer.

Monitor & Controller :Controls mining time and frequency by monitoring
pre-specified performance metrics of buffer management.

Information Collector :collects and stores page requests of each

transaction in an access information database.

Access Knowledge Miner :mines access knowledge from the access
Information database off-line.


On performing a series of Experiments to assess the performance.

Contd.









From the test results presented in table1, we found that the proposed data
mining based approach can achieve much higher buffer hit ratio than LRU.

Utilizing knowledge discovered from page access history is important for
managing databse buffer.
Buffer size 2.5% 5.0% 7.5% 10.0% 2.5%
DMBMA BUFFER
HIT RATIO
11.34% 16.9% 20.65% 22.04% 22.62%
LRU BUFFER HIT
RATIO
3.49% 7.24% 11.09% 14.43% 17.48%
Summary
LRU-K:
Performs better then other approaches by access k page
references instead of using 1 page reference.
vulnerable to coordinator failure at certain times causes blocking.
Data mining based buffer managemant approach:
Applying knowledge discovered from database access history
to database buffer management.
Achieve better System performance.

References
C.M. Chen and N. Roussopoulos. Adaptive database buffer allocation
using query feedback.In Proc. of the 19th Conference on Very Large
Data Bases, pages 342-353, Dublin, Ireland,August 1993.
H. Chen, J.X. Yu, K. Yamaguchi, H. Kitagawa,N. Ohbo and Y.
Fujiwara. Lru-s: A new buffer allocation approach for oodbms. In
Proc. of the1990 ACM SIGMOD International Conference on
management of data.
J. Rodriguez-Rosell. Empirical data reference behavior in data base
systems. IEEE Computer, Volume 9, Number 11, November 1976.
M. Stonebraker. Operating system support for database
management. Communications of the ACM, Volume 24, Number 7,
pages 412-417, July 1981.
References
T. Johnson and D. Shasha. 2Q: A low overhead high performance buffer
management replacement algorithm. In Proc. of the 20th Conference on
Very Large Data Bases, pages 439-450,Santiago, Chile, September 1994.
W. Effelsberg and T. Haerder. Principles of database buffer
management. AGM Transaction on Database Systems, Volume 9,
Number 4, pages 560-595, December 1984.
L. Feng, H. Lu, Y.C. Tay and K.H. Tung. Buffer management in
distributed database systems:A data mining based approach. In Proc.
Of the 6th International Conference on Extending Database Technology,
Valencia, Spain, March 1998.
A. Dan, P.S. Yu and J-Y. Chung. Characterization of database access
pattern for analytic prediction of buffer hit probability. VLDB Journal,
Volume 4, Number 1, pages 127-155, January 1995.
Queries.

Potrebbero piacerti anche