Documenti di Didattica
Documenti di Professioni
Documenti di Cultura
Group:- COE-8
DBMS ASSIGNMENT
Q1:-Explain the concept of distributed database with proper diagram and explanation
and what the advantage of distributed database over DBMS is.
Ans:- A distributed database is a database in which storage devices are not all attached to a
common processing unit such as the CPU, controlled by a distributed database management
system (together sometimes called a distributed database system). It may be stored in
multiple computers, located in the same physical location; or may be dispersed over
a network of interconnected computers. Unlike parallel systems, in which the processors are
tightly coupled and constitute a single database system, a distributed database system consists
of loosely coupled sites that share no physical components.
Advantages of Distributed Database System
1. Increased reliability and availability A distributed database system is robust to failure to
some extent. Hence, it is reliable when compared to a Centralized database system.
2. Local control The data is distributed in such a way that every portion of it is local to
some sites (servers). The site in which the portion of data is stored is the owner of the data.
3. Modular growth (resilient) Growth is easier. We do not need to interrupt any of the
functioning sites to introduce (add) a new site. Hence, the expansion of the whole system is
easier. Removal of site is also does not cause much problems.
4. Lower communication costs (More Economical) Data are distributed in such a way that
they are available near to the location where they are needed more. This reduces the
communication cost much more compared to a centralized system.
5. Faster response Most of the data are local and in close proximity to where they are
needed. Hence, the requests can be answered quickly compared to a centralized system.
6. Reflects the organizational structure Normally, database is fragmented into various
locations wherever we have controls.
7. Secured management of distributed data Various transparencies like network
transparency, fragmentation transparency, and replication transparency are implemented to
hide the actual implementation details of the whole distributed system. In such way,
Distributed database provides security for data.
8. Robust The system is continued to work in case of failures. For example, replicated
distributed database performs in spite of failure of other sites.
9. Complied with ACID properties Distributed transactions demands Atomicity,
Consistency, Isolation, and Reliability.
10. Improved performance and Parallelism in executing transactions can be achieved
Q2:- Discuss the concept of data mining and data warehousing and its applications with
proper explanation.
Ans:- Data Mining:- Data mining (the analysis step of the "knowledge discovery in
databases" process, or KDD), an interdisciplinary subfield of computer science is the
computational process of discovering patterns in large data sets ("big data") involving
methods at the intersection of artificial intelligence, machine learning, statistics, and database
systems. The overall goal of the data mining process is to extract information from a data set
and transform it into an understandable structure for further use. Aside from the raw analysis
step,
it
involves
database
and data
management aspects, data
preprocessing, model and inference considerations,
interestingness
metrics, complexity considerations, post-processing of discovered structures, visualization,
and online updating.
Data Warehousing :- In computing, a data warehouse (DW or DWH), also known as
an enterprise data warehouse (EDW), is a system used for reporting and data analysis. DWs
are central repositories of integrated data from one or more disparate sources. They store
current and historical data and are used for creating analytical reports for knowledge workers
throughout the enterprise. Examples of reports could range from annual and quarterly
comparisons and trends to detailed daily sales analyses.
Applications of Data Mining and Data Warehousing
Retail Industry
Telecommunication Industry
Intrusion Detection
The applications discussed above tend to handle relatively small and homogeneous data sets
for which the statistical techniques are appropriate. Huge amount of data have been collected
from scientific domains such as geosciences, astronomy, etc. A large amount of data sets is
being generated because of the fast numerical simulations in various fields such as climate
and ecosystem modelling, chemical engineering, fluid dynamics, etc. Following are the
applications of data mining in the field of Scientific Applications
Graph-based mining.
Intrusion Detection
Intrusion refers to any kind of action that threatens integrity, confidentiality, or the
availability of network resources. In this world of connectivity, security has become the
major issue. With increased usage of internet and availability of the tools and tricks for
intruding and attacking network prompted intrusion detection to become a critical
component of network administration. Here is the list of areas in which data mining
technology may be applied for intrusion detection
Development of data mining algorithm for intrusion detection.
Association and correlation analysis, aggregation to help select and build
discriminating attributes.
Analysis of Stream data.
Distributed data mining.
Visualization and query tools.