
ADVANTAGES AND DISADVANTAGES OF REPLICATION

INTRODUCTION

Data replication is the process of storing data at more than one site or node. It improves
the availability of data.

According to the Oracle documentation, replication is the process of copying (duplicating)
and maintaining database objects in multiple databases that make up a distributed database
system.

ADVANTAGES OF DATA REPLICATION

Replication has the following advantages:

Availability

If one of the sites containing relation R fails, R can still be obtained from another site.
Queries involving R can therefore continue to be processed in spite of the failure of one
site.
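
The failover behaviour described above can be sketched in a few lines. This is only an
illustration under stated assumptions: the Site class, the SiteDown exception and the
read_with_failover function are hypothetical names, and a real system would detect failures
through network timeouts rather than a flag.

```python
class SiteDown(Exception):
    """Raised when a site cannot be reached (hypothetical error type)."""


class Site:
    def __init__(self, name, relation, up=True):
        self.name = name
        self.relation = relation  # this site's copy of relation R (list of rows)
        self.up = up              # stand-in for real failure detection

    def read(self):
        if not self.up:
            raise SiteDown(self.name)
        return self.relation


def read_with_failover(sites):
    """Return (site name, rows) from the first reachable replica of R."""
    for site in sites:
        try:
            return site.name, site.read()
        except SiteDown:
            continue  # this replica is down, try the next one
    raise SiteDown("no replica of R is reachable")


rows = [("acct-1", 100), ("acct-2", 250)]
sites = [Site("A", rows, up=False), Site("B", rows), Site("C", rows)]
served_by, result = read_with_failover(sites)
```

Here site A has failed, so the query is answered by site B from its own copy of R.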

Increased parallelism

The sites containing relation R can process queries involving R in parallel. This leads to
faster query execution.
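
One simple way to exploit this parallelism is to spread read queries over the replicas in
round-robin order. The sketch below assumes every site holds a full copy of R and that all
queries cost about the same; the function name and site names are made up for illustration.

```python
from collections import Counter
from itertools import cycle


def assign_round_robin(queries, sites):
    """Pair each query with a site, visiting the sites in a repeating cycle."""
    site_cycle = cycle(sites)
    return [(query, next(site_cycle)) for query in queries]


queries = [f"q{i}" for i in range(6)]
assignment = assign_round_robin(queries, ["A", "B", "C"])

# Number of queries each site has to process.
load = Counter(site for _, site in assignment)
```

With six queries and three replicas, each site processes two queries instead of one site
processing all six.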

Increased reliability and availability

Many copies of the same data are kept in several different locations (usually different
geographical locations). Hence, the failure of one site (server) need not interrupt
transactions, as long as another copy remains reachable.

Faster queries on replicated data (especially read queries)

A distributed database makes data available where it is needed most. Replication goes one
step further: the complete table is stored locally, so queries on it can be answered quickly
at the site where they are initiated.

Less communication overhead

When a site generates many read queries, all of them can be answered locally. Only queries
that involve a table not stored locally, or that write data, need to use the communication
links to contact other sites.
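
The rule in this paragraph can be written as a small predicate. The sketch is an assumption
about how one might classify a workload, not a description of any particular product; the
table names and the needs_network function are hypothetical.

```python
def needs_network(op, table, local_tables):
    """Return True if the operation must contact another site."""
    if op == "write":
        return True  # a write has to reach the other replicas
    # A read is remote only when this site holds no copy of the table.
    return table not in local_tables


local_tables = {"customers", "orders"}
workload = [
    ("read", "customers"),
    ("read", "orders"),
    ("write", "orders"),
    ("read", "invoices"),
]

# Operations that cannot be answered purely from local copies.
remote = [q for q in workload if needs_network(*q, local_tables)]
```

Only the write and the read of the non-local table end up in remote; the other two reads
never leave the site.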

Less Data Movement over Network

The more replicas of a relation there are, the greater the chance that the required data is
found at the site where the transaction is executing. Hence, data replication reduces the
movement of data among sites and increases the speed of processing.

Other benefits include:

Replication offers various benefits depending on the type of replication and the
options one chooses, but the common benefit of replication is the availability of
data when and where it is needed.

Allowing multiple sites to keep copies of the same data. This is useful when
multiple sites need to read the same data or need separate servers for reporting
applications.

Separating OLTP applications from read-intensive applications such as online
analytical processing (OLAP) databases, data marts, or data warehouses.

Allowing greater autonomy. Users can work with copies of data while
disconnected and then propagate changes they make to other databases when they
are connected.

Scaling out data that is browsed, for example through Web-based applications.

Increasing aggregate read performance.

Bringing data closer to individuals or groups. This helps to reduce conflicts caused
by multiple users modifying and querying the same data, because data can be distributed
throughout the network and partitioned according to the needs of different
business units or users.

Using replication as part of a customized standby server strategy. Replication is
one choice for a standby server strategy. Other choices in SQL Server 2000 include
log shipping and failover clustering, which provide copies of data in case of server
failure.

Providing a backup for disaster recovery. If the primary site is hit by a natural
disaster, power outage, fire, etc., the replicated database at a secondary location
can be used to limit system downtime.

DISADVANTAGES OF DATA REPLICATION

Replication has the following disadvantages:

Requires more disk space

Storing replicas of the same data at different sites consumes more disk space.

More storage space is needed than in a centralized system

Replication means duplicating tables and storing them at every site. This requires
more space at every site.
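
The storage cost can be made concrete with a little arithmetic. The table sizes and the
number of sites below are invented purely for illustration, and storage_needed is a
hypothetical helper.

```python
# Hypothetical table sizes in gigabytes.
table_sizes_gb = {"customers": 2.0, "orders": 10.0, "invoices": 4.0}


def storage_needed(table_sizes, copies_per_table):
    """Total space used when each table is stored the given number of times."""
    return sum(size * copies_per_table[name] for name, size in table_sizes.items())


number_of_sites = 3  # assumed for the example

# Centralized system: exactly one copy of every table.
centralized_gb = storage_needed(table_sizes_gb, {t: 1 for t in table_sizes_gb})

# Every table duplicated at every site.
replicated_gb = storage_needed(
    table_sizes_gb, {t: number_of_sites for t in table_sizes_gb}
)
```

Under these assumptions the centralized system needs 16 GB in total, while storing every
table at all three sites needs 48 GB.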

Increased overhead on update

When an update is required, the database system must ensure that all replicas are
updated. The more copies of the same data there are at different sites, the more replicas
must be updated whenever the data changes. Hence, write operations are costly.
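
The asymmetry between reads and writes can be illustrated with a simple write-all,
read-one scheme. This is a sketch under the assumption that every update is applied to every
replica immediately; the Replica class and the two functions are hypothetical.

```python
class Replica:
    def __init__(self):
        self.data = {}
        self.writes_applied = 0  # physical writes performed at this replica

    def apply(self, key, value):
        self.data[key] = value
        self.writes_applied += 1


def write_all(replicas, key, value):
    """Apply one logical update to every replica; return the physical write count."""
    for replica in replicas:
        replica.apply(key, value)
    return len(replicas)


def read_one(replicas, key):
    """A read needs only a single replica."""
    return replicas[0].data.get(key)


replicas = [Replica() for _ in range(4)]
physical_writes = write_all(replicas, "balance:42", 900)
value = read_one(replicas, "balance:42")
```

One logical update turns into four physical writes here, while the read touches a single
copy.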

Maintaining data integrity is complex

Keeping the database consistent across all replicas involves complex procedures.

Expensive

Concurrency control and recovery techniques will be more advanced and hence more
expensive. In general, replication enhances the performance of read operations and increases
the availability of data to read-only transactions. However, update transactions incur greater
overhead. Controlling concurrent updates by several transactions to replicated data is more
complex than using the centralized approach to concurrency control.

We can simplify the management of replicas of relation r by choosing one of them as
the primary copy of r. For example, in a banking system, an account can be associated with
the site at which the account was opened. Similarly, in an airline-reservation system, a
flight can be associated with the site at which the flight originates. We can further break
multitasking into process-based and thread-based.
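
The primary-copy idea can be sketched as follows. This is a minimal illustration, assuming
that every update is ordered by the primary site and then copied to the other replicas; the
ReplicatedItem class and the branch names are hypothetical.

```python
class ReplicatedItem:
    def __init__(self, primary_site, sites, value):
        if primary_site not in sites:
            raise ValueError("the primary site must hold a copy")
        self.primary_site = primary_site
        self.copies = {site: value for site in sites}
        self.version = 0  # update counter maintained by the primary copy

    def update(self, new_value):
        """Apply the update at the primary copy first, then propagate it."""
        self.version += 1
        self.copies[self.primary_site] = new_value
        for site in self.copies:
            if site != self.primary_site:
                self.copies[site] = new_value
        return self.version


# An account whose primary copy lives at the branch where it was opened.
account = ReplicatedItem("branch_A", ["branch_A", "branch_B", "branch_C"], 500)
account.update(450)
```

Because a single site decides the order of updates for each item, concurrent updates do not
have to be coordinated among all replicas at once.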

PROCESS

If the two sites update completely different tables, then something like Dropbox might
work. However, Dropbox does not synchronize or merge the contents of files: if both site A
and site B update the same file, one is responsible for writing the code to merge the
changes. Advantage Database Server has native support for replication, so that would likely
be the simplest solution. Advantage replication is performed on a record-by-record basis and
is handled asynchronously. If the target database cannot be reached, the updates are stored
in a queue and processed periodically.
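
The queueing behaviour can be pictured with a small store-and-forward sketch. It is not
Advantage's implementation; AsyncReplicator, its reachable flag and the dictionary standing
in for the target database are all assumptions made for illustration.

```python
from collections import deque


class AsyncReplicator:
    def __init__(self, target):
        self.target = target    # dict standing in for the target database
        self.reachable = True   # stand-in for a real connectivity check
        self.queue = deque()    # pending (key, record) updates, oldest first

    def replicate(self, key, record):
        """Queue one record-level update and try to deliver it."""
        self.queue.append((key, record))
        self.flush()

    def flush(self):
        """Deliver queued updates in order; return how many are still pending."""
        while self.queue and self.reachable:
            key, record = self.queue.popleft()
            self.target[key] = record
        return len(self.queue)


target_db = {}
replicator = AsyncReplicator(target_db)

replicator.reachable = False            # the target site is unreachable
replicator.replicate(1, {"name": "Ann"})
replicator.replicate(2, {"name": "Bob"})
pending_while_down = len(replicator.queue)

replicator.reachable = True             # the connection is back
pending_after_flush = replicator.flush()
```

While the target is down both updates wait in the queue; a later call to flush, which a real
system would run periodically, delivers them in their original order.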

If the connection between the two sites is constantly available, the lag between the
source update and the replicated update is typically small, although it depends on the
network bandwidth and latency. One could use a VPN for the connection between the two
sites, but it is not required. Without a VPN, however, one should make sure that the
communication between the two sites is encrypted (this is an option when setting up the
subscriptions). For the communication itself, all one needs is "normal" network
connectivity. The primary issue is dealing with things like firewalls and NAT.

With Advantage, one defines which port it uses. With a TCP/IP connection, one needs to
make sure that the configured port allows inbound connections to the ads.exe process. One
can use UDP as well, but when firewalls are involved, TCP is probably simpler. Duplicate
keys are a real concern. If both sites add a record with the same primary key, or update
the same record concurrently, the result is a conflict. There is an option to simply ignore
conflicts, in which case the last update wins. More realistically, one would want to write
an ON CONFLICT trigger to handle the conflicts.
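
The two policies mentioned here, "last update wins" and a custom conflict handler, can be
contrasted in a short sketch. It uses timestamps to decide which update is last, which is an
assumption made for illustration and not a description of how Advantage or its ON CONFLICT
triggers work; all names below are hypothetical.

```python
def resolve(local, incoming, on_conflict=None):
    """Pick the record version to keep when two sites changed the same record.

    Both versions are dicts with a 'value' and an 'updated_at' timestamp.
    """
    if on_conflict is not None:
        return on_conflict(local, incoming)  # custom handling
    # Default policy: the most recent update wins.
    return incoming if incoming["updated_at"] >= local["updated_at"] else local


conflict_log = []


def keep_local_and_log(local, incoming):
    """Example handler: keep the local version and record the conflict for review."""
    conflict_log.append((local, incoming))
    return local


local = {"value": "addr-old", "updated_at": 10}
incoming = {"value": "addr-new", "updated_at": 12}

winner = resolve(local, incoming)
kept = resolve(local, incoming, on_conflict=keep_local_and_log)
```

With the default policy the newer incoming version silently replaces the local one; with the
handler the local version is kept and the conflict is logged, so no update is lost without
anyone noticing.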

CONCLUSION

Data replication is the process in which a relation (a table) or a portion of a relation (a
fragment of a table) is duplicated and the copies are stored at multiple sites (servers) to
increase the availability of data. There can be full replication, in which a copy of the
whole database is stored at every site. There can also be partial replication, in which some
fragments of the database (typically important, frequently used ones) are replicated and
others are not. Replication has a number of advantages and disadvantages.

