
Distributed Systems

Assignment 1
Sharayu H. Fukey, Roll No. 152191012, M. Tech (S.E.) First
Year, V.J.T.I.
11/09/2015

Question 1 - Give various definitions of a distributed system.

Answer:
1. A distributed system is a collection of independent computers that
appears to its users as a single coherent system.
2. Distributed systems are also defined as autonomous computers
connected by a network.
3. A distributed system is software specifically designed to provide an
integrated computing facility.
4. Definition by George Coulouris: A distributed system is one in which
components located at networked computers communicate and
coordinate their actions only by passing messages.
5. A popular definition of a distributed system, by Leslie Lamport, is "You
know you have one when the crash of a computer you've never heard of
stops you from getting any work done."

Question 2 - Discuss the various goals of distributed systems.

Answer:

1. Making Resources Accessible

The main goal of a distributed system is to make it easy for the users (and
applications) to access remote resources, and to share them in a controlled and
efficient way. Resources can be just about anything, but typical examples include
things like printers, computers, storage facilities, data, files, Web pages, and
networks, to name just a few. There are many reasons for wanting to share
resources.
The major reason behind this goal is to save cost. It is cheaper to let a printer be
shared by several users in a small office than having to buy and maintain a
separate printer for each user. Likewise, it makes economic sense to share costly
resources such as supercomputers, high-performance storage systems,
imagesetters, and other expensive peripherals.
2. Distribution Transparency
An important goal of a distributed system is to hide the fact that its processes
and resources are physically distributed across multiple computers. A distributed
system that is able to present itself to users and applications as if it were only a
single computer system is said to be transparent.
3. Openness
Another important goal of distributed systems is openness. An open
distributed system is a system that offers services according to standard rules
that describe the syntax and semantics of those services. For example, in
computer networks, standard rules govern the format, contents, and meaning of
messages sent and received. Such rules are formalized in protocols. In
distributed systems, services are generally specified through interfaces, which
are often described in an Interface Definition Language (IDL). Such definitions
specify precisely the names of the functions that are available, together with the
types of the parameters, return values, possible exceptions that can be raised, and so on. The

hard part is specifying precisely what those services do, that is, the semantics of
interfaces. In practice, such specifications are always given in an informal way by
means of natural language.
If properly specified, an interface definition allows an arbitrary process that
needs a certain interface to talk to another process that provides that interface.
It also allows two independent parties to build completely different
implementations of those interfaces, leading to two separate distributed systems
that operate in exactly the same way.
Another important goal for an open distributed system is that it should be easy
to configure the system out of different components (possibly from different
developers).
Also, it should be easy to add new components or replace existing ones without
affecting those components that stay in place. In other words, an open
distributed system should also be extensible. For example, in an extensible
system, it should be relatively easy to add parts that run on a different operating
system or even to replace an entire file system.
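To make the interface idea concrete, here is a minimal sketch in Python (IDLs are language-neutral, so this is only an analogy, and every class and function name here is hypothetical): two independently built implementations satisfy the same interface, and a client written against the interface alone works with either.

```python
from abc import ABC, abstractmethod

# A hypothetical interface, in the spirit of an IDL definition: it fixes the
# operation names and signatures, but says nothing about the implementation.
class FileService(ABC):
    @abstractmethod
    def read(self, name: str) -> bytes: ...

# Two independent implementations of the same interface.
class InMemoryFileService(FileService):
    def __init__(self):
        self._files = {"index.html": b"<html></html>"}

    def read(self, name: str) -> bytes:
        return self._files[name]

class UppercasePathFileService(FileService):
    """Hides a different internal file-naming convention behind the interface."""
    def __init__(self):
        self._files = {"INDEX.HTML": b"<html></html>"}

    def read(self, name: str) -> bytes:
        return self._files[name.upper()]

def fetch(service: FileService, name: str) -> bytes:
    # The client depends only on the interface, not on either implementation,
    # so the two systems "operate in exactly the same way" from its viewpoint.
    return service.read(name)
```

Either implementation can be swapped in without touching `fetch`, which is the interoperability property the text describes.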
4. Scalability
Scalability is one of the most important design goals for developers of distributed
systems. Scalability of a system can be measured along at least three different
dimensions.
1. A system can be scalable with respect to its size, meaning that we can
easily add more users and resources to the system.
2. A system can be geographically scalable, meaning that the users and
resources may lie far apart.
3. A system can be administratively scalable, meaning that it remains easy
to manage even if it spans many independent administrative
organizations.
Unfortunately, a system that is scalable in one or more of these dimensions often
exhibits some loss of performance as the system scales up.

Question 3 - What is transparency with reference to a distributed system?
Discuss the various types of transparency with examples. Which one, in your
view, is the most important?
Answer:
An important goal of a distributed system is to hide the fact that its processes
and resources are physically distributed across multiple computers. A distributed
system that is able to present itself to users and applications as if it were only a
single computer system is said to be transparent. Let us first take a look at what
kinds of transparency exist in distributed systems.
Types of Transparency
The concept of transparency can be applied to several aspects of a distributed
system, the most important of which are discussed below.

1. Access transparency
Access transparency deals with hiding differences in data
representation and the way that resources can be accessed by users. At a basic
level, we wish to hide differences in machine architectures, but more important is
that we reach agreement on how data is to be represented by different machines
and operating systems. For example, a distributed system may have computer
systems that run different operating systems, each having their own file-naming
conventions. Differences in naming conventions, as well as how files can be
manipulated should all be hidden from users and applications. An important
group of transparency types has to do with the location of a resource.
2. Location transparency
Location transparency refers to the fact that users cannot tell where a
resource is physically located in the system. Naming plays an important role in
achieving location transparency. In particular, location transparency can be
achieved by assigning only logical names to resources, that is, names in which
the location of a resource is not secretly encoded. An example of such a name is
the URL http://www.prenhall.com/index.html, which gives no clue about the
location of Prentice Hall's main Web server. The URL also gives no clue as to
whether index.html has always been at its current location or was recently
moved there.
3. Migration transparency
Distributed systems in which resources can be moved without affecting
how those resources can be accessed are said to provide migration
transparency. Even stronger is the situation in which resources can be
relocated while they are being accessed without the user or application noticing
anything. In such cases, the system is said to support relocation transparency. An
example of relocation transparency is when mobile users can continue to use
their wireless laptops while moving from place to place without ever being
(temporarily) disconnected.
4. Replication transparency
Replication plays a very important role in distributed systems. For
example, resources may be replicated to increase availability or to improve
performance by placing a copy close to the place where it is accessed.

Replication transparency deals with hiding the fact that several copies of a
resource exist. To hide replication from users, it is necessary that all replicas
have the same name. Consequently, a system that supports replication
transparency should generally support location transparency as well, because it
would otherwise be impossible to refer to replicas at different locations.
5. Concurrency transparency
In many cases, sharing resources is done in a cooperative way, as in the
case of communication. For example, two independent users may each have
stored their files on the same file server or may be accessing the same tables in
a shared database. In such cases, it is important that each user does not notice
that the other is making use of the same resource. This phenomenon is called
concurrency transparency. An important issue is that concurrent access to a
shared resource must leave that resource in a consistent state. Consistency can
be achieved through locking mechanisms, by which users are, in turn, given
exclusive access to the desired resource.
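The locking idea can be sketched in Python (a hypothetical in-process stand-in for a shared server-side resource; a real system would lock on the server): each user's update holds the lock, so concurrent increments are serialized and no update is lost.

```python
import threading

class SharedCounter:
    """A shared resource on a 'server'; without the lock, concurrent
    increments could interleave and lose updates."""
    def __init__(self):
        self._value = 0
        self._lock = threading.Lock()

    def increment(self):
        with self._lock:          # users get exclusive access, in turn
            self._value += 1

    @property
    def value(self):
        return self._value

counter = SharedCounter()
# Four "users", each performing 1000 concurrent updates.
threads = [threading.Thread(target=lambda: [counter.increment() for _ in range(1000)])
           for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter.value)   # -> 4000: the resource is left in a consistent state
```

Neither user observes the other; they only ever see a consistent counter, which is exactly what concurrency transparency demands.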
6. Failure transparency
Making a distributed system failure transparent means that a user
does not notice that a resource (he has possibly never heard of) fails to work
properly, and that the system subsequently recovers from that failure. Masking
failures is one of the hardest issues in distributed systems and may even be
impossible when certain apparently reasonable assumptions are made. The main difficulty in
masking failures lies in the inability to distinguish between a dead resource and a
painfully slow resource. For example, when contacting a busy Web server, a
browser will eventually time out and report that the Web page is unavailable. At
that point, the user cannot conclude that the server is really down.
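That ambiguity can be sketched in Python (hypothetical function names; a real browser would use network timeouts rather than local threads): from the caller's point of view, a server that exceeds the timeout is indistinguishable from one that has crashed.

```python
import time
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout

def call_with_timeout(fn, timeout):
    """Report 'unavailable' if fn does not answer within `timeout` seconds.
    The caller cannot tell whether fn was dead or merely slow."""
    with ThreadPoolExecutor(max_workers=1) as pool:
        future = pool.submit(fn)
        try:
            return future.result(timeout=timeout)
        except FutureTimeout:
            return "unavailable"

def fast_server():
    return "page"

def slow_server():
    # Stands in for BOTH a busy server and a crashed one: the client
    # sees the same silence either way.
    time.sleep(0.5)
    return "page"

print(call_with_timeout(fast_server, 1.0))   # -> page
print(call_with_timeout(slow_server, 0.1))   # -> unavailable
```

The timeout masks the failure from the user's workflow, but it cannot establish whether the server is actually down; that is the fundamental limit the text describes.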

Question 4 - What is an open distributed system? Discuss its advantages.
Answer:
An open distributed system is a system that offers services according to
standard rules that describe the syntax and semantics of those services. For
example, in computer networks, standard rules govern the format, contents, and
meaning of messages sent and received. Such rules are formalized in protocols.
In distributed systems, services are generally specified through interfaces, which
are often described in an Interface Definition Language (IDL). Interface
definitions written in an IDL capture only the syntax of services. In other words,
they specify the names of the functions that are available together with types of
the parameters, return values, possible exceptions that can be raised, and so on.
The hard part is specifying precisely what those services do, that is, the
semantics of interfaces. In practice, such specifications are always given in an
informal way by means of natural language.
If properly specified, an interface definition allows an arbitrary process
that needs a certain interface to talk to another process that provides that
interface. It also allows two independent parties to build completely different
implementations of those interfaces, leading to two separate distributed systems
that operate in exactly the same way.

An important goal for an open distributed system is that it should be easy
to configure the system out of different components (possibly from different
developers).
Also, it should be easy to add new components or replace existing ones without
affecting those components that stay in place. In other words, an open
distributed system should also be extensible.
Proper specifications are complete and neutral. Complete means that
everything that is necessary to make an implementation has indeed been
specified. Neutral means that specifications do not prescribe what an
implementation should look like. Completeness and neutrality are important for
interoperability and portability. Interoperability characterizes the extent by
which two implementations of systems or components from different
manufacturers can coexist and work together by merely relying on each other's
services as specified by a common standard. Portability characterizes to what
extent an application developed for a distributed system A can be executed,
without modification, on a different distributed system B that implements the
same interfaces as A.
Advantages of open distributed systems:
1. Open systems are able to interact with services from other open systems,
irrespective of the underlying environment. Open distributed systems are
based on the provision of a uniform communication mechanism and
published interfaces for access to shared resources.
2. They are extensible. They may be extended at the hardware level by the
addition of computers to the network, and at the software level by the
introduction of new services and the reimplementation of old ones,
enabling application programs to share resources.
3. A further benefit that is often cited for open systems is their independence
from individual vendors. Open distributed systems can be constructed
from heterogeneous hardware and software, possibly from different
vendors. But the conformance of each component to the published
standard must be carefully tested and verified if the system is to work
correctly. For example, in the early days of Web mapping, you needed the
map server and the map viewer (or client) to be from the same vendor.
With open systems, all spatial clients and servers are able to request and
provide map images regardless of which company wrote the software.

Question 5 - Give different examples of distributed systems. With the help of a
neat diagram, discuss the distributed system available in your department.
Answer:
Different examples of distributed systems are:
1. The Internet
The Internet is a vast interconnected collection of computer networks of many
different types, with the range of types increasing all the time. It is a very
large distributed system. It enables users, wherever they are, to make use
of services such as the World Wide Web, email and file transfer.
The figure below illustrates a typical portion of the Internet. Programs
running on the computers connected to it interact by passing messages,
employing a common means of communication. The figure shows a collection of
intranets (subnetworks operated by companies and other organizations and
typically protected by firewalls). Internet Service Providers (ISPs) are companies
that provide broadband links and other types of connection to individual users
and small organizations, enabling them to access services anywhere in the
Internet as well as providing local services such as email and web hosting. The
intranets are linked together by backbones. A backbone is a network link with a
high transmission capacity.

2. Distributed multimedia systems
A distributed multimedia system should be able to store and locate audio
or video files, to transmit them across the network, to support the
presentation of the media types to the user, and optionally also to share
the media types across a group of users. Distributed multimedia systems
often use Internet infrastructure.
The benefits of distributed multimedia computing are considerable in that
a wide range of new (multimedia) services and applications can be
provided on the desktop, including access to live or pre-recorded television
broadcasts, access to film libraries offering video-on-demand services,
access to music libraries, the provision of audio and video conferencing
facilities and integrated telephony features including IP telephony or
related technologies such as Skype.
3. Web search
Google, the market leader in web search technology, has put significant
effort into the design of a sophisticated distributed system infrastructure
to support search. This represents one of the largest and most complex
distributed systems installations. Highlights of this infrastructure include:
- an underlying physical infrastructure consisting of very large numbers of
networked computers located at data centres all around the world
- a distributed file system designed to support very large files and heavily
optimized for the style of usage required by search and other Google
applications (especially reading from files at high and sustained rates)
- an associated structured distributed storage system that offers fast access
to very large datasets
- a lock service that offers distributed system functions such as distributed
locking and agreement
- a programming model that supports the management of very large
parallel and distributed computations across the underlying physical
infrastructure

4. Network File Systems
A network file system is any file system that allows access to files from
multiple hosts over a computer network. This makes it possible for
multiple users on multiple machines to share files and storage resources.
The Network File System (NFS) was originally developed by Sun
Microsystems for remote access support in a UNIX context.

Question 6 - Discuss reasons for developing Distributed Shared Memory. What
are the major issues in its implementation?
Answer:

Distributed shared memory (DSM) is an abstraction used for sharing data
between computers that do not share physical memory. Processes access DSM
by reads and updates to what appears to be ordinary memory within their
address space. However, an underlying runtime system ensures transparently
that processes executing at different computers observe the updates made by
one another. It is as though the processes access a single shared memory, but in
fact the physical memory is distributed.
Reasons for developing DSM
1. Common global memory view for the programmer and ease of
programming
Earlier in distributed systems, multiprocessors relied on distributed
memory, where processing nodes have access only to their local memory,
and access to remote data was accomplished by request and reply
messages. Most programmers find message passing very hard to do,
especially when they want to maintain a sequential version of the
program (during the development and debugging stages) as well as the
message-passing version. Programmers often have to approach the two
versions completely independently. They generally feel more comfortable
viewing the data in a common global memory; hence, programming on a
shared memory multiprocessor system is considered easier. In a shared
memory paradigm, all processes (or threads of computation) share the
same logical address space and directly access any part of the data
structure in a parallel computation. A single address space enhances the
programmability of a parallel machine by reducing the problems of data
partitioning, migration and load balancing.
2. Easier and more efficient way to handle complex and large databases
without replication or sending the data to processes.
3. Less expensive option
DSM is less expensive to build than a tightly coupled multiprocessor
system: it uses off-the-shelf hardware and needs no expensive interface
to shared physical memory.
4. Efficient way to run large programs
The shared memory model provides a virtual address space shared
between all nodes. There is a very large total physical memory available
to all nodes; hence, large programs can run more efficiently.
5. DSM avoids the serialized access to a common bus that shared physical
memory requires in multiprocessor systems.
Following are the various implementation issues in DSM:
1. Structure of data held in DSM

Different approaches to DSM vary in what they consider to be an object
and in how objects are addressed. There are three approaches, which view
DSM as being composed respectively of contiguous bytes, language-level
objects or immutable data items.
Byte-oriented: This type of DSM is accessed as ordinary virtual
memory, a contiguous array of bytes.
Object-oriented: The shared memory is structured as a collection of
language-level objects.
Immutable data: Here DSM is viewed as a collection of immutable
data items that processes can read, add to and remove from.

2. Synchronization model used to access DSM consistently at the application
level
In order to use DSM, a distributed synchronization service needs to be
provided, which includes familiar constructs such as locks and
semaphores. Even when DSM is structured as a set of objects, the
implementors of the objects have to be concerned with synchronization.
3. DSM consistency model, which governs the consistency of data values
accessed from different computers
A memory consistency model specifies the consistency guarantees that a
DSM system makes about the values that processes read from objects,
given that they actually access a replica of each object and that multiple
processes may update the objects.
4. Update options for communicating written values between computers
Two main implementation choices have been devised for propagating
updates made by one process to the others: write-update and
write-invalidate.
Write-update: The updates made by a process are made locally and
multicast to all other replica managers possessing a copy of the data item.
Write-invalidate: This is commonly implemented in the form of
multiple-reader/single-writer sharing.


5. Granularity of sharing in a DSM implementation
Conceptually, all processes share the entire contents of a DSM. As
programs sharing DSM execute, however, only certain parts of the data
are actually shared and then only for certain times during the execution. It
would clearly be very wasteful for the DSM implementation always to
transmit the entire contents of DSM as processes access and update it.
What should be the unit of sharing in a DSM implementation? That is,
when a process has written to DSM, which data does the DSM runtime
send in order to provide consistent values elsewhere?
6. Thrashing
Thrashing is said to occur where the DSM runtime spends an inordinate
amount of time invalidating and transferring shared data compared with
the time spent by application processes doing useful work.
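As a rough sketch of the write-invalidate choice from issue 4 (a heavily simplified, hypothetical Python model; a real DSM operates on pages of virtual memory and uses multicast between nodes): before writing, a node invalidates every other cached copy, becoming the single writer; readers that find their copy invalid must re-fetch it.

```python
class ReplicaManager:
    """One node's cached copy of a DSM data item (hypothetical, simplified)."""
    def __init__(self, name, directory):
        self.name = name
        self.copy = None          # local cached value; None means invalid
        self.directory = directory
        directory.append(self)

    def read(self, owner):
        if self.copy is None:     # miss: fetch a fresh copy from the owner
            self.copy = owner.copy
        return self.copy

    def write(self, value):
        # Write-invalidate: invalidate every other copy before writing,
        # giving this node exclusive (single-writer) access to the item.
        for replica in self.directory:
            if replica is not self:
                replica.copy = None
        self.copy = value

directory = []
a = ReplicaManager("A", directory)
b = ReplicaManager("B", directory)
a.write(1)
print(b.read(a))   # -> 1: B fetches a copy on its first read
a.write(2)         # invalidates B's cached copy
print(b.read(a))   # -> 2: B's copy was invalid, so it re-fetches
```

Note how frequent writes by A force B to re-fetch constantly; taken to an extreme, that back-and-forth is exactly the thrashing described in issue 6.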

Question 7 - What is a microkernel? Discuss the benefits of a microkernel.
Answer:
In the microkernel design, the kernel provides only the most basic
abstractions, principally address spaces, threads and local interprocess
communication; all other system services are provided by servers that are
dynamically loaded at precisely those computers in the distributed system that
require them. Clients access these system services using the kernel's
message-based invocation mechanisms.
The place of the microkernel in its most general form in the overall distributed
system design is shown in the figure below.

The microkernel appears as a layer between the hardware layer and a layer
consisting of major system components called subsystems. If performance is the
main goal, rather than portability, then middleware may use the facilities of the
microkernel directly. Otherwise, it uses a language runtime support subsystem,
or a higher-level operating system interface provided by an operating system
emulation subsystem. Each of these, in turn, is implemented by a combination of
library procedures linked into applications and a set of servers running on top of
the microkernel.
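The message-based invocation style can be sketched in Python (a toy model in which two queues stand in for the kernel's message channels; the name server and all identifiers are hypothetical): the client never calls the server directly, it only exchanges messages.

```python
import queue
import threading

# Two message channels, standing in for the microkernel's only job here:
# delivering messages between a client and a dynamically loaded server.
requests = queue.Queue()
replies = queue.Queue()

def name_server():
    """A toy system service running as a user-level server, not in the kernel."""
    table = {"printer": "node-7"}
    while True:
        op, key = requests.get()          # block until a message arrives
        if op == "stop":
            break
        replies.put(table.get(key, "unknown"))

server = threading.Thread(target=name_server)
server.start()

# Client-side invocation: everything goes through message passing.
requests.put(("lookup", "printer"))
location = replies.get()
requests.put(("stop", None))
server.join()
print(location)   # -> node-7
```

Because the server is an ordinary process behind a message interface, it can crash, be restarted, or be replaced without touching the kernel, which is the basis of the advantages listed below.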

The chief advantages of a microkernel-based operating system are:
1. Service separation has the advantage that if one service (called a server)
fails, the others can still work, so reliability is the primary feature. For
example, if a device driver crashes, it does not cause the entire system to
crash. Only that driver needs to be restarted rather than having the entire
system die. This also means more resilience, as one server can be
substituted with another. It also means maintenance is easier.
2. Different services are built into special modules which can be loaded or
unloaded when needed. Patches can be tested separately then swapped to
take over on a production instance.
3. Message passing allows independent communication and allows
extensibility.
4. In addition, a relatively small kernel is more likely to be free of bugs than
one that is larger and more complex.
