Distributed Operating Systems

By:
Akshay Dabholkar
Mayur Palankar
Amol Pandit
Based on the paper by Andrew S. Tanenbaum and Robbert Van Renesse
Outline
What is a Distributed Operating System?
How is it different?
Why Distributed Operating Systems?
Problems with Distributed Operating Systems
Distributed Operating System Models
Design Issues
Comparison of some Distributed Operating Systems
Conclusion
What is a Distributed Operating System?
A Distributed Operating System is one that runs on multiple, autonomous CPUs
yet gives its users the illusion of an ordinary Centralized Operating System
running on a Virtual Uniprocessor.
Distributed Operating Systems provide resource transparency to user processes.
“If you can tell which computer you are using, you are not using a
distributed operating system.” - Tanenbaum
How is it different?
There is one system-wide Distributed Operating System, spread across the different CPUs.
User processes can run on any of the CPUs, as allocated by the Distributed
Operating System.
Data can reside on any machine that is part of the Distributed System.
Not all multi-machine systems are Distributed Systems.
“It is the software not the hardware that determines whether a system is
distributed or not” - Tanenbaum
Distributed OS vs. Network OS
Distributed OS:
- The user is not aware of the multiple CPUs.
- Each machine runs a part of the Distributed Operating System.
- The system is fault-tolerant.
Network OS:
- The user is aware of the existence of multiple CPUs.
- Each machine has its own private Operating System.
- The system is not fault-tolerant.

Why Distributed Operating Systems?
Price/Performance advantage (availability of cheap and powerful microprocessors).
Incremental growth.
Reliability and Availability.
Simplicity of Software (Theoretically).
Provides Transparency.
Creates another level of abstraction (e.g. Process creation).

Problems with Distributed Operating Systems
Communication Protocol Overhead.
Lack of Simplicity.
A high degree of fault tolerance is required.
Lack of global state information (e.g. no global Process Tables).
Atomic Transactions.
Process and Data Migration (e.g. during Load Balancing and Paging, respectively).
Distributed Operating System Models
Minicomputer Model
- It consists of a few minicomputers, each with multiple users.
- Simple outgrowth of the Central Time-Sharing Systems.
- Each user is locally logged on to one machine and remotely logged on to other machines.
- (Available CPUs / Logged-in Users) < 1
Workstation Model
- Each user has a personal workstation, and nearly all work is done on the workstation.
- Each user is locally logged on to one machine and remotely logged on to other machines.
- It supports a single, global file system that provides location-independent data access.
- (Available CPUs / Logged-in Users) ~ 1
Processor Pool Model
- When a user needs to perform a computation, a processor is allocated from the processor
pool to the user task.
- (Available CPUs / Logged-in Users) > 1
Design Issues
Communication Primitives
Naming and Protection
Resource Management
Fault-Tolerance
Services
Message Passing
[Figure: Client-Server Model of Communication. The client sends a request message; the server receives the request message.]
Types of Message Passing Primitives
- Blocking versus Non-Blocking Primitives
- Buffered versus Unbuffered Primitives
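The distinction between these primitives can be sketched in a few lines of Python. This is only an illustration: the bounded `queue.Queue` stands in for a buffered mailbox, and the function names and the capacity of 2 are assumptions made for the example, not part of any system discussed here.

```python
import queue

# A bounded queue stands in for a buffered mailbox (capacity is an assumption).
mailbox = queue.Queue(maxsize=2)

def send_blocking(msg):
    """Blocking send: the caller waits until the mailbox has room."""
    mailbox.put(msg, block=True)

def send_nonblocking(msg):
    """Non-blocking send: return at once, reporting whether the message was queued."""
    try:
        mailbox.put_nowait(msg)
        return True
    except queue.Full:
        return False

def receive_nonblocking():
    """Non-blocking receive: return a message, or None if nothing is waiting."""
    try:
        return mailbox.get_nowait()
    except queue.Empty:
        return None
```

With an unbuffered primitive there would be no mailbox at all: a message arriving while no receiver is waiting has to be dropped or the sender held up.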

Remote Procedure Call (RPC)
- The idea is to make the semantics of inter-machine communication as similar as possible to
normal local procedure calls.
- RPC Design Issues:
o Parameter Passing: Passing reference parameters over the network is not easy. A unique
system-wide pointer for each object is needed to access it remotely.
o Parameter Representation: Machines on the network may represent data incompatibly. Conversion
to and from a standard format is expensive, and wasteful when the sender and receiver already
use the same format.
o Client-Server Binding: Sometimes the client needs to know details of the servers when making
RPC calls (e.g. in systems with multiple file servers). This is difficult to achieve.
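The parameter representation issue can be made concrete with a small sketch of a client stub. The wire format below (a 16-bit procedure id followed by two 32-bit integers in network byte order) and all function names are hypothetical, and a direct function call stands in for the network.

```python
import struct

# Hypothetical wire format: procedure id, then two signed 32-bit arguments,
# all in network (big-endian) byte order with no padding.
REQUEST = struct.Struct("!Hii")
REPLY = struct.Struct("!i")

def marshal_request(proc_id, a, b):
    """Convert the call into the standard on-the-wire representation."""
    return REQUEST.pack(proc_id, a, b)

def server_dispatch(packet):
    """Server side: unmarshal, run the named procedure, marshal the result."""
    proc_id, a, b = REQUEST.unpack(packet)
    procedures = {1: lambda x, y: x + y, 2: lambda x, y: x * y}
    return REPLY.pack(procedures[proc_id](a, b))

def remote_add(a, b):
    """Client stub: looks like a local call, but the arguments travel as bytes."""
    reply = server_dispatch(marshal_request(1, a, b))  # stands in for the network
    (result,) = REPLY.unpack(reply)
    return result
```

Both sides pay for the conversion even when their native formats already match, which is the waste the slide refers to.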
Naming and Protection
An OS supports a large number of objects such as files, directories, segments,
mailboxes, processes, services, servers, nodes and I/O devices.
Names are required to identify these objects.
Naming as Mapping
- Naming is a problem of mapping between two domains.
Name Servers
- Maintain a table or database of the name-to-object mapping.
- Services, processes, etc. need to register with the underlying naming system.
- Name Server Models:
o Centralized Name Server Model: A single server accepts names in one domain and maps them to
names in another domain.
o Distributed Name Lookup Model: Partition the system into domains, with each domain having its own
name server.
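The two name server models can be sketched as follows. The class names, the `domain/name` syntax and the example identifiers are assumptions made for illustration only.

```python
class NameServer:
    """Centralized model: one table mapping names to object identifiers."""

    def __init__(self):
        self.table = {}

    def register(self, name, obj_id):
        self.table[name] = obj_id

    def lookup(self, name):
        return self.table.get(name)


class DistributedLookup:
    """Distributed model: one NameServer per domain; names look like 'domain/name'."""

    def __init__(self):
        self.domains = {}

    def register(self, full_name, obj_id):
        domain, _, local = full_name.partition("/")
        self.domains.setdefault(domain, NameServer()).register(local, obj_id)

    def lookup(self, full_name):
        domain, _, local = full_name.partition("/")
        server = self.domains.get(domain)
        return server.lookup(local) if server else None
```

In the distributed model a lookup only touches the server responsible for the name's domain, so no single table has to hold every name in the system.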
Resource Management
Managing resources without accurate global state information is difficult.
Distributed Operating Systems do not have tables that provide up-to-date status
information for all the resources being managed.
Considerations:
- Processor Allocation
- Scheduling
- Load balancing
- Distributed Deadlock Detection

Processor Allocation
Processors are organized in a logical hierarchy independent of the physical
structure of the network (as in MICROS).
- Each manager keeps track of the free processors under its control.
- If it has enough free processors for a request, it allocates them;
otherwise it forwards the request to its immediate boss.
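The forwarding rule can be sketched as below. This is a deliberately simplified picture, not the MICROS implementation: each hypothetical `Manager` holds only a count of free processors and a reference to its boss.

```python
class Manager:
    """A node in the logical hierarchy (names and fields are illustrative)."""

    def __init__(self, name, free, boss=None):
        self.name = name
        self.free = free    # free processors this manager believes it controls
        self.boss = boss    # next manager up the hierarchy, or None at the top

    def allocate(self, needed):
        """Return the name of the manager that satisfied the request, or None."""
        if self.free >= needed:
            self.free -= needed
            return self.name
        if self.boss is not None:
            return self.boss.allocate(needed)   # pass the request upward
        return None
```

A request that a low-level manager cannot satisfy climbs the hierarchy until some manager has enough free processors or the top is reached.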
Scheduling
With multiple processors, a way is needed to ensure that processes that
communicate frequently run simultaneously, i.e. that they are scheduled together
as a group on different processors.
It is difficult to determine inter-process communication (IPC) patterns dynamically.
Ousterhout has proposed several algorithms based on the concept of Coscheduling,
which takes IPC patterns into account while scheduling to ensure that all members of
a group run at the same time.
One idea is to have each processor use a round-robin scheduling algorithm and
schedule all processes that communicate with each other on different processors in
the same time slot, to achieve N-fold parallelism.
The disadvantage of this approach is the high overhead of performing IPC over the
network between processes of a group that run on different processors.
To avoid the high cost of IPC over the network, closely related groups of processes
should be scheduled on the same processor.
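The "same slot on different processors" idea can be pictured as a matrix whose rows are time slots and whose columns are processors. The sketch below is a simplified first-fit placement written for illustration; it is not a reproduction of any of Ousterhout's algorithms, and the function name is an assumption.

```python
def coschedule(groups, num_processors):
    """Place every process group entirely within one time slot (row).

    Returns a list of slots; each slot is a list with one entry per
    processor, holding a process name or None if that processor is idle.
    """
    slots = []
    for group in groups:
        if len(group) > num_processors:
            raise ValueError("group larger than the machine")
        for slot in slots:
            free = [cpu for cpu, proc in enumerate(slot) if proc is None]
            if len(free) >= len(group):
                break                      # first slot with enough idle CPUs
        else:
            slot = [None] * num_processors  # no slot fits: open a new one
            slots.append(slot)
            free = list(range(num_processors))
        for cpu, proc in zip(free, group):
            slot[cpu] = proc
    return slots
```

Each processor then cycles through the rows round-robin, so all members of a group are running during the same time slice.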
Load balancing
Load balancing is required to prevent any one processor from becoming heavily loaded.
Techniques:
Graph-theoretic Model:
- Requires the CPU and memory requirements of each process, and the average amount of
traffic between each pair of processes, to be known in advance.
- The system can be represented as a graph, with each process as a node and each pair of
communicating processes joined by an arc.
- The problem of allocating all the processes to k processors reduces to the problem of
partitioning the graph into k disjoint subgraphs.
- Drawback: This model is only of theoretical importance, as the required information is
rarely known in advance.
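For a tiny graph the partitioning problem can be solved by brute force, which makes the model easy to see. In this sketch the traffic figures are invented, and a simple per-processor process limit (`capacity`) stands in for the CPU and memory requirements; both are assumptions for illustration.

```python
from itertools import product

def cut_cost(traffic, assignment):
    """Total traffic on arcs whose two endpoints sit on different processors."""
    return sum(w for (p, q), w in traffic.items() if assignment[p] != assignment[q])

def best_partition(processes, traffic, k, capacity):
    """Exhaustive search over all assignments; only feasible for tiny graphs."""
    best = None
    for combo in product(range(k), repeat=len(processes)):
        if any(combo.count(cpu) > capacity for cpu in range(k)):
            continue                        # violates the per-processor limit
        assignment = dict(zip(processes, combo))
        cost = cut_cost(traffic, assignment)
        if best is None or cost < best[0]:
            best = (cost, assignment)
    return best

# Invented example: arcs are (process, process) pairs weighted by traffic.
example_traffic = {("A", "B"): 5, ("C", "D"): 4, ("B", "C"): 1, ("A", "D"): 2}
```

The search keeps heavily communicating pairs together because separating them adds their traffic to the cut; the exponential search space is one more reason the model is mainly of theoretical interest.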
Heuristic Load Balancing:
- Each processor estimates its own load continuously, processors exchange load
information, and this information is used for process creation and migration.
Practical Considerations of load balancing (How to do process migration?).
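One possible heuristic for the process creation decision is sketched below. The threshold rule and the function name are assumptions chosen for illustration; real systems differ in how load is measured and exchanged.

```python
def pick_target(loads, local, threshold=2):
    """Choose where to create a new process, given the exchanged load estimates.

    loads maps processor name to its current load estimate. The process is
    sent elsewhere only if the lightest processor is at least `threshold`
    below the local load; otherwise it stays local and avoids network cost.
    """
    lightest = min(loads, key=loads.get)
    if loads[local] - loads[lightest] >= threshold:
        return lightest
    return local
```

The threshold keeps the system from shuffling work around for marginal gains, since the exchanged load figures are always slightly out of date.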

Fault-Tolerance
A fault-tolerant system is one that can continue functioning, perhaps in a
degraded form, even if something goes wrong.
One of the advantages of Distributed Operating Systems is that there are
enough resources to achieve fault tolerance.
Two radically different approaches:
- Redundancy Techniques
- Atomic Transactions
Redundancy Techniques
Redundancy through backup process:
- Provides every process with a backup process on a different processor.
- All messages sent to a process are also sent to the backup process.
- If one process crashes, the other can clone itself to make a new backup and
continue.
Redundancy through recorder process:
- A special recorder process records all messages sent on the network.
- Every process checkpoints itself onto a remote disk periodically.
- On a crash, the process is restarted on an idle processor from the most recent
checkpoint. The recorder process sends it all the messages the original process
received between the checkpoint and the crash.
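The recorder approach can be sketched with a toy process whose state is a running total. Everything here is illustrative: the `Recorder` and `Counter` classes are hypothetical, a deep copy stands in for a checkpoint on a remote disk, and the process is assumed to be deterministic so that replaying messages reproduces its state.

```python
import copy

class Recorder:
    """Logs every message sent to each process."""

    def __init__(self):
        self.log = {}

    def record(self, dest, msg):
        self.log.setdefault(dest, []).append(msg)

    def since(self, dest, count):
        """Messages sent to dest after the first `count` ones."""
        return self.log.get(dest, [])[count:]

class Counter:
    """A toy process: its state is a running total of the messages handled."""

    def __init__(self):
        self.total = 0
        self.handled = 0

    def handle(self, msg):
        self.total += msg
        self.handled += 1

def deliver(recorder, name, proc, msg):
    recorder.record(name, msg)   # the recorder sees every message
    proc.handle(msg)

def recover(recorder, name, checkpoint):
    """Restart from the checkpoint and replay the messages received since."""
    proc = copy.deepcopy(checkpoint)
    for msg in recorder.since(name, proc.handled):
        proc.handle(msg)
    return proc
```

After recovery the restarted process holds the same state the original had at the moment of the crash.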
Atomic Transactions
The property of either running to completion or doing nothing is called an atomic
update.
A technique for achieving Atomic Transactions proposed by Lampson is to build up
a hierarchy of abstractions.
- It makes use of abstraction layers such as careful disk, stable storage
and stable processors to implement multicomputer atomic transactions.
How to implement Mutual Exclusion?
- The problem arises when 2 processes on different CPUs try to access shared memory
using remote semaphores.
- The network becomes the bottleneck.
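The stable storage layer mentioned above can be illustrated with a much-simplified sketch: two copies of a record are written one after the other, each with a checksum, so that a crash between the two writes or the corruption of one copy can be repaired from the other. The class and method names, the use of CRC-32, and the in-memory lists standing in for disks are all assumptions for this example, not a description of Lampson's actual design.

```python
import zlib

class StableStorage:
    """Two copies of one record, each stored as (data, checksum)."""

    def __init__(self):
        self.copies = [None, None]

    def write(self, data, crash_after_first=False):
        record = (data, zlib.crc32(data))
        self.copies[0] = record          # always write the first copy first
        if crash_after_first:
            return                       # simulate a crash between the writes
        self.copies[1] = record

    @staticmethod
    def _valid(record):
        return record is not None and zlib.crc32(record[0]) == record[1]

    def recover(self):
        """After a crash, make both copies agree with a valid one."""
        first, second = self.copies
        if self._valid(first):
            self.copies[1] = first       # first copy is good: propagate it
        elif self._valid(second):
            self.copies[0] = second      # first copy is bad: restore it

    def read(self):
        for record in self.copies:
            if self._valid(record):
                return record[0]
        return None
```

A write is therefore atomic from the reader's point of view: after recovery the record holds either the complete old value or the complete new one.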
Services
In a Distributed Operating System, it is useful to have user-level server
processes provide functions that have traditionally been provided by the
operating system, which leads to the microkernel approach to operating
system design.
Server Structure (Single-threaded or Multi-threaded).
File Service (disk, flat file & directory services).
Print Service.
Process Service (Remote process creation and caching of servers possible).
Terminal Service.
Mail service.
Time Service.
Boot Service.
Gateway Service.
Comparison of some Distributed Operating Systems
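Cambridge
- Developed by: Computing Laboratory @ Univ. of Cambridge
- Communication Primitives: RPC
- Naming and Protection: Single Name Server
- Resources: Processor Bank
- Fault tolerance: Small server to start up services
- File Server: Universal file service and Filing Machine
Amoeba
- Developed by: Tanenbaum @ Vrije Universiteit, Amsterdam
- Communication Primitives: RPC
- Naming and Protection: Sparse capabilities with encryption
- Resources: Processor Pool
- Fault tolerance: Some fault tolerance through boot server
- File Server: Several file services
V Kernel
- Developed by: David Cheriton @ Stanford University
- Communication Primitives: RPC
- Naming and Protection: Three-level naming mechanism
- Resources: Workstation Model
- Fault tolerance: No fault tolerance
- File Server: Similar to Unix
Eden Project
- Developed by: University of Washington, Seattle
- Communication Primitives: RPC
- Naming and Protection: Capabilities without protection
- Resources: Workstation Model
- Fault tolerance: Uses Recorder process
- File Server: No file server. One process for each file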
Conclusion
Distributed systems are an interesting and fruitful area of research for the
future.
They advocate the use of the Microkernel approach to Operating System
Design.
Latest Research:
- Plan 9 @ Bell Labs
- 2K @ UIUC
- Inferno @ Vita Nuova
- The Sprite OS @ Berkeley
- Mach @ CMU
- AgentOS @ UCI
- WebOS @ Berkeley
