
DISTRIBUTED SYSTEMS AND TECHNOLOGIES

Chapter 1 - Introduction
Introduction

Contents
• What is a distributed system?
• Difference from networked systems.
• Why distribution? Goals and challenges.
• Trends in distributed systems.
• Types of distributed systems.

Asrat M. (PhD)
What is a distributed system?
A distributed system is:
• A collection of independent computers that
appears to its users as a single coherent system.
• this definition has two major aspects:
• hardware: autonomous machines - type of computers and their
complexity
• software: a single system view for users - the
collaboration/communication among the machines

• Key consequences: concurrency, absence of a global clock, and independent failures.

What is a distributed system?

A distributed system is a system designed to support the
development of applications and services which can
exploit a physical architecture consisting of multiple,
autonomous processing elements that do not share
primary memory but cooperate by sending asynchronous
messages over a communication network.
(Blair & Stefani)

What is a distributed system?

Figure 1-1. A distributed system organized as middleware. The
middleware layer extends over multiple machines, and offers
each application the same interface.
Networked vs. Distributed Systems
• Computer Networks
• A computer network is an interconnected collection of
autonomous computers able to exchange information.
• A computer network usually requires users to explicitly log in
to one machine, explicitly submit jobs remotely, and explicitly
move files/data around the network.
• Distributed Systems
• The existence of multiple autonomous computers in
a computer network is transparent to the user.
• The operating system automatically allocates jobs to
processors and moves files among various computers
without explicit user intervention.
Figure 1-2. Selected application domains and associated networked applications
• Finance and commerce: eCommerce, e.g. Amazon and eBay, PayPal, online banking and trading
• The information society: Web information and search engines, ebooks, Wikipedia; social networking: Facebook and MySpace
• Creative industries and entertainment: online gaming, music and film in the home, user-generated content, e.g. YouTube, Flickr
• Healthcare: health informatics, online patient records, monitoring patients
• Education: e-learning, virtual learning environments; distance learning
• Transport and logistics: GPS in route finding systems, map services: Google Maps, Google Earth
• Science: the Grid as an enabling technology for collaboration between scientists
• Environmental management: sensor technology to monitor earthquakes, floods or tsunamis
Why distribution? Goals.
• Resource and Data Sharing
• printers, databases, multimedia servers, ...
• Availability, Reliability
• the loss of some instances can be hidden
• Scalability, Extensibility
• the system grows with demand (e.g., extra servers)
• Performance
• huge power (CPU, memory, ...) is available
• Inherent distribution, communication
• organizational distribution, e-mail, video

Why Distribution? Challenges.
• Problems of Distribution
• Concurrency and Security
• clients must not disturb each other
• Privacy
• e.g., when building a preference profile such as using
cookies
• unwanted communication such as spam
• Partial failure
• we often do not know where the error is (e.g., RPC)
• Location, Migration, Relocation, Replication
• clients must be able to find their servers
• Heterogeneity
• hardware, platforms, languages, management
Characteristics and Goals of Distributed Systems
• differences between the computers and the way they
communicate are hidden from users
• users and applications can interact with a distributed
system in a consistent and uniform way regardless of
location
• distributed systems should be easy to expand and scale
• a distributed system is normally continuously available,
even if there may be partial failures

• to support heterogeneous computers and networks
and to provide a single-system view, a distributed
system is often organized by means of a layer of
software called middleware that extends over
multiple machines
• a distributed system should
• easily connect users with resources (printers, computers,
storage facilities, data, files, Web pages, ...)
• Reasons
• economics: sharing resources such as printers and high-speed
computers
• to collaborate and exchange information
• groupware: software for collaborative editing, teleconferencing, etc.
• e-commerce: buying and selling goods
• be transparent: hide the fact that the resources and processes
are distributed across multiple computers
• be open (see Openness in a Distributed System)
• be scalable
• Transparency in a Distributed System
• a distributed system is transparent if it is able to present itself
to users and applications as if it were only a single computer system.
Transparency in a Distributed System

But trying to achieve full distribution transparency may be
impossible or may not be a good idea.
Figure 1-3. Different forms of transparency in a distributed
system (ISO, 1995).
• Openness in a Distributed System
• a distributed system should be open
• we need well-defined interfaces
• interoperability
• components of different origin can communicate
• portability
• components work on different platforms
• should be flexible and extensible;
• easy to configure the system out of different
components;
• easy to add new components, replace existing ones;
• an Open Distributed System is a system that offers services
according to standard rules that describe the syntax and
semantics of those services; e.g., protocols in networks

• in distributed systems, such services are usually specified
through interfaces, often described using an Interface
Definition Language (IDL), which specifies only syntax: the
names of the functions, types of parameters, return values,
possible exceptions, ...
• semantics is given in an informal way by means of natural
language
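
As a rough illustration of the split between syntax and semantics, the sketch below writes a service interface in Python rather than in a real IDL. The names FileService, FileNotFound, InMemoryFileService and first_bytes are hypothetical and chosen only for this example. The method signature fixes the syntax (name, parameter types, return type, possible exception); the semantics appear only informally, in the docstring.

```python
from typing import Protocol


class FileNotFound(Exception):
    """Raised when the requested file does not exist (hypothetical)."""


class FileService(Protocol):
    """Hypothetical interface: the signature below fixes only the syntax."""

    def read(self, name: str, offset: int, length: int) -> bytes:
        """Semantics, stated informally: return at most `length` bytes of
        file `name` starting at `offset`; raise FileNotFound if absent."""
        ...


class InMemoryFileService:
    """One possible implementation; any other with the same signature
    could replace it without changing clients."""

    def __init__(self, files: dict[str, bytes]) -> None:
        self._files = files

    def read(self, name: str, offset: int, length: int) -> bytes:
        if name not in self._files:
            raise FileNotFound(name)
        return self._files[name][offset:offset + length]


def first_bytes(service: FileService, name: str) -> bytes:
    # The client depends only on the interface, not on the implementation.
    return service.read(name, 0, 4)
```

Because first_bytes is written against the interface alone, a component of different origin that offers the same read signature can be substituted, which is the interoperability and portability argument made above.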
• Scalability in Distributed Systems
• a distributed system should be scalable; there are three
dimensions
• size: adding more users and resources to the system
• geography: users and resources may be far apart
• administration: should be easy to manage even if it
spans many administrative organizations
• but a system that scales along these dimensions may exhibit
performance problems
• question: which scalability problems lead to low performance?
• Scaling Techniques: how to solve scaling problems
• the problem is mainly performance, and arises from
limitations in the capacity of servers and networks (for
geographical scalability, from high latency and mostly
unreliable links)
• three possible solutions:
• hiding communication latencies,
• distribution, and
• replication

• Hide Communication Latencies
• try to avoid waiting for responses to remote service requests;
let the requester do other useful work in the meantime
• i.e., construct requesting applications that use only
asynchronous communication instead of synchronous
communication; when a reply arrives the application is
interrupted
• good for batch processing and parallel applications, since
independent tasks can be scheduled while another task is
waiting for communication to complete; non-parallel
programs can use multithreading instead
• hiding communication latencies is not in general applicable
to interactive applications
• for interactive applications, try to reduce communication
by moving part of the job to the client; e.g., when filling
in a form to access a database, check the entries at the client
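
A minimal sketch of latency hiding with asynchronous communication, using Python's asyncio. The function remote_request is a hypothetical stand-in for a remote service call, and the sleep only simulates network latency; nothing here talks to a real network.

```python
import asyncio
import time


async def remote_request(name: str, latency: float) -> str:
    # Stand-in for a remote call; the sleep simulates network latency.
    await asyncio.sleep(latency)
    return f"reply:{name}"


async def fetch_all(names: list[str], latency: float) -> list[str]:
    # Issue every request before waiting for any reply, so the waits overlap.
    tasks = [asyncio.create_task(remote_request(n, latency)) for n in names]
    return await asyncio.gather(*tasks)


def run_demo() -> tuple[list[str], float]:
    # Returns the replies and the elapsed wall-clock time in seconds.
    start = time.perf_counter()
    replies = asyncio.run(fetch_all(["a", "b", "c"], 0.2))
    return replies, time.perf_counter() - start
```

Issued one after another, the three requests would wait for roughly three times the latency; issued together, the total wait should be close to a single latency, because the requester is free to do other work while each reply is outstanding.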
• Distribution
• means splitting a component into smaller parts and
spreading those parts across the system
• e.g., DNS, the Domain Name System (xxxx@cs.smu.edu.et)
• divide the name space into non-overlapping zones
• for details, see the later chapter on Naming
• Replication
• replicate components across a distributed system to increase
availability and for load balancing, leading to better
performance
• replication is decided by the owner of a resource; caching (a
special form of replication) also reduces communication
latency, but is decided by the user of the resource
• but, caching and replication may lead to consistency
problems as discussed in detail in Consistency and
Replication.
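
The consistency problem mentioned above can be sketched with a client-side cache that keeps entries for a fixed time-to-live. The classes Origin and CachingClient and the ttl parameter are assumptions made for this illustration, not part of any particular system; the current time is passed in explicitly to keep the example deterministic.

```python
class Origin:
    """The owner's copy of the data (hypothetical name)."""

    def __init__(self) -> None:
        self.data: dict[str, str] = {}


class CachingClient:
    """Client-side cache with a time-to-live; entries may go stale."""

    def __init__(self, origin: Origin, ttl: float) -> None:
        self.origin = origin
        self.ttl = ttl
        self._cache: dict[str, tuple[str, float]] = {}

    def get(self, key: str, now: float) -> str:
        hit = self._cache.get(key)
        if hit is not None and now - hit[1] < self.ttl:
            return hit[0]  # served locally, no communication needed
        value = self.origin.data[key]  # simulated remote fetch
        self._cache[key] = (value, now)
        return value
```

If the origin is updated while a cached entry is still within its time-to-live, the client keeps returning the old value until the entry expires. The cache saves communication at the cost of temporarily inconsistent copies.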
Pitfalls when Developing
Distributed Systems
False assumptions made by first time developers:
• The network is reliable.
• The network is secure.
• The network is homogeneous.
• The topology does not change.
• Latency is zero.
• Bandwidth is infinite.
• Transport cost is zero.
• There is one administrator.
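
To make the first pitfall concrete, the sketch below treats message loss as a normal case rather than assuming a reliable network. The simulated link unreliable_send, the NetworkError exception and the loss_rate parameter are all hypothetical; a real client would use timeouts on an actual transport instead.

```python
import random


class NetworkError(Exception):
    """Hypothetical error for a lost or timed-out message."""


def unreliable_send(payload: str, rng: random.Random, loss_rate: float) -> str:
    # Simulated link: drops a message with probability loss_rate.
    if rng.random() < loss_rate:
        raise NetworkError("message lost")
    return f"ack:{payload}"


def send_with_retry(payload: str, rng: random.Random,
                    loss_rate: float, max_attempts: int) -> tuple[str, int]:
    # Retry a bounded number of times instead of assuming delivery.
    for attempt in range(1, max_attempts + 1):
        try:
            return unreliable_send(payload, rng, loss_rate), attempt
        except NetworkError:
            if attempt == max_attempts:
                raise  # give up and report the failure to the caller
    raise ValueError("max_attempts must be at least 1")
```

Blind retries are only safe when the remote operation is idempotent, since the sender cannot tell whether the request or the reply was lost.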
Trends in Distributed Systems
• Distributed systems are undergoing a period of significant
change and this can be traced back to a number of
influential trends:
• the emergence of pervasive networking technology;
• the emergence of ubiquitous computing coupled with the
desire to support user mobility in distributed systems;
• the increasing demand for multimedia services;
• the view of distributed systems as a utility.

Figure 1-4. A typical portion of the Internet (labels: intranet, ISP, backbone, satellite link; legend: desktop computer, server, network link)

Pervasive networking and the modern Internet
Figure 1-5. Portable and handheld devices in a distributed system

Mobile and ubiquitous computing


Figure 1-6. Cloud computing

Distributed computing as a utility
Examples of Distributed Systems
• Financial Trading Systems
• Cluster Computing Systems
• Grid Computing Systems
• Transaction Processing Systems
• Enterprise systems and applications
• Electronic Health Care Systems
• Sensor Networks
• etc.

Figure 1-7. An example financial trading system

Cluster Computing Systems

Figure 1-8. An example of a cluster computing system.


Grid Computing Systems

Figure 1-9. A layered architecture for grid computing systems.


Transaction Processing Systems (1)

Figure 1-10. Example primitives for transactions.


Transaction Processing Systems (2)

Characteristic properties of transactions:


• Atomic: To the outside world, the
transaction happens indivisibly.
• Consistent: The transaction does not violate
system invariants.
• Isolated: Concurrent transactions do not
interfere with each other.
• Durable: Once a transaction commits, the
changes are permanent.
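
The atomic and consistent properties can be tried out with Python's built-in sqlite3 module. This is only a single-machine sketch, not a distributed transaction; the accounts table, the account names and the non-negative balance invariant are assumptions made for the example.

```python
import sqlite3


def make_bank() -> sqlite3.Connection:
    # In-memory database with an invariant: balances never go negative.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE accounts ("
                 "name TEXT PRIMARY KEY, "
                 "balance INTEGER NOT NULL CHECK (balance >= 0))")
    conn.executemany("INSERT INTO accounts VALUES (?, ?)",
                     [("alice", 100), ("bob", 0)])
    conn.commit()
    return conn


def transfer(conn: sqlite3.Connection, src: str, dst: str, amount: int) -> bool:
    """Move `amount` from src to dst; either both updates happen or neither."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst))
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src))
    except sqlite3.IntegrityError:
        return False  # invariant violated; the credit above was rolled back
    return True


def balances(conn: sqlite3.Connection) -> dict[str, int]:
    return dict(conn.execute("SELECT name, balance FROM accounts"))
```

In transfer the credit is deliberately applied before the debit: when the debit would violate the invariant, the already-applied credit is undone as well, so no partial transfer is ever visible.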

Transaction Processing Systems (3)

Figure 1-11. A nested transaction.


Transaction Processing Systems (4)

Figure 1-12. The role of a TP monitor in distributed systems.


Enterprise Application Integration

Figure 1-13. Middleware as a communication facilitator in
enterprise application integration.
Electronic Health Care Systems (1)
Questions to be addressed for health care systems:
• Where and how should monitored data be
stored?
• How can we prevent loss of crucial data?
• What infrastructure is needed to generate and
propagate alerts?
• How can physicians provide online feedback?
• How can extreme robustness of the monitoring
system be realized?
• What are the security issues and how can the
proper policies be enforced?
Electronic Health Care Systems (2)

Figure 1-14. Monitoring a person in a pervasive electronic health
care system, using (a) a local hub or
(b) a continuous wireless connection.
Sensor Networks (1)

Questions concerning sensor networks:


• How do we (dynamically) set up an
efficient tree in a sensor network?
• How does aggregation of results take
place? Can it be controlled?
• What happens when network links fail?

Sensor Networks (2)

Figure 1-15. Organizing a sensor network database, while storing
and processing data (a) only at the operator’s site or …
Sensor Networks (3)

Figure 1-16. Organizing a sensor network database, while storing
and processing data … or (b) only at the sensors.
Exercises
• A user arrives at a railway station that they have never visited before, carrying a
PDA that is capable of wireless networking. Suggest how the user could be
provided with information about the local services and amenities at that station,
without entering the station’s name or attributes. What technical challenges must
be overcome?
• Compare and contrast cloud computing with more traditional client-server
computing.
• What is novel about cloud computing as a concept?
• A service is implemented by several servers. Explain why resources might be
transferred between them. Would it be satisfactory for clients to multicast all
requests to the group of servers as a way of achieving mobility transparency for
clients?
• Why is it sometimes so hard to hide the occurrence and recovery from failures in a
distributed system?
• Why is it not always a good idea to aim at implementing the highest degree of
transparency possible?
• What is an open distributed system and what benefits does openness provide?
• Explain what is meant by a virtual organization and give a hint on how such
organizations could be implemented.
