
Parallel and Distributed Computing

Page 1




COMPUTER LABORATORY MANUAL







Parallel and Distributed Computing
(CS 332)
Spring Semester












DEPARTMENT OF COMPUTER SOFTWARE ENGINEERING
Military College of Signals
National University of Sciences and Technology
www.mcs.nust.edu.pk


PREFACE
This lab manual has been prepared to help software engineering students study and practise the core
techniques of parallel and distributed computing. The lab sessions cover socket programming with TCP
and UDP, concurrency with threads and semaphores in Java, Java RMI, XML document validation, web
services, ontology development using Protégé, and distributed databases with Map-Reduce. The lab
sessions are designed to improve the abilities of the students by giving hands-on experience. Tools
and languages are given with each lab session.

PREPARED BY
This lab manual was prepared by Dr. Hammad Afzal using material from lab manuals prepared by Dr.
Faisal Bashir and Lab Engr Fazalullah. The first two labs are reproduced from the lab manuals of
Object Oriented Programming (prepared by Dr. Hammad Afzal and Lab Engr Umer Mehmood). The
whole manual was created under the general supervision of the Head of Department, Dr. Naveed Iqbal
Rao, in 2014.

GENERAL INSTRUCTIONS
a. Students are required to keep the lab manual with them until the end of the semester.
b. All readings, answers to questions and illustrations must be completed in the space provided. If
more space is required, additional sheets may be attached.
c. It is the responsibility of the student to have the manual graded before the deadlines given by
the instructor.
d. Loss of the manual will result in resubmission of the complete manual.
e. Students are required to go through the experiment before coming to the lab session. Lab session
details will be given in the training schedule.
f. Students must bring the manual to each lab.
g. Keep the manual neat, clean and presentable.
h. Plagiarism is strictly forbidden. No credit will be given if a lab session is plagiarised, and no
resubmission will be entertained.
i. Marks will be deducted for late submission.
j. Error handling in a program is the responsibility of the student.

VERSION HISTORY
Date         Updated By          Details
Jan 2014     Dr. Hammad Afzal    First version created
















MARKS

Exp #   Date Conducted   Experiment Title   Max. Marks   Marks Obtained   Instructor Sign
1
2
3
4
5
6
7
8
9
10
11
12
13
14

Grand Total



List of Experiments

Lab 1. Socket Programming using TCP
Lab 2. Socket Programming using UDP
Lab 3. Practicing Client Server Applications
Lab 4. Concurrency and Threads in Java
Lab 5. Advanced Threads in Java
Lab 6. Concurrency with Semaphores
Lab 7. Java RMI
Lab 8. Design and Develop a Remote Method Invocation API
Lab 9. XML Document Validation
Lab 10. Web Services
Lab 11. Development of Ontologies using Protégé
Lab 12. Development of Ontologies using Protégé - II
Lab 13. Distributed Databases and Map-Reduce - I
Lab 14. Distributed Databases and Map-Reduce - II




LAB 1: Socket Programming - TCP
Objective
To demonstrate how connection-oriented (TCP) sockets are created and used. The ServerSocket class,
its methods, and the implementation of a multithreaded server are also explained.

Tools
Programming Language: Java
Operating System: Ubuntu

Theory.
Designing the solution



Get socket information for the remote web servers to which your client connects.

import java.net.*;
import java.io.*;

public class getSocketInfo {

    public static void main(String[] args) {

        // Try to open a TCP connection to port 80 of every host named on the command line
        for (int i = 0; i < args.length; i++) {
            // try-with-resources closes the socket once its details have been printed
            try (Socket theSocket = new Socket(args[i], 80)) {
                System.out.println("Connected to " + theSocket.getInetAddress()
                        + " on port " + theSocket.getPort() + " from port "
                        + theSocket.getLocalPort() + " of "
                        + theSocket.getLocalAddress());
            } // end try
            catch (UnknownHostException e) {
                System.err.println("I can't find " + args[i]);
            }
            catch (SocketException e) {
                System.err.println("Could not connect to " + args[i]);
            }
            catch (IOException e) {
                System.err.println(e);
            }

        } // end for

    } // end main

} // end getSocketInfo

Task

1. Run the program with no arguments to the main method. Which exception, if any, is thrown? [2]



2. Modify the above code so that the user is prompted in a GUI-based dialog box to enter the IP
address. Write the additions here. [2]




Sample TCP Client


import java.net.*;
import java.io.*;

public class TCPClient {
    public static void main(String args[]) {
        // arguments supply message and hostname of destination
        Socket s = null;
        try {
            int serverPort = 7896;
            s = new Socket(args[1], serverPort);
            DataInputStream in = new DataInputStream(s.getInputStream());
            DataOutputStream out = new DataOutputStream(s.getOutputStream());
            out.writeUTF(args[0]); // UTF is a string encoding
            String data = in.readUTF();
            System.out.println("Received: " + data);
        } catch (UnknownHostException e) {
            System.out.println("Sock:" + e.getMessage());
        } catch (EOFException e) {
            System.out.println("EOF:" + e.getMessage());
        } catch (IOException e) {
            System.out.println("IO:" + e.getMessage());
        } finally {
            if (s != null) {
                try {
                    s.close();
                } catch (IOException e) {
                    System.out.println("close:" + e.getMessage());
                }
            }
        }
    }
}

Task
3. Create a sample TCPClient as shown above.
4. How many arguments do you need to give as input to main in order to have it run
successfully? What are those arguments? [2]





5. Modify the above client so that it reads a text file and sends it to the server. Write down only
those lines of code that you need to change. [4]


Sample TCP Server

import java.net.*;
import java.io.*;

public class TCPServer {
    public static void main(String args[]) {
        try {
            int serverPort = 7896;
            ServerSocket listenSocket = new ServerSocket(serverPort);
            while (true) {
                // accept() blocks until a client connects
                Socket clientSocket = listenSocket.accept();
                // each client is served by its own Connection thread
                Connection c = new Connection(clientSocket);
            }
        } catch (IOException e) {
            System.out.println("Listen :" + e.getMessage());
        }
    }
}

class Connection extends Thread {
    DataInputStream in;
    DataOutputStream out;
    Socket clientSocket;

    public Connection(Socket aClientSocket) {
        try {
            clientSocket = aClientSocket;
            in = new DataInputStream(clientSocket.getInputStream());
            out = new DataOutputStream(clientSocket.getOutputStream());
            this.start();
        } catch (IOException e) {
            System.out.println("Connection:" + e.getMessage());
        }
    }

    public void run() {
        try { // an echo server
            String data = in.readUTF();
            out.writeUTF(data);
        } catch (EOFException e) {
            System.out.println("EOF:" + e.getMessage());
        } catch (IOException e) {
            System.out.println("IO:" + e.getMessage());
        } finally {
            try {
                clientSocket.close();
            } catch (IOException e) { /* close failed */ }
        }
    }
}



Task
6. What is the purpose of handling each new connection in a new thread? What will happen if a new
thread is not used for each Connection? [3]







7. Create an EchoServer that echoes the client's message back to the client, i.e., the client sends
a message to the server and the server simply replies with the same message. You are required to
create a multithreaded server using the ServerSocket class. [Implementation] [7]

Web Resources

1. Transmission Control Protocol: http://en.wikipedia.org/wiki/Transmission_Control_Protocol
2. Sockets: http://docs.oracle.com/javase/tutorial/networking/sockets/
3. Echo Server: http://bansky.net/echotool/




LAB 2: Socket Programming - UDP
Objective
Implement a concurrent echo client-server application using UDP sockets.

Tools
Programming Language: Java
Operating System: Ubuntu

Theory.
Designing the solution




UDP Client

import java.io.*;
import java.net.*;

class UDPClient
{
    public static void main(String args[]) throws Exception
    {
        BufferedReader inFromUser = new BufferedReader(new InputStreamReader(System.in));
        DatagramSocket clientSocket = new DatagramSocket();
        InetAddress IPAddress = InetAddress.getByName("localhost");

        byte[] receiveData = new byte[1024];

        // Read one line from the keyboard and send it to the server on port 9876
        String sentence = inFromUser.readLine();
        byte[] sendData = sentence.getBytes();

        DatagramPacket sendPacket = new DatagramPacket(sendData, sendData.length,
                IPAddress, 9876);
        clientSocket.send(sendPacket);

        DatagramPacket receivePacket = new DatagramPacket(receiveData,
                receiveData.length);
        clientSocket.receive(receivePacket);
        // Decode only the bytes actually received, not the whole 1024-byte buffer
        String modifiedSentence = new String(receivePacket.getData(), 0,
                receivePacket.getLength());
        System.out.println("FROM SERVER:" + modifiedSentence);
        clientSocket.close();
    }
}
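
The receive buffer in the client above is 1024 bytes long, while a datagram usually carries fewer bytes. The following is a minimal sketch, not part of the lab code, of why the received length matters when the payload is turned into a String. The class DatagramLengthSketch and its helpers simulateReceive and decode are hypothetical names, and the receive step is simulated with setLength so that the sketch runs without a network.

```java
import java.net.DatagramPacket;
import java.nio.charset.StandardCharsets;

public class DatagramLengthSketch {

    // Hypothetical helper: mimics what receive() leaves behind, namely a large
    // buffer whose first bytes hold the payload and a length set to the payload size.
    public static DatagramPacket simulateReceive(String message) {
        byte[] buffer = new byte[1024];
        byte[] payload = message.getBytes(StandardCharsets.UTF_8);
        System.arraycopy(payload, 0, buffer, 0, payload.length);
        DatagramPacket packet = new DatagramPacket(buffer, buffer.length);
        packet.setLength(payload.length); // a real receive() sets this for us
        return packet;
    }

    // Decode only the bytes that belong to the datagram.
    public static String decode(DatagramPacket packet) {
        return new String(packet.getData(), packet.getOffset(),
                packet.getLength(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        DatagramPacket packet = simulateReceive("hello");
        String wholeBuffer = new String(packet.getData(), StandardCharsets.UTF_8);
        System.out.println("Whole buffer length: " + wholeBuffer.length());
        System.out.println("Decoded payload: " + decode(packet));
    }
}
```

Decoding the whole buffer yields the payload followed by padding characters, which is easy to miss on a terminal but breaks string comparisons and length calculations.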


Task
1. What is the difference between DatagramSocket and Socket? [2]









2. Why does a DatagramSocket not need to be tied to a destination IP address at the time of
creation? [2]








3. Modify the above program so that it gets the system time before sending the packet to the server
and adds this timestamp to the message. Write your added functionality here. [4]








UDP Server

import java.io.*;
import java.net.*;

class UDPServer
{
    public static void main(String args[]) throws Exception
    {
        DatagramSocket serverSocket = new DatagramSocket(9876);
        byte[] receiveData = new byte[1024];
        while (true)
        {
            DatagramPacket receivePacket = new DatagramPacket(receiveData,
                    receiveData.length);
            serverSocket.receive(receivePacket);
            // Decode only the bytes actually received, not the whole 1024-byte buffer
            String sentence = new String(receivePacket.getData(), 0,
                    receivePacket.getLength());
            System.out.println("RECEIVED: " + sentence);
            // The reply goes back to the address and port the datagram came from
            InetAddress IPAddress = receivePacket.getAddress();
            int port = receivePacket.getPort();
            String capitalizedSentence = sentence.toUpperCase();
            byte[] sendData = capitalizedSentence.getBytes();
            DatagramPacket sendPacket =
                    new DatagramPacket(sendData, sendData.length, IPAddress, port);
            serverSocket.send(sendPacket);
        }
    }
}



Task
4. Write down the above code and execute it.
5. Modify the code so that it reads the timestamp from the message sent by the client, gets the
current system time, and finds the time taken by the packet to be transferred. Write down the
lines of code you need to modify. [4]









6. Calculate the bandwidth (data sent per unit time) using the calculations in Task 5. [2]






7. Create a server that provides a time service on any port (e.g., 5099). Clients using the Telnet
facility should be able to get the current time from the server. [Implementation] [6]


Web Resources

1. UDP: http://en.wikipedia.org/wiki/User_Datagram_Protocol
2. Networks and Sockets: http://docs.oracle.com/javase/tutorial/networking/sockets/
3. Time Server: http://www.worldtimeserver.com/



LAB 3: Practicing Client Server Applications
Objective
To practice and implement the concepts learnt in the previous labs on TCP/UDP socket programming.
Tools
Programming Language: Java
Operating System: Ubuntu
Theory.

Task

1. Write a program that scans the ports in a user-given range on localhost to check for
open ports.
[5]
Task
2. Create a server that provides the factorial of any number to its client. The client connects to
the server and then asks the user for a number. The server calculates the factorial and returns
the result to the client.
[5]






Task: Chat Application

3. Develop GUI-based chat software with the following features. [3]

4. The user should be able to select an avatar for himself/herself. [1]

5. The server should maintain a list of users that are logged in. [3]

6. The user should be able to send messages to other logged-in users. [3]

Web Resources

1. Transmission Control Protocol: http://en.wikipedia.org/wiki/Transmission_Control_Protocol
2. Networking and Sockets: http://docs.oracle.com/javase/tutorial/networking/sockets/
3. UDP: http://en.wikipedia.org/wiki/User_Datagram_Protocol
4. Port Scanner: http://en.wikipedia.org/wiki/Port_scanner



LAB 4: Concurrency and Threads in Java
Objective
Implement concurrent echo client-server application using TCP Sockets.
Tools
Java Network Programming.
Operating System: Ubuntu
Theory.

Concurrency
A streaming audio application must simultaneously read the digital audio off the network, decompress
it, manage playback, and update its display. Software that can do such things is known as concurrent
software.
The Java platform is designed from the ground up to support concurrent programming, with basic
concurrency support in the Java programming language and the Java class libraries. This lab
introduces that basic concurrency support; higher-level APIs are found in the java.util.concurrent
packages.
In concurrent programming, there are two basic units of execution: processes and threads.
In the Java programming language, concurrent programming is mostly concerned with threads.
However, processes are also important.

Time Slicing
Processing time for a single core is shared among processes and threads.
This sharing of time is performed through an OS feature called time slicing.
Concurrency is therefore possible even on simple systems without multiple processors or execution
cores.

IPC
To facilitate communication between processes, most operating systems support Inter Process
Communication (IPC) resources, such as pipes and sockets.
IPC is used not just for communication between processes on the same system, but also between
processes on different systems.

Threads
Threads are sometimes called lightweight processes.
Both processes and threads provide an execution environment, but creating a new thread
requires fewer resources than creating a new process.
Threads exist within a process; every process has at least one.
Threads share the process's resources, including memory and open files.
Multithreaded execution is an essential feature of the Java platform.
Every application has at least one thread, or several if you count "system" threads that do
things like memory management and signal handling.
But from the application programmer's point of view, you start with just one thread, called the
main thread.
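
The two points above (execution starts in the main thread, and threads share the memory of their process) can be illustrated with a minimal sketch. The class SharedMemorySketch and its helper valueWrittenByWorker are hypothetical names chosen for this illustration; the value 42 is arbitrary.

```java
public class SharedMemorySketch {

    // Hypothetical helper: a worker thread writes into an array that was
    // created by the calling thread. Both threads see the same heap memory.
    public static int valueWrittenByWorker() {
        final int[] shared = new int[1];
        Thread worker = new Thread(() -> shared[0] = 42, "worker");
        worker.start();
        try {
            worker.join(); // wait until the worker thread has finished
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return shared[0]; // the worker's write is visible after join()
    }

    public static void main(String[] args) {
        // Reports the name of the thread that runs main
        System.out.println("Running in thread: " + Thread.currentThread().getName());
        System.out.println("Value written by worker: " + valueWrittenByWorker());
    }
}
```

Because no data is copied between the two threads, sharing is cheap, but it is also the reason why access to shared data has to be coordinated once more than one thread writes to it.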

Thread object
Each thread is associated with an instance of the class Thread.
There are two basic strategies for using Thread objects to create a concurrent application:
To directly control thread creation and management, simply instantiate Thread each
time the application needs to initiate an asynchronous task.
To abstract thread management from the rest of your application, pass the application's
tasks to an executor.
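
The second strategy is not used in the examples of this lab, so here is a minimal sketch of it, assuming the standard java.util.concurrent executor classes. The class ExecutorSketch, the helper sumOfSquares, and the pool size of two threads are choices made for this illustration only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ExecutorSketch {

    // Hypothetical helper: squares the numbers 1..n as separate tasks on a
    // pool of two threads and adds up the results.
    public static int sumOfSquares(int n) {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            List<Future<Integer>> results = new ArrayList<>();
            for (int i = 1; i <= n; i++) {
                final int x = i;
                Callable<Integer> task = () -> x * x;
                results.add(pool.submit(task)); // the executor decides which thread runs it
            }
            int sum = 0;
            for (Future<Integer> f : results) {
                sum += f.get(); // blocks until that task has finished
            }
            return sum;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown(); // let the pool threads exit once the work is done
        }
    }

    public static void main(String[] args) {
        System.out.println("Sum of squares 1..4 = " + sumOfSquares(4));
    }
}
```

The application code only describes tasks; it never creates or starts a Thread itself, which is the point of the executor strategy.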


Defining and Starting a Thread
An application that creates an instance of Thread must provide the code that will run in that
thread. There are two ways to do this: provide a Runnable object, or subclass Thread.


Provide a Runnable object
The Runnable interface defines a single method, run, meant to contain the code
executed in the thread.
The Runnable object is passed to the Thread constructor, as in the HelloRunnable
example:

public class HelloRunnable implements Runnable
{
    public void run()
    {
        // Runnable has no getName(); ask the thread that is executing run() for its name
        System.out.println("Hello from a thread! " + Thread.currentThread().getName());
    }
    public static void main(String args[])
    {
        (new Thread(new HelloRunnable())).start();
    }
}


Task
1. Write down the above code snippet. What is the output of the program? [2]










2. What happens when you rename the method run to myrun? Will the program still run?
Write down your observation. [2]



Defining and Starting a Thread
Subclass Thread
The Thread class itself implements Runnable, though its run method does nothing.
An application can subclass Thread, providing its own implementation of run, as in the
HelloThread example:

public class HelloThread extends Thread
{
    public void run()
    {
        System.out.println("Hello from a thread!");
    }
    public static void main(String args[]) {
        (new HelloThread()).start();
    }
}


Counter.java :
A subclass of Thread that counts up to a limit with random pauses in between each count.

public class Counter extends Thread {
    private static int totalNum = 0;
    private int currentNum, loopLimit;

    public Counter(int loopLimit) {
        this.loopLimit = loopLimit;
        currentNum = totalNum++;
    }

    private void pause(double seconds) {
        try { Thread.sleep(Math.round(1000.0 * seconds)); }
        catch (InterruptedException ie) {}
    }

    /* When run finishes, the thread exits. */
    public void run() {
        for (int i = 0; i < loopLimit; i++) {
            System.out.println("Counter " + currentNum + ": " + i);
            // pause(Math.random()); // Sleep for up to 1 second
        }
    }
}

CounterTest.java : Instantiates Counter class and starts threads.


public class CounterTest {
    public static void main(String[] args) {
        Counter c1 = new Counter(5);
        Counter c2 = new Counter(5);
        Counter c3 = new Counter(5);
        c1.start();
        c2.start();
        c3.start();
    }
}


Task
3. Write and run the above code. What is the output you observe? [3]






Example: Another Example of Threads

ExampleThread.java :

public class ExampleThread extends Thread
{
    private String name;
    private String text;
    private final int REPEATS = 5;
    private final int DELAY = 200;

    public ExampleThread( String aName, String aText )
    {
        name = aName;
        text = aText;
    }
    public void run()
    {
        try
        {
            String threadName = Thread.currentThread().getName();
            long threadID = Thread.currentThread().getId();
            int threadPri = Thread.currentThread().getPriority();
            // toString() gives name, priority and thread group
            String ThreadString = Thread.currentThread().toString();
            // Thread.currentThread().setPriority(MAX_PRIORITY);

            for ( int i = 0; i < REPEATS; ++i )
            {
                System.out.println( name + " says \"" + text + "\" Thread Name:"
                        + threadName + " ID: " + threadID + " Priority: " + threadPri
                        + " " + ThreadString );
                Thread.sleep( DELAY );
            }
        }
        catch( InterruptedException exception )
        {
            System.out.println( "An error occurred in " + name );
        }
        finally
        {
            // Clean up, if necessary
            System.out.println( name + " is quitting..." );
        }
    }
}

ThreadTest.java :

public class ThreadTest
{
    public static void main( String[] args )
    {
        ExampleThread et1 = new ExampleThread( "Thread #1", "Hello World!" );
        ExampleThread et2 = new ExampleThread( "Thread #2", "Hey Earth!" );

        // Thread t1 = new Thread( et1 );
        // Thread t2 = new Thread( et2 );
        // t1.start();
        // t2.start();
        // et1.setPriority(10);
        et1.start();
        et2.start();

        // et1.interrupt();
        // t1.interrupt();
    }
}


Task
4. Write the output of ExampleThread Example. Explain the output in 3 sentences. [3]



5. Uncomment the code
// Thread.currentThread ().setPriority(MAX_PRIORITY);
Observe and comment on the output. [3]





6. Write a Counter class, similar to the one given in the example, but it should implement the
interface Runnable rather than extend the Thread class. [Implementation] [3]

7. Uncomment the commented-out lines in the code of ThreadTest. What differences do you observe?
Comment.
[4]







Web Resources

1. Java Class Thread: http://docs.oracle.com/javase/1.5.0/docs/api/java/lang/Thread.html
2. Concurrency: http://download.oracle.com/javase/tutorial/essential/concurrency/
3. http://download.oracle.com/javase/1.4.2/docs/api/java/lang/Thread.html

LAB 5: Advanced Threads in Java
Objective
To practice the advanced concepts of Threads in Java.
Tools
Java Network Programming.
Operating System: Ubuntu
Theory.

Synchronization
The Java programming language provides two basic synchronization idioms:
synchronized methods
synchronized statements
To make a method synchronized, simply add the synchronized keyword to its declaration:

public class SynchronizedCounter {
    private int c = 0;
    public synchronized void increment() { c++; }
    public synchronized void decrement() { c--; }
    public synchronized int value() { return c; }
}

Example 1 shows how synchronized blocks can be used on objects to coordinate access to them by
multiple threads.

class Thread4 extends Thread
{
    static String[] msg =
    {
        "Java", "is", "fast,", "dynamic,", "and", "comprehensive."
    };
    public Thread4(String id)
    {
        super(id);
    }
    public static void main(String[] args)
    {
        Thread4 t1 = new Thread4("t1: ");
        Thread4 t2 = new Thread4("t2: ");
        t1.start();
        t2.start();
        boolean t1IsAlive = true;
        boolean t2IsAlive = true;
        // Poll both threads until each one has finished
        do
        {
            if (t1IsAlive && !t1.isAlive())
            {
                t1IsAlive = false;
                System.out.println("t1 is dead.");
            }
            if (t2IsAlive && !t2.isAlive())
            {
                t2IsAlive = false;
                System.out.println("t2 is dead.");
            }
        }
        while (t1IsAlive || t2IsAlive);
    }
    void randomWait()
    {
        try
        {
            // sleep is a static method, so it is called on the Thread class
            Thread.sleep((long) (3000 * Math.random()));
        }
        catch (InterruptedException e)
        {
            System.out.println("Interrupted!");
        }
    }
    public void run()
    {
        // Only one thread at a time may hold the lock on System.out
        synchronized (System.out)
        {
            for (int i = 0; i < msg.length; i++)
            {
                randomWait();
                System.out.println(getName() + msg[i]);
            }
        }
    }
}


Example 2 shows how synchronized methods and object locks are used to coordinate access to a
common object by multiple threads.


class Thread3 extends Thread {
    static String[] msg = { "Java", "is", "fast,", "dynamic,",
                            "and", "comprehensive." };

    public static void main(String[] args) {
        Thread3 t1 = new Thread3("t1: ");
        Thread3 t2 = new Thread3("t2: ");

        t1.start();
        t2.start();

        boolean t1IsAlive = true;
        boolean t2IsAlive = true;

        do {
            if (t1IsAlive && !t1.isAlive()) {
                t1IsAlive = false;
                System.out.println("t1 is dead.");
            }

            if (t2IsAlive && !t2.isAlive()) {
                t2IsAlive = false;
                System.out.println("t2 is dead.");
            }
        } while (t1IsAlive || t2IsAlive);
    }

    public Thread3(String id) {
        super(id);
    }

    void randomWait() {
        try {
            // sleep is a static method, so it is called on the Thread class
            Thread.sleep((long) (3000 * Math.random()));
        } catch (InterruptedException e) {
            System.out.println("Interrupted!");
        }
    }

    public void run() {
        SynchronizedOutput.displayList(getName(), msg); // thread name t1 or t2
    }
}

class SynchronizedOutput {

    // if the 'synchronized' keyword is removed, the message
    // is displayed in interleaved fashion
    public static synchronized void displayList(String name,
                                                String list[]) {
        for (int i = 0; i < list.length; i++) {
            Thread3 t = (Thread3) Thread.currentThread();
            t.randomWait();
            System.out.println(name + list[i]);
        }
    }
}
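
The synchronized-method idiom from the start of this lab can also be checked with numbers instead of printed text. The following is a minimal sketch under stated assumptions: the class SynchronizedCounterSketch and the helper countWithThreads are hypothetical names, and the thread and increment counts are arbitrary. Several threads increment one shared counter, and because increment is synchronized no update is lost.

```java
public class SynchronizedCounterSketch {
    private int c = 0;

    // Only one thread at a time can execute a synchronized method of this object
    public synchronized void increment() { c++; }
    public synchronized int value() { return c; }

    // Hypothetical helper: starts the given number of threads, each of which
    // increments the same counter the given number of times.
    public static int countWithThreads(int threads, int incrementsPerThread) {
        final SynchronizedCounterSketch counter = new SynchronizedCounterSketch();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < incrementsPerThread; i++) {
                    counter.increment();
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join(); // wait for every worker before reading the total
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return counter.value();
    }

    public static void main(String[] args) {
        System.out.println("Total: " + countWithThreads(4, 10000));
    }
}
```

If the synchronized keyword is removed from increment, the statement c++ is no longer atomic with respect to other threads, and the total may come out smaller than threads times incrementsPerThread.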

Tasks

1. What is the effect of removing the synchronized keyword in Example 1? [2]





2. Implement the Parallelized version of Producer Consumer Problem. The buffer size should be
fixed (more than 1). [Implementation] [7]

3. Implement the Parallelized version of Producer Consumer Problem. The buffer size should be
exactly 1. [Implementation] [5]


4. Implement a Bank Account System which provides functionality to deposit and withdraw.
Both functions should be idempotent and synchronized. [Implementation] [6]






Web Resources

1. Java Class Thread: http://docs.oracle.com/javase/1.5.0/docs/api/java/lang/Thread.html
2. Concurrency: http://download.oracle.com/javase/tutorial/essential/concurrency/
3. http://download.oracle.com/javase/1.4.2/docs/api/java/lang/Thread.html
4. http://www.javabeginner.com/learn-java/java-threads-tutorial
5. http://en.wikipedia.org/wiki/Synchronization
6. http://download.oracle.com/javase/tutorial/essential/concurrency/sync.html


LAB 6: Concurrency with Semaphores
Objective
We shall learn Java Semaphores and apply them in various practical problems.
Tools
Java Network Programming.
Operating System: Ubuntu

THEORY
A critical section is a segment of code that only one thread at a time is allowed to execute. For
example, a critical section might manipulate a particular data structure or use some resource that
supports at most one client at a time. By placing a lock around this section, you exclude other
threads from making changes that might affect the correctness of your code. Locks (such as a mutex)
are used to protect the critical section.
Semaphores:
Semaphores are used for mutual exclusion and thread synchronization. Instead of busy waiting and
wasting CPU cycles, a thread can block on a semaphore (the operating system removes the thread
from the CPU scheduling or "ready" queue) if it must wait to enter its critical section or if the
resource it wants is not available.
Semaphores which allow an arbitrary resource count are called counting semaphores, while
semaphores which are restricted to the values 0 and 1 (or locked/unlocked, unavailable/ available) are
called binary semaphores.
Wait and Signals:
One important property of these semaphore variables is that their value cannot be changed except by
using the wait() and signal() functions.
Semaphores are operated by two operations, historically denoted as V (also known as signal()) and P
(or wait()). Operation V increments the semaphore S and operation P decrements it.
A simple way to understand wait() and signal() operations is:
wait(): Decrements the value of semaphore variable by 1. If the value becomes negative, the process
executing wait() is blocked, i.e., added to the semaphore's queue.

signal(): Increments the value of semaphore variable by 1. After the increment, if the pre-increment
value was negative (meaning there are processes waiting for a resource), it transfers a blocked process
from the semaphore's waiting queue to the ready queue.
Semaphores are available in Java as java.util.concurrent.Semaphore.

Class hierarchy:
java.lang.Object
    java.util.concurrent.Semaphore

All Implemented Interfaces:
Serializable

Declaration:
public class Semaphore extends Object implements Serializable

Constructor

public Semaphore(int permits)
Creates a Semaphore with the given number of permits and nonfair
fairness setting.

Method Details
The methods acquire() and release() are used here instead of P (Wait) and V(Signal).
Each acquire() blocks if necessary until a permit is available, and then takes it. Each release() adds a
permit, potentially releasing a blocking acquirer.

public void acquire() throws InterruptedException
Acquires a permit from this semaphore, blocking until one is available, or the thread is
interrupted.
If a permit is available, acquire() takes it and returns immediately, reducing the number of
available permits by one.
If no permit is available, the current thread becomes disabled for thread scheduling
purposes and lies dormant until one of two things happens: another thread invokes release()
on this semaphore and the current thread is next to be assigned a permit, or another thread
interrupts the current thread.

public void release()
Releases a permit, returning it to the semaphore.
Releases a permit, increasing the number of available permits by one. If any threads are trying
to acquire a permit, then one is selected and given the permit that was just released. That thread
is (re)enabled for thread scheduling purposes.
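
The permit counting described above can be observed without starting any threads. The following is a minimal sketch, assuming a semaphore with three permits; the class CountingSemaphoreSketch and the helper demo are hypothetical names. It uses tryAcquire(), the non-blocking counterpart of acquire() in java.util.concurrent.Semaphore, so that the fourth request fails instead of blocking the program.

```java
import java.util.concurrent.Semaphore;

public class CountingSemaphoreSketch {

    // Hypothetical helper: returns { number of granted requests,
    // permits available after one release }.
    public static int[] demo() {
        Semaphore pool = new Semaphore(3); // counting semaphore with 3 permits
        int granted = 0;
        for (int i = 0; i < 4; i++) {
            // tryAcquire() takes a permit if one is free and returns false otherwise
            if (pool.tryAcquire()) {
                granted++;
            }
        }
        pool.release(); // hand one permit back
        return new int[] { granted, pool.availablePermits() };
    }

    public static void main(String[] args) {
        int[] result = demo();
        System.out.println("Requests granted: " + result[0]);
        System.out.println("Permits available after one release: " + result[1]);
    }
}
```

With acquire() in place of tryAcquire(), the fourth request would block until some other thread called release().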


Counting Semaphore Example in Java (Binary Semaphore)
A semaphore with one permit is known as a binary semaphore because it has only two states: permit
available or permit unavailable. A binary semaphore can be used to implement mutual exclusion, that
is, a critical section where only one thread is allowed to execute at a time. A thread will wait on
acquire() until the thread inside the critical section releases the permit by calling release() on
the semaphore.
Here is a simple example in which a binary semaphore provides mutually exclusive access to a
critical section of code in Java:
import java.util.concurrent.Semaphore;

public class SemaphoreTest {
    Semaphore binary = new Semaphore(1);

    public static void main(String args[]) {
        final SemaphoreTest test = new SemaphoreTest();
        new Thread() {
            @Override
            public void run() {
                test.mutualExclusion();
            }
        }.start();

        new Thread() {
            @Override
            public void run() {
                test.mutualExclusion();
            }
        }.start();
    }

    private void mutualExclusion() {
        try {
            binary.acquire();

            // mutual exclusive region
            System.out.println(Thread.currentThread().getName()
                    + " inside mutual exclusive region");
            Thread.sleep(1000);
        } catch (InterruptedException ie) {
            ie.printStackTrace();
        } finally {
            binary.release();
            System.out.println(Thread.currentThread().getName()
                    + " outside of mutual exclusive region");
        }
    }
}

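Typical output of the above program is given below. Each thread prints its "outside" line after
releasing the permit, so the exact order may vary between runs: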
Thread-0 inside mutual exclusive region
Thread-0 outside of mutual exclusive region
Thread-1 inside mutual exclusive region
Thread-1 outside of mutual exclusive region

Tasks
1. Copy the program given in PROGRAMS (3) into your IDE and run it. Observe the output and
write it here. [2]



2. In the program from Exercise 1, remove the calls
binary.acquire(); and
binary.release();
from the method
private void mutualExclusion()
Observe the output. Is the output the same as in Exercise 1? If not, why? [3]








3. Show the trace of a simulation in Exercise 2, highlighting the following cases: - [5]
a. Normal execution.
b. Producer is blocked.
c. Consumer is blocked.





Major Task
4. Create your own class MySemaphore that provides the functionality of a binary
semaphore. [5]

5. MySemaphore should implement all methods as given in the original Semaphore class. [5]

Web Resources
1. http://www.javabeginner.com/learn-java/java-threads-tutorial
2. http://en.wikipedia.org/wiki/Synchronization
3. http://download.oracle.com/javase/1.5.0/docs/api/java/util/concurrent/Semaphore.html


LAB 7: Java RMI
OBJECTIVE
How Java RMI works
How RMI programs can be categorized
How RMI classes are compiled and executed

Theory

Remote Procedure Call
Remote Procedure Call (RPC) was introduced by Birrell and Nelson (1984) to allow programs to call
procedures located on other machines.
It effectively removes the need for the distributed systems programmer to worry about all the
details of network programming (i.e. no more sockets).
It abstracts the communication interface to the level of a procedure call.
Instead of working directly with sockets, the programmer has the illusion of calling a local
procedure, when in fact the arguments of the call are packaged up and shipped off to the
remote target of the call.
RPC systems encode arguments and return values using an external data representation, such
as XDR.
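
The packaging step can be sketched in a few lines. This is an illustration under stated assumptions and not how Java RMI or any particular RPC system is implemented: the class MarshallingSketch and its helpers marshalCall and dispatch are hypothetical, the byte layout written by DataOutputStream stands in for an external data representation such as XDR, and the request is handed over in memory instead of being sent through a socket.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

public class MarshallingSketch {

    // Caller side: package the procedure name and its arguments into bytes
    public static byte[] marshalCall(String procedure, long a, long b) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            out.writeUTF(procedure);
            out.writeLong(a);
            out.writeLong(b);
            out.flush();
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Target side: unpack the request and call the matching local procedure
    public static long dispatch(byte[] request) {
        try {
            DataInputStream in = new DataInputStream(new ByteArrayInputStream(request));
            String procedure = in.readUTF();
            long a = in.readLong();
            long b = in.readLong();
            switch (procedure) {
                case "add": return a + b;
                case "sub": return a - b;
                default: throw new IllegalArgumentException("Unknown procedure: " + procedure);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] request = marshalCall("add", 2, 3);
        System.out.println("Request size in bytes: " + request.length);
        System.out.println("Result: " + dispatch(request));
    }
}
```

In a real RPC system the request bytes travel over the network and the result is marshalled back in the same way, but the caller still only sees an ordinary procedure call.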

Remote Method Invocation
RMI is an extension of local method invocation that allows an object
living in one process to invoke the methods of an object living in another process.

Remote Objects
Objects that can receive remote method invocations are called remote objects and they
implement a remote interface.

Client Server Model
Client side: Sends a request to the server to execute a particular method of an object. A typical
client application gets a remote reference to one or more remote objects in the server and then
invokes methods on them.
Server side: Objects define an interface that declares the methods available to clients, so the
interface determines whether a method has been called properly. A typical server
application creates a number of remote objects, makes references to those remote objects
accessible, and waits for clients to invoke methods on those remote objects.

Service and Remote Interface
Service Interface: In the client-server model, each server provides a certain set of procedures to
its clients.
Remote Interface: Specifies the methods of an object that are accessible to the outside world.
Objects can be passed as arguments and returned as results.
RMI provides the mechanism by which the server and the client communicate and pass
information back and forth.
Such an application is sometimes referred to as a distributed object application.



Distributed object applications need to:
Locate remote objects
Communicate with remote objects
Load class byte-codes for objects that are passed as parameters or return values
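The first two requirements, locating and communicating with a remote object, can be tried in a single JVM. The following is a minimal sketch, not the lab solution: the Greeter interface, the GreeterService name and the demo() method are hypothetical names chosen for illustration, the registry is created in-process on an anonymous port instead of starting rmiregistry separately, and the java.rmi.server.hostname property is set to 127.0.0.1 on the assumption that everything runs on one machine.

```java
import java.rmi.Remote;
import java.rmi.RemoteException;
import java.rmi.registry.LocateRegistry;
import java.rmi.registry.Registry;
import java.rmi.server.UnicastRemoteObject;

public class RmiLocateAndCallSketch {

    // Hypothetical remote interface, used only for this illustration.
    public interface Greeter extends Remote {
        String greet(String name) throws RemoteException;
    }

    // Server-side implementation of the remote interface.
    public static class GreeterImpl implements Greeter {
        @Override
        public String greet(String name) {
            return "Hello, " + name;
        }
    }

    // Exports a remote object, registers it by name, looks it up and calls it.
    public static String demo() {
        // Assumption: client and server share one machine, so stubs point at loopback.
        System.setProperty("java.rmi.server.hostname", "127.0.0.1");
        GreeterImpl servant = new GreeterImpl();
        try {
            // Port 0 asks for any free port; the lab itself uses the default port 1099.
            Registry registry = LocateRegistry.createRegistry(0);
            Greeter stub = (Greeter) UnicastRemoteObject.exportObject(servant, 0);
            try {
                registry.rebind("GreeterService", stub);                       // make it locatable
                Greeter remote = (Greeter) registry.lookup("GreeterService");  // locate
                return remote.greet("RMI");                                    // communicate
            } finally {
                // Unexport so that the RMI threads do not keep the JVM alive.
                UnicastRemoteObject.unexportObject(servant, true);
                UnicastRemoteObject.unexportObject(registry, true);
            }
        } catch (Exception e) {
            throw new IllegalStateException("RMI demo failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

The third requirement, loading class byte-codes for parameters and return values, is not exercised here because only String values cross the call.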



Task:
We will create a Calculator Service that provides some basic arithmetic functions to its clients. We
therefore need to define an interface, its implementation, the server and the clients.

First Step:
Create Calculator Interface

public interface Calculator extends java.rmi.Remote {
    public long add(long a, long b)
        throws java.rmi.RemoteException;

    public long sub(long a, long b)
        throws java.rmi.RemoteException;

    public long mul(long a, long b)
        throws java.rmi.RemoteException;

    public long div(long a, long b)
        throws java.rmi.RemoteException;
}

Tasks:
1. Write the above code in IDE and compile it. [1]
2. What is the purpose of adding extends java.rmi.Remote? [2]



3. Will the code still compile if we remove throws java.rmi.RemoteException? Write your
observation. [2]









Calculator Implementation

public class CalculatorImpl extends java.rmi.server.UnicastRemoteObject
        implements Calculator {
    // Implementations must have an explicit constructor
    // in order to declare the RemoteException exception
    public CalculatorImpl() throws java.rmi.RemoteException {
        super();
    }
    public long add(long a, long b)
            throws java.rmi.RemoteException {
        return a + b;
    }
    public long sub(long a, long b)
            throws java.rmi.RemoteException {
        return a - b;
    }
    public long mul(long a, long b)
            throws java.rmi.RemoteException {
        return a * b;
    }
}

Tasks:

4. Write the above code in IDE and compile it. [1]


5. What is the purpose of adding extends java.rmi.server.UnicastRemoteObject. [3]




6. The above code does not compile successfully. What are the errors? What corrections
do you need to make? Make the corrections and compile again. [4]














Calculator Client

import java.rmi.Naming;
import java.rmi.RemoteException;
import java.net.MalformedURLException;
import java.rmi.NotBoundException;

public class CalculatorClient {

    public static void main(String[] args) {
        try {
            Calculator c = (Calculator)
                Naming.lookup("rmi://localhost/CalculatorService");
            System.out.println( c.sub(4, 3) );
            System.out.println( c.add(4, 5) );
            System.out.println( c.mul(3, 6) );
            System.out.println( c.div(9, 3) );
        }
        catch (MalformedURLException malFurl) {
            System.out.println();
            System.out.println("Mal Formed URL");
            System.out.println(malFurl);
        }
        catch (RemoteException re) {
            System.out.println();
            System.out.println("RemoteException");
            System.out.println(re);
        }
        catch (NotBoundException nbe) {
            System.out.println();
            System.out.println("NotBoundException");
            System.out.println(nbe);
        }
        catch (java.lang.ArithmeticException ae) {
            System.out.println();
            System.out.println("java.lang.ArithmeticException");
            System.out.println(ae);
        }
    }
}

Tasks:
7. Write the above code in IDE and compile. [1]
8. Modify the code so that client and server processes run on different machines
[Implementation] [2]




Calculator Server

import java.rmi.Naming;

public class CalculatorServer {
    public static void main(String args[]) {
        System.out.println("Calculator Server Running ...");
        try {
            Calculator c = new CalculatorImpl();
            Naming.rebind("rmi://localhost:1099/CalculatorService", c);
        } catch (Exception e) {
            System.out.println("Trouble: " + e);
        }
    }
}

Tasks



9. Along the same lines as the calculator service example, create a Power service. It will
provide two methods to its remote users: a power method and a square method. The service
should be registered in the RMI registry under the name POWER SERVICE. [5]
Web Resources
1. http://download.oracle.com/javase/tutorial/rmi/index.html
2. http://www.oracle.com/technetwork/java/javase/tech/index-jsp-136424.html
3. http://www.eg.bucknell.edu/~cs379/DistributedSystems/rmi_tut.html




LAB 8: Major Assignment: Design and Develop a Remote Method Invocation API

Objective
Design and Develop a complete Remote Method Invocation API.
You should be able to create the different modules of the API by taking help from Java RMI.
(Note: You can submit the assignment during the week, but at least 60% should be completed in
the lab today.)
Theory and Tasks
The RMI software is a layer of software between the application-level objects and the
communication and remote reference modules. It consists of:
Proxy
Dispatcher
Skeleton
The complete functionality is depicted in the figure below.


Proxy
The role of a proxy is to make RMI transparent to clients by behaving like a local object to the
invoker. Instead of executing an invocation, it forwards it in a message to the remote object.
It is the implementation of the remote interface on the client side.
When a client binds to a distributed object, a (virtual) implementation of the object's interface,
called a proxy, is loaded into the client's address space.
There is one proxy for every remote object for which a process holds the remote object reference
(ROR).
The proxy implements the methods of the remote interface, but quite differently from the remote
object. Each method of the proxy marshals a reference to the target object, its own method id and
its arguments into a request message and sends it to the target. It then waits for the reply
message, un-marshals it and returns the results to the invoker.
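The marshal, send, wait and un-marshal cycle of a proxy can be sketched with Java's dynamic proxies. This is only an illustration under stated assumptions, not the message format required by the tasks: the Request class, the Adder interface and the transport function are hypothetical, Java object serialization stands in for the marshalling format, and an in-process function stands in for the network.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.function.Function;

public class ProxySketch {

    // Hypothetical request layout: target, method id and arguments.
    public static class Request implements Serializable {
        private static final long serialVersionUID = 1L;
        public final String targetName;   // stands in for the remote object reference
        public final String methodName;   // stands in for the method id
        public final Object[] arguments;

        public Request(String targetName, String methodName, Object[] arguments) {
            this.targetName = targetName;
            this.methodName = methodName;
            this.arguments = arguments;
        }
    }

    // Hypothetical remote interface used by the demo.
    public interface Adder {
        long add(long a, long b);
    }

    public static byte[] marshal(Object value) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
        }
        return bytes.toByteArray();
    }

    public static Object unmarshal(byte[] data) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        }
    }

    // Every method called on the returned proxy becomes a request message.
    public static <T> T createProxy(Class<T> remoteInterface, String targetName,
                                    Function<byte[], byte[]> transport) {
        InvocationHandler handler = (proxy, method, args) -> {
            Request request = new Request(targetName, method.getName(), args);
            byte[] reply = transport.apply(marshal(request)); // send and wait for the reply
            return unmarshal(reply);                          // hand the result to the invoker
        };
        Object instance = Proxy.newProxyInstance(
                remoteInterface.getClassLoader(), new Class<?>[] {remoteInterface}, handler);
        return remoteInterface.cast(instance);
    }

    // Loopback demo: a fake "server" that adds the two arguments of any request.
    public static long demo() {
        Function<byte[], byte[]> fakeServer = requestBytes -> {
            try {
                Request request = (Request) unmarshal(requestBytes);
                long first = (Long) request.arguments[0];
                long second = (Long) request.arguments[1];
                return marshal(first + second);
            } catch (IOException | ClassNotFoundException e) {
                throw new IllegalStateException(e);
            }
        };
        Adder adder = createProxy(Adder.class, "CalculatorService", fakeServer);
        return adder.add(4, 5);
    }
}
```

In a real design the transport function would write the bytes to a socket and block on the reply, and the reply would also need a way to carry exceptions back to the invoker.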


Tasks
1. Create a class that represents the message data structure. It should have the following
components. [3]



2. In order to make an object remote, you have to create a Remote Object Reference (ROR) for the
object. Develop functionality that creates the ROR for each object that is to be remotely
accessed. The format of the ROR used in Java RMI is given below. [3]


Binder
Client programs require a means of obtaining the remote object reference for at least one of the
remote objects hosted by the server. A binder in a distributed system is a separate service that
maintains a table containing mappings from textual names to remote object references. An instance
of it runs on every server that holds remote objects. It is used by servers to register their
remote objects by name and by clients to look up the remote object references.
Names have the form //computer name:port/object name
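The table of name-to-reference mappings kept by a binder can be sketched as a thread-safe map. This is a simplified illustration, not the required design: the RemoteRef class with its host, port and objectId fields is a hypothetical stand-in rather than the ROR format used by Java RMI, and the binder here lives in one process instead of being a network service.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class BinderSketch {

    // Hypothetical, simplified remote object reference.
    public static final class RemoteRef {
        public final String host;
        public final int port;
        public final long objectId;

        public RemoteRef(String host, int port, long objectId) {
            this.host = host;
            this.port = port;
            this.objectId = objectId;
        }

        @Override
        public String toString() {
            return host + ":" + port + "/" + objectId;
        }
    }

    // Textual name -> remote object reference.
    private final Map<String, RemoteRef> table = new ConcurrentHashMap<>();

    // Used by servers: register (or replace) a reference under a name.
    public void rebind(String name, RemoteRef reference) {
        table.put(name, reference);
    }

    // Used by clients: obtain the reference registered under a name.
    public RemoteRef lookup(String name) {
        RemoteRef reference = table.get(name);
        if (reference == null) {
            throw new IllegalArgumentException("Not bound: " + name);
        }
        return reference;
    }
}
```

A server would call rebind("CalculatorService", ...) once at start-up, and each client would call lookup with the same name before its first invocation.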
Tasks
3. Create a Naming Server (Binder) that can bind the services (server-side objects) with a Service
Name. [Implementation] [2]
4. The Naming Server should provide the ability for lookup. [Implementation] [2]
5. The server side should be able to create a Remote Object. [Implementation] [2]
Skeleton
The class of a remote object has a skeleton, which implements the methods in the remote interface.
Incoming invocation messages are first passed to the skeleton, which un-marshals them into proper
method invocations at the object's interface at the server, i.e. it un-marshals the arguments in
the request message and invokes the corresponding method in the servant.
It waits for the invocation to complete, then marshals the result, together with any exceptions,
in a reply message and sends it back to the client's proxy.
The actual object resides on the server machine, where it offers the same interface as it does on
the client machine.
The server has one dispatcher and one skeleton for every remote object.
Dispatcher
The dispatcher receives a request message from the communication module.
It uses the method id to select the appropriate method in the skeleton, passing on the request
message.
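Selecting a method by its id and invoking it on the servant can be sketched with reflection. The names below are hypothetical and the design is deliberately simplified: the method name together with the argument count plays the role of the method id, the ArithmeticServant class is only a stand-in for a real servant, and an exception thrown by the servant is returned as the result where a real skeleton would marshal it into the reply message.

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;

public class DispatcherSketch {

    // Hypothetical servant with two operations.
    public static class ArithmeticServant {
        public long add(long a, long b) {
            return a + b;
        }

        public long div(long a, long b) {
            return a / b;   // throws ArithmeticException when b is 0
        }
    }

    // Finds the public method matching the id and arity, then invokes it on the servant.
    public static Object dispatch(Object servant, String methodId, Object[] args) {
        for (Method method : servant.getClass().getMethods()) {
            if (method.getName().equals(methodId) && method.getParameterCount() == args.length) {
                try {
                    return method.invoke(servant, args);
                } catch (InvocationTargetException e) {
                    // The servant itself failed: hand the exception back as the outcome.
                    return e.getCause();
                } catch (IllegalAccessException e) {
                    throw new IllegalStateException(e);
                }
            }
        }
        throw new IllegalArgumentException("No such method: " + methodId);
    }
}
```

Matching on name and arity is enough for this servant; overloaded methods with the same number of parameters would need the parameter types to be part of the method id.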

Tasks [Implementation]
6. Design and develop your own Skeleton and Dispatcher. [3]
7. Design and develop the communication modules. [3]
8. You should be able to simulate the functionality of Calculator Service (in previous lab) using
your own designed API. [4]

Web Resources
1. http://download.oracle.com/javase/tutorial/rmi/index.html
2. http://www.oracle.com/technetwork/java/javase/tech/index-jsp-136424.html
3. http://www.eg.bucknell.edu/~cs379/DistributedSystems/rmi_tut.html

LAB 9: XML document validation
Objective
To learn how to design XML Schema and XML instance document

Tools
GUI-IDE Tool NetBeans 6.0

Theory
XML validation is the process of checking a document written in XML (eXtensible Markup
Language) to confirm that it is both well-formed and also "valid" in that it follows a defined structure.
A well-formed document follows the basic syntactic rules of XML, which are the same for all XML
documents. A valid document also respects the rules dictated by a particular DTD or XML schema,
which encode the application-specific choices for that kind of document.

An XML schema defines the structure of the elements and attributes in an XML document. For an
XML document to be valid based on an XML schema, the XML document has to be validated against
the XML schema.

In this lab, JAXP parsers are used to validate an XML document against an XML schema. In JAXP,
the DocumentBuilder classes are used to validate an XML document. XML schema validation is
illustrated with an XML document that describes a catalog.

Preliminary Setup
To validate an XML document with the Xerces2-j parser or with the JAXP parser, the classes of the
respective parser (for JAXP, its DocumentBuilder classes) need to be in the classpath.

Overview
In this lab, an example XML document named catalog.xml is used.
<?xml version="1.0" encoding="UTF-8"?>
<!--A OnJava Journal Catalog-->

<catalog
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:noNamespaceSchemaLocation="file://c:/Schemas/catalog.xsd"
    title="OnJava.com" publisher="O'Reilly">
    <journal date="April 2004">
        <article>
            <title>Declarative Programming in Java</title>
            <author>Narayanan Jayaratchagan</author>
        </article>
    </journal>
    <journal date="January 2004">
        <article>
            <title>Data Binding with XMLBeans</title>
            <author>Daniel Steinberg</author>
        </article>
    </journal>
</catalog>

Task
1. Create an xml file as shown above. [1]
2. What does noNamespaceSchemaLocation mean? [2]






The example XML document is validated with an example XML schema file, catalog.xsd. The
elements in this schema document are in the XML schema namespace
of http://www.w3.org/2001/XMLSchema.

<?xml version="1.0" encoding="utf-8"?>
<xs:schema
    xmlns:xs="http://www.w3.org/2001/XMLSchema">
    <xs:element name="catalog">
        <xs:complexType>
            <xs:sequence>
                <xs:element ref="journal" minOccurs="0" maxOccurs="unbounded"/>
            </xs:sequence>
            <xs:attribute name="title" type="xs:string"/>
            <xs:attribute name="publisher" type="xs:string"/>
        </xs:complexType>
    </xs:element>
    <xs:element name="journal">
        <xs:complexType>
            <xs:sequence>
                <xs:element ref="article" minOccurs="0" maxOccurs="unbounded"/>
            </xs:sequence>
            <xs:attribute name="date" type="xs:string"/>
        </xs:complexType>
    </xs:element>
    <xs:element name="article">
        <xs:complexType>
            <xs:sequence>
                <xs:element name="title" type="xs:string"/>
                <xs:element ref="author" minOccurs="0"
                    maxOccurs="unbounded"/>
            </xs:sequence>
        </xs:complexType>
    </xs:element>
    <xs:element name="author" type="xs:string"/>
</xs:schema>

In the following sections, we'll discuss validation of the example XML document, catalog.xml, with
the example schema document, catalog.xsd.
Validation of an XML Document with the JAXP Parser
To begin, import the DocumentBuilderFactory and DocumentBuilder classes.
The DocumentBuilder class is used to obtain an org.w3c.dom.Document from an XML
document, while the DocumentBuilderFactory class is used to obtain a DocumentBuilder parser.

import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.DocumentBuilder;

To validate with a DocumentBuilder parser, set
the System property javax.xml.parsers.DocumentBuilderFactory:

System.setProperty("javax.xml.parsers.DocumentBuilderFactory",
"org.apache.xerces.jaxp.DocumentBuilderFactoryImpl");

Next, you need to create a DocumentBuilderFactory.


DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();

DocumentBuilderFactory.newInstance() locates an implementation by trying several lookup rules in
order, starting with the javax.xml.parsers.DocumentBuilderFactory system property set above, and
takes the first one that succeeds.
To parse an XML document with a namespace, set the setNamespaceAware() feature to true. By
default, the setNamespaceAware() feature is set to false.

factory.setNamespaceAware(true);

Set the setValidating() feature of the DocumentBuilderFactory to true to make the parser a validating
parser. By default, the setValidating() feature is set to false.
factory.setValidating(true);

Set the schemaLanguage and schemaSource attributes of the DocumentBuilderFactory.
The schemaLanguage attribute specifies the schema language for validation.
The schemaSource attribute specifies the XML schema document to be used for validation; here
SchemaUrl is assumed to be a String variable holding the location of catalog.xsd.

factory.setAttribute (
"http://java.sun.com/xml/jaxp/properties/schemaLanguage",
"http://www.w3.org/2001/XMLSchema");
factory.setAttribute (
"http://java.sun.com/xml/jaxp/properties/schemaSource",
SchemaUrl);
Create a DocumentBuilder parser.

DocumentBuilder builder = factory.newDocumentBuilder ();
This returns a new DocumentBuilder, with the parameters configured in the DocumentBuilderFactory.
Create and register an ErrorHandler with the parser.

Validator handler=new Validator();
builder.setErrorHandler(handler);


Validator is a user-written class that extends the DefaultHandler class. The DefaultHandler class
implements the ErrorHandler interface. Parse the XML document with the DocumentBuilder parser. The
different parse methods are parse(InputStream is), parse(File f), parse(InputSource is),
parse(InputStream is, String systemId), and parse(String uri). Here XmlDocumentUrl is assumed to
be a String holding the location of catalog.xml.

builder.parse (XmlDocumentUrl);

Validator, an ErrorHandler of the type DefaultHandler, records the errors generated by the validation.
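The steps above can be combined into one small program. The following is a sketch under stated assumptions rather than the reference solution: it relies on the parser bundled with the JDK instead of setting the Xerces system property, it feeds the schema and the document from in-memory strings instead of SchemaUrl and XmlDocumentUrl, it uses a cut-down hypothetical schema (XSD), and the Validator class shown is one possible way to write the error handler.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

public class SchemaValidationSketch {

    // Collects validation errors instead of printing them.
    public static class Validator extends DefaultHandler {
        public final List<String> problems = new ArrayList<>();

        @Override
        public void error(SAXParseException e) {
            problems.add("line " + e.getLineNumber() + ": " + e.getMessage());
        }
    }

    // Cut-down, hypothetical schema: a catalog with a title and any number of journals.
    public static final String XSD =
          "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
        + "<xs:element name='catalog'><xs:complexType><xs:sequence>"
        + "<xs:element name='journal' type='xs:string' minOccurs='0' maxOccurs='unbounded'/>"
        + "</xs:sequence><xs:attribute name='title' type='xs:string'/>"
        + "</xs:complexType></xs:element></xs:schema>";

    // Returns the validation errors found in xml; an empty list means the document is valid.
    public static List<String> validate(String xml, String xsd) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            factory.setValidating(true);
            factory.setAttribute(
                "http://java.sun.com/xml/jaxp/properties/schemaLanguage",
                "http://www.w3.org/2001/XMLSchema");
            factory.setAttribute(
                "http://java.sun.com/xml/jaxp/properties/schemaSource",
                new ByteArrayInputStream(xsd.getBytes(StandardCharsets.UTF_8)));

            DocumentBuilder builder = factory.newDocumentBuilder();
            Validator handler = new Validator();
            builder.setErrorHandler(handler);
            builder.parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            return handler.problems;
        } catch (Exception e) {
            // Malformed XML or an unusable parser configuration ends up here.
            throw new IllegalStateException("Validation could not be carried out", e);
        }
    }

    public static void main(String[] args) {
        String valid = "<catalog title='OnJava.com'><journal>April 2004</journal></catalog>";
        String invalid = "<catalog><magazine/></catalog>";
        System.out.println("valid document problems: " + validate(valid, XSD).size());
        System.out.println("invalid document problems: " + validate(invalid, XSD).size());
    }
}
```

For the lab tasks, the two in-memory strings would be replaced by the student-list schema and instance document, for example by passing File objects to setAttribute and parse.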

Tasks
3. Design a schema for a student list. A student has information such as name, semester,
roll no, email-ids, phone-nos, etc. [3]


4. Write an XML instance document for the schema designed in the task above. [7]


5. Validate this instance Document against the schema. [Implementation] [7]

Web Resources
1. http://www.onjava.com/pub/a/onjava/2004/09/15/schema-validation.html
2. http://www.w3schools.com/Schema/default.asp
3. http://www.w3.org/XML/Schema




LAB 10: Web service, WSDL based, from Java source

Objective
WSDL based: Implement an ArithmeticService that provides add and subtract operations
Tools
GUI-IDE Tool NetBeans 6.0

Theory.
A Web service is a method of communication between two electronic devices over the World Wide
Web. It is a software function provided at a network address over the web, with the service always
on, as in the concept of utility computing.
The W3C defines a Web service as:
"a software system designed to support interoperable machine-to-machine interaction over a
network. It has an interface described in a machine-processable format (specifically WSDL).
Other systems interact with the Web service in a manner prescribed by its description using
SOAP messages, typically conveyed using HTTP with an XML serialization in conjunction
with other Web-related standards."
The W3C also states:
"We can identify two major classes of Web services:
REST-compliant Web services, in which the primary purpose of the service is to manipulate
XML representations of Web resources using a uniform set of stateless operations; and
Arbitrary Web services, in which the service may expose an arbitrary set of operations."

Web Service Architecture




The Web Services Description Language is an XML-based interface description language that is
used for describing the functionality offered by a web service. A WSDL description of a web service
(also referred to as a WSDL file) provides a machine-readable description of how the service can be
called, what parameters it expects, and what data structures it returns. It thus serves a purpose that
corresponds roughly to that of a method signature in a programming language.

WSDL is often used in combination with SOAP and an XML Schema to provide Web services over
the Internet. A client program connecting to a Web service can read the WSDL file to determine what
operations are available on the server. Any special datatypes used are embedded in the WSDL file in
the form of XML Schema. The client can then use SOAP to actually call one of the operations listed
in the WSDL file using for example XML over HTTP.
In this lab, we shall implement a Web Service and its Client.
Tasks [Implementation]
Creation of ArithmeticService Web Service [5]
1. Create a project of type Web Application. Name it Arithmetic.
i) Right click on the project folder, select New, then select Web Service. A dialog box will appear.
Specify the name of the Web service (ArithmeticService) and the package name (websvc), and select
the option Create Web Service from Scratch.
ii) The Java source file can be seen in Source view or Design view. From Design view, you will be
able to add operations. While adding an operation, you have to specify its name, return
type, and the names and types of its input parameters.
2. Go to Source view and provide the definitions of the Web service operations.

Creation of web service Client [Implementation] [5]
3. Create a new project of type Java Application. Name it ArithmeticClient.
i. Right click on the project folder and select New > Web Service Client.
ii. A dialog box will appear asking for the location of the WSDL and the client package.
For the WSDL specify
http://localhost:8080/Arithmetic/ArithmeticServiceService?WSDL
and for the client specify
websvcclient in the package option.
Make sure the Style is JAX-WS.


4. Right click in the source code of the Main class. Select the option Web Service Client
Resources > Call Web Service Operation.
A new dialog box will appear asking you to select the name of an operation. Select the add
operation.

Major Task [Implementation]
5. On similar lines as for Arithmetic Service, design and implement a TrigonometricService that
implements sin, and cos operations. [5]

6. Create a Client Application that calls the service created in Step-5 [5]



Web Resources
1. http://en.wikipedia.org/wiki/Web_service
2. http://www.w3.org/TR/wsdl
3. http://www.w3.org/TR/ws-arch/


LAB 11: Development of Ontologies using Protégé

Objective
1. Introduction to Protégé
2. Development of Ontologies using Protégé
Tools
Stanford Protégé
Theory.
Protégé
    Free, open-source ontology editor and knowledge-base framework
    Based on Java, is extensible, and provides a plug-and-play environment
    Supported by a strong community of developers and academic, government and corporate users
Pure OWL Framework
    Supports both OWL 1.1 and OWL 2.0
    Direct connection with OWL Reasoners
        Pellet
        FaCT++
        HermiT
Classes
    Sets that contain individuals
Thing
    Class representing the set containing all individuals
    All classes are subclasses of Thing
Properties
    Binary relations between two individuals (Object Property) or between one individual and a
    datatype value (Datatype Property)
Individuals
    Represent objects within the Ontology (members of classes)
Subclasses and Superclasses
    A subclass is a subcollection of objects
    For example,
        The class of laptop computers forms a subcollection of the class containing all
        (types of) computers.
        In the same way, the class of all (types of) computers is a superclass of the class
        of laptop computers.




OWL does not use the Unique Name Assumption (UNA)
    This means that different names may refer to the same individual
        E.g. the names Matt and Matthew may refer to the same individual (or they may not)
    Cardinality restrictions rely on counting distinct individuals
        Therefore it is important to specify that either Matt and Matthew are the same
        individual, or that they are different individuals
OWL Classes are assumed to overlap
    Individuals of a class A can also be individuals of class B
    Therefore one cannot assume that an individual is not a member of a particular class
    simply because it has not been asserted to be a member of that class
To separate a group of classes
    One must make them disjoint from one another
    If A is disjoint from B, then an individual of class A cannot also be an individual of
    class B

Reasoners are programs that interpret the description logic of the ontology and are able to assist
in structuring the ontology.
A class is a category, type or set of individuals within a domain.
Example:
Cat1 is an individual of class Animal.
A particular course such as CS101 would be an individual of class Course.








Task:
Using Protégé, add two classes Course and Student. [5]



Constructing Individuals
Creating an individual is a two-step process. [5]
First, create an individual.
Second, specify the type/class of the individual.
Create S1, S2 as two individuals of the Student class
Create C1, C2 as two individuals of the Course class



Using the Reasoner
In Protégé, from the menu, select Reasoner > Start Reasoner
From the DL Query tab, enter the class expression queries below into the query window and write
the results in the table. [5]

Class expressions Result
Course
Student
Course and Student
Course or Student
not Course
not(not Course)


Web Resources
1. http://semanticweb.org/wiki/Main_Page
2. http://en.wikipedia.org/wiki/Semantic_Web
3. http://en.wikipedia.org/wiki/Resource_Description_Framework
4. http://protege.stanford.edu/



LAB 12: Using Protégé Part II
Objective
1. Development of Ontologies using Protégé
Tools
Stanford Protégé

Theory.

Specifying Disjointness
Specify that the classes Student and Course are disjoint
Specifying Disjointness in Protégé:
    Proceed to the Classes tab and select both the Course and the Student classes.
    Now click the Disjoint Classes button to make the Student and the Course classes
    disjoint.



Consistency checking
    Test whether a class could have instances
Classification
    A classifier takes a class hierarchy and places a class in the class hierarchy
    It is the task of turning implicit definitions already present in the hierarchy into
    explicit ones
Selecting
    Go to the Reasoner menu and select FaCT++ as your reasoner
Running
    In the same menu, click Start Reasoner
    or simply press Ctrl-R








Task:
Create a university Course Class Hierarchy [5]



Object Properties
    Relationships between two individuals
    Correspond to relationships in UML
    For example
        Person1 hasFriend Person2
Datatype Properties
    Relationships between an individual and data values
    The term datatype is used to denote the type of a datum.
    Correspond to attributes in UML
    For example
        Person1 hasName Smith
Domain and Range
    Properties link individuals from the domain to individuals or datatypes from the range
Characteristics
    Specify the meaning of properties
Restrictions
    Explained later
Super Properties
    Properties can be further refined as sub-properties inheriting the domain, range,
    characteristics and restrictions



Task:
Creating Object Properties in a University Ontology [8]
Add an object property isTeacherOf that can be used to link a professor to a course.
Similarly, create a property called isTaughtBy, the inverse of isTeacherOf, that can be used to
link a course to the teacher of that course.



Creating Datatype Properties in a University Ontology [7]

Domain
    Classes of individuals
Range
    XML Schema datatype value (http://www.w3.org/TR/xmlschema-2/)
    RDF literal
    XML literal
Datatype properties cannot have inverse properties

Web Resources
1. http://semanticweb.org/wiki/Main_Page
2. http://en.wikipedia.org/wiki/Semantic_Web
3. http://protege.stanford.edu/
4. http://en.wikipedia.org/wiki/Resource_Description_Framework



LAB 13: Distributed Databases-I
Objective
To learn and apply the advanced concepts of Distributed Databases
Tools
Netbeans IDE.
Operating System: Ubuntu
Theory.

Data
Data, or information, is the currency of the virtual world. Changing trends in technology imply a
shift of interest towards the digital storage and processing of data. Information that used to be
processed or managed by dozens of people is now handled by a single computer, and human hours are
replaced by processor cycles. In this era of digital information it is important to keep pace
with changes in technology; keeping data both safe and available is a goal that is now within
reach. Generally we define data as:
Facts and statistics collected together for reference or analysis
and a database is defined as:
A structured set that holds data and makes it accessible in various ways
Software Applications
Applications are computer programs that perform a unified task.
For example: Opera (Web Browser), VLC (Media Player), Adobe Photoshop (Graphics
Software)
Database Applications
Applications that store and manage data are database applications. A simple example of a
database application is MySQL.
Web Applications
A web application is a set of web pages hosted by a dedicated machine called a server. A web
application runs in a browser.
A web browser is a desktop application, while a Chrome extension falls in the category of web
applications.
Analytical Applications
Applications that measure the performance of a business are called analytical applications.
These applications are used to produce analytical reports and sometimes predictive
analyses, mostly estimating trends and how they propagate through the various business layers.
Certain properties of analytical applications differentiate them from
traditional database applications.
1. Analytical queries are less predictable. For database applications the query structure depends on
the type of system and the interface that allows the user to interact with the database, but in
analytical applications the query has to change dynamically as the variable in focus changes.
2. Analytical queries are mostly read-oriented; they hardly write anything to the database, and what
they do write are the analytical reports.
3. Analytical applications mostly focus on attributes instead of entities. Averages, sums,
maxima, minima and similar aggregates are the analytical operations. Each of these
functions is computed over a single attribute across all entities.
Analytical Functions in SQL
Count, Min, Max, First, Last, Sum, Variance etc
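The idea that an analytical function runs over one attribute of all entities can also be shown outside SQL. The sketch below uses Java streams as a stand-in for a query engine; the Sale entity, its fields and the sample values are hypothetical and serve only as illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.LongSummaryStatistics;

public class AnalyticalAggregatesSketch {

    // Hypothetical entity with two attributes.
    public static final class Sale {
        public final String region;
        public final long amount;

        public Sale(String region, long amount) {
            this.region = region;
            this.amount = amount;
        }
    }

    // Hypothetical sample data.
    public static List<Sale> sampleSales() {
        return Arrays.asList(new Sale("North", 10), new Sale("South", 30), new Sale("East", 20));
    }

    // Computes count, min, max, sum and average over the single attribute "amount".
    public static LongSummaryStatistics summarize(List<Sale> sales) {
        return sales.stream().mapToLong(sale -> sale.amount).summaryStatistics();
    }

    public static void main(String[] args) {
        System.out.println(summarize(sampleSales()));
    }
}
```

The region attribute is ignored entirely, which mirrors how an aggregate such as Sum or Max looks at one column and reads every row.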

Distributed Databases
A distributed database is a database in which the storage devices are not all attached to a common
processing unit.

Database Management tools
Software that manages a database and provides an access mechanism for it is called a database
management tool.

Distributed Database Management tools
Software that manages and provides an access mechanism for a database that is spread over a
network is called a distributed database management system.



Structured Data
Structured data is data that can be modeled and classified in the form of a data model or related
tables.

Unstructured Data
Data that cannot be represented in the form of a model, a pre-defined structure or a relational
table is called unstructured data.

Tasks

Create an advanced Search Engine-I [10]

1. Create a multi-threaded and distributed Web Crawler. That means,
a. A central module should start crawling the web and distribute the URLs to crawlers running
on other machines.
b. The central crawler should be able to manage the client crawler processes and collect their
results (list of URLs and downloaded pages).

Create a (Cloud-like) Distributed File Storage [10]

2. You must be familiar with online file storage, sharing and syncing systems like Google Drive or
DropBox. Create your own distributed file storage.
a. The interface should be graphical.
b. The user should be able to log on to the system and see the online files in his profile.
c. You must implement an authentication mechanism so that the user has access to only those
files he has rights to.
d. The user should be able to share a file with other users.
e. The user should be able to sync a file with its online copy (through simple menu-based commands).

Web Resources
1. http://searchoracle.techtarget.com/definition/distributed-database
2. http://en.wikipedia.org/wiki/Distributed_database
3. http://en.wikipedia.org/wiki/Distributed_file_system
4. http://technet.microsoft.com/en-us/library/cc753479(v=WS.10).aspx


LAB 14: Distributed Databases and Map-Reduce
Objective
1. To learn and apply the advanced concepts of Distributed Databases and map-reduce
Tools
Netbeans IDE.
Operating Systems: Ubuntu
Theory.

Big Data
Data that grows beyond the processing capability of a data management tool is called big data.
The big data limit for 2012 was 2.5 exabytes (1 exabyte = 10^18 bytes).

The famous 3 Vs, attributed to a Gartner analyst:
Volume means the huge amount of data.
Velocity means the enormous rate of data generation.
Variety means the heterogeneous types of data.
These 3 Vs are the closest one can get to understanding big data.

Map Reduce
Map-reduce is one of the most widely used algorithms when it comes to handling big data. It is
responsible for distributing work among the nodes and querying data. It works on distributed data
sets, both structured and unstructured (and is particularly well suited to unstructured data).
The algorithm has two key functions, Map and Reduce. The map function reads the relevant data from
its location in the distributed environment and turns it into intermediate key-value pairs; the
reduce function combines those pairs into the final result.
The algorithm works exclusively on key-value pairs. Whatever input is given to the map function, it
treats it as a key-value pair and produces key-value pairs as a result. The data type of the output
key-value pair can be different from that of the input key-value pair.


Figure Map Reduce
The Map Reduce figure is a visual description of map-reduce. On the far left, the stack of
databases is the data source; the map functions read the required data from the data store and
prepare it for processing. The map results are then combined and fed to the reduce function, which
computes the final results and writes them to the repository. It looks neat, but it has a few
flaws too.
A famous explanatory example of map-reduce is word count. Let's say we have a file that has 2 lines
in it.
Hello World Bye World
Hello People Goodbye People
Figure Input to Map reduce
The map function reads the document sentence by sentence and breaks each sentence into words. It
counts the number of times each word appears in the sentence, fills the word-value pairs (called
intermediate pairs) and returns them for use by the reduce function.
Now a map function m1 is given line 1 to process and a map function m2 is given line 2 to process.
The results from the two map functions are:

Map Function m1
Input: Hello World Bye World
Intermediate Pairs:
< Bye, 1 >
< Hello, 1 >
< World, 2 >

Map Function m2
Input: Hello People Goodbye People
Intermediate Pairs:
< Goodbye, 1 >
< Hello, 1 >
< People, 2 >
Figure Input/Output Map Function

The reduce function gets intermediate pairs as input from a number of map functions (each
returning the word counts of a different sentence). It counts the word occurrences in the full
document and returns the final word-value pairs (called output pairs).
Data from m1 and m2 is read by a reduce function R, which computes the final results as shown
below.

Reduce function R
Input: < Bye, 1 >, < Hello, 1 >, < World, 2 >, < Goodbye , 1 >, < Hello, 1 >,
< People, 2 >
Output Pairs:
< Bye, 1 >
< Hello, 2 >
< World, 2 >
< Goodbye , 1 >
< People, 2 >
Figure Input/Output Reduce Function
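The word count walk-through above can be reproduced in a few lines of plain Java. This is a single-process sketch that follows the manual's description, where each map call counts the words of one line; it is not Hadoop code, and the class and method names are hypothetical. A sorted map is used so that the printed order is predictable.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class WordCountSketch {

    // Map step: counts the words of one line and returns the intermediate pairs.
    public static Map<String, Integer> map(String line) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String word : line.trim().split("\\s+")) {
            if (!word.isEmpty()) {
                counts.merge(word, 1, Integer::sum);
            }
        }
        return counts;
    }

    // Reduce step: sums the intermediate counts of all map results per word.
    public static Map<String, Integer> reduce(List<Map<String, Integer>> intermediate) {
        Map<String, Integer> totals = new TreeMap<>();
        for (Map<String, Integer> pairs : intermediate) {
            pairs.forEach((word, count) -> totals.merge(word, count, Integer::sum));
        }
        return totals;
    }

    // Runs one map call per line, then a single reduce over all results.
    public static Map<String, Integer> countWords(String... lines) {
        List<Map<String, Integer>> intermediate = new ArrayList<>();
        for (String line : lines) {
            intermediate.add(map(line));
        }
        return reduce(intermediate);
    }

    public static void main(String[] args) {
        // prints {Bye=1, Goodbye=1, Hello=2, People=2, World=2}
        System.out.println(countWords("Hello World Bye World", "Hello People Goodbye People"));
    }
}
```

In a distributed run the map calls would execute on different nodes and the framework would move the intermediate pairs to the node running reduce; here a list in memory plays that role.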
Map and reduce are the core functions of the algorithm, but there are various helper functions as
well. For example, a function named combine collects the results from map functions working on a
similar process and passes them on to the reduce function. Function names vary from implementation
to implementation.

Project-Task

Map-Reduce [10]
1. Write a MapReduce application which processes weather data.
a. List the hottest years from the available data (for Islamic capitals).
b. Use weather data available from the internet, or prepare it by referring to the input
discussed in the lecture.
c. Process it using pseudo-distributed mode on the Hadoop platform.

Create an advanced Search Engine-II [10]
2. Create a multi-threaded and distributed Web Crawler. That means,
a. Index the URLs and the keywords. You MUST implement it using the Map-Reduce algorithm.
b. Create a Query Engine that should be able to get input from the user and return the result.



Web Resources
1. http://hadoop.apache.org/
2. http://en.wikipedia.org/wiki/Hadoop
3. http://searchoracle.techtarget.com/definition/distributed-database
4. http://en.wikipedia.org/wiki/Distributed_database
5. http://en.wikipedia.org/wiki/Distributed_file_system
6. http://technet.microsoft.com/en-us/library/cc753479(v=WS.10).aspx
