
CLOUD INFRASTRUCTURE AND SERVICES (2180712) IT 8th Sem.

Practical-1

Aim: Introduction to Cloud Computing.

Theory:
The term cloud has historically been used as a metaphor for the Internet. This usage was
originally derived from its common depiction in network diagrams as the outline of a cloud, used to
represent the transport of data across carrier backbones (which owned the cloud) to an endpoint location
on the other side of the cloud. The concept dates back as early as 1961, when Professor John McCarthy
suggested that computer time-sharing technology might lead to a future where computing power and
even specific applications might be sold through a utility-type business model. This idea became very
popular in the late 1960s, but by the mid-1970s it faded away when it became clear that the IT-related
technologies of the day were unable to sustain such a futuristic computing model. Since the turn of the
millennium, however, the concept has been revitalized, and it was during this revitalization that the term
cloud computing began to emerge in technology circles. Cloud computing is a model for enabling
convenient, on-demand network access to a shared pool of configurable computing resources (e.g.,
networks, servers, storage, applications, and services) that can be rapidly provisioned and released with
minimal management effort or service provider interaction. A cloud is a type of parallel and distributed
system consisting of a collection of interconnected and virtualized computers that are dynamically
provisioned and presented as one or more unified computing resource(s) based on service-level
agreements established through negotiation between the service provider and consumers.

When you store your photos online instead of on your home computer, or use webmail or a social
networking site, you are using a cloud computing service. If you are in an organization and you want to
use, for example, an online invoicing service instead of updating the in-house one you have been using
for many years, that online invoicing service is a "cloud computing service." Cloud computing is the
delivery of computing services over the Internet. Cloud services allow individuals and businesses to
use software and hardware that are managed by third parties at remote locations. Examples of cloud
services include online file storage, social networking sites, webmail, and online business applications.
The cloud computing model allows access to information and computing resources from anywhere.
Cloud computing provides a shared pool of resources, including data storage space, networks, computer
processing power, and specialized corporate and user applications.
Architecture

• Cloud Service Models

• Cloud Deployment Models

• Essential Characteristics of Cloud Computing

Enrollment No. 140640116012



Fig. NIST Visual Model of Cloud Computing Definition

Cloud Service Models

• Cloud Software as a Service (SaaS)

• Cloud Platform as a Service (PaaS)

• Cloud Infrastructure as a Service (IaaS)

Infrastructure as a Service (IaaS):-

▪ The capability provided to the consumer is to provision processing, storage, networks, and other
fundamental computing resources.
▪ Consumer is able to deploy and run arbitrary software, which can include operating systems and
applications.
▪ The consumer does not manage or control the underlying cloud infrastructure but has control
over operating systems; storage, deployed applications, and possibly limited control of select
networking components (e.g., host firewalls).

Platform as a Service (PaaS):--

▪ The capability provided to the consumer is to deploy onto the cloud infrastructure consumer
created or acquired applications created using programming languages and tools supported by
the provider.


▪ The consumer does not manage or control the underlying cloud infrastructure including network,
servers, operating systems, or storage, but has control over the deployed applications and
possibly application hosting environment configurations.

Software as a Service (SaaS):--

▪ The capability provided to the consumer is to use the provider's applications running on a cloud
infrastructure.
▪ The applications are accessible from various client devices through a thin client interface such as a
web browser (e.g., web-based email).
▪ The consumer does not manage or control the underlying cloud infrastructure including network,
servers, operating systems, storage, or even individual application capabilities, with the possible
exception of limited user-specific application configuration settings.


Cloud Deployment Models:


• Public
• Private
• Community Cloud
• Hybrid Cloud

Public Cloud: The cloud infrastructure is made available to the general public or a large industry
group and is owned by an organization selling cloud services.

Private Cloud: The cloud infrastructure is operated solely for a single organization. It may be
managed by the organization or a third party, and may exist on-premises or off-premises.

Community Cloud: The cloud infrastructure is shared by several organizations and supports a
specific community that has shared concerns (e.g., mission, security requirements, policy, or
compliance considerations). It may be managed by the organizations or a third party and may exist on-
premises or off-premises.

Hybrid Cloud: The cloud infrastructure is a composition of two or more clouds (private, community,
or public) that remain unique entities but are bound together by standardized or proprietary technology
that enables data and application portability (e.g., cloud bursting for load-balancing between clouds).


ESSENTIAL CHARACTERISTICS:-

On-demand self-service:--A consumer can unilaterally provision computing capabilities, such as
server time and network storage, as needed automatically, without requiring human interaction with
each service provider.

Broad network access:--Capabilities are available over the network and accessed through standard
mechanisms that promote use by heterogeneous thin or thick client platforms (e.g., mobile phones,
laptops, and PDAs) as well as other traditional or cloud based software services.

Resource pooling:--The provider's computing resources are pooled to serve multiple consumers
using a multi-tenant model, with different physical and virtual resources dynamically assigned and
reassigned according to consumer demand.

Rapid elasticity:--Capabilities can be rapidly and elastically provisioned - in some cases automatically
- to quickly scale out, and rapidly released to quickly scale in. To the consumer, the capabilities available
for provisioning often appear to be unlimited and can be purchased in any quantity at any time.

Measured service:--Cloud systems automatically control and optimize resource usage by leveraging
a metering capability at some level of abstraction appropriate to the type of service. Resource usage can
be monitored, controlled, and reported - providing transparency for both the provider and consumer of
the service.
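The metering idea above can be sketched in plain Python. This is an illustrative simulation only (the UsageMeter class and put_object function are hypothetical, not any provider's real API): a decorator records call counts and elapsed time per named resource, giving the transparency the text describes.

```python
import time
from collections import defaultdict

# Hypothetical usage meter: counts invocations and accumulates elapsed
# time per named resource, mimicking a provider-side metering capability.
class UsageMeter:
    def __init__(self):
        self.calls = defaultdict(int)
        self.seconds = defaultdict(float)

    def metered(self, resource):
        def wrap(fn):
            def inner(*args, **kwargs):
                start = time.perf_counter()
                try:
                    return fn(*args, **kwargs)
                finally:
                    self.calls[resource] += 1
                    self.seconds[resource] += time.perf_counter() - start
            return inner
        return wrap

meter = UsageMeter()

@meter.metered("storage")
def put_object(bucket, data):
    return len(data)  # stand-in for a real storage write

put_object("photos", b"abc")
put_object("photos", b"defg")
print(meter.calls["storage"])  # 2
```

Both the provider (for billing) and the consumer (for budgeting) could read the same counters, which is the "transparency for both" point made above.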


Practical-2

Aim - Case Study: PAAS (Google App Engine).

Theory:

Platform-as-a-Service (PaaS):
Cloud computing has evolved to include platforms for building and running custom web-based
applications, a concept known as Platform-as-a-Service. PaaS is an outgrowth of the SaaS application
delivery model. The PaaS model makes all of the facilities required to support the complete life cycle of
building and delivering web applications and services entirely available from the Internet, all with no
software downloads or installation for developers, IT managers, or end users. Unlike the IaaS model,
where developers may create a specific operating system instance with homegrown applications
running, PaaS developers are concerned only with web-based development and generally do not care
what operating system is used. PaaS services allow users to focus on innovation rather than complex
infrastructure. Organizations can redirect a significant portion of their budgets to creating applications
that provide real business value instead of worrying about all the infrastructure issues in a roll-your-own
delivery model. The PaaS model is thus driving a new era of mass innovation. Now, developers around
the world can access unlimited computing power. Anyone with an Internet connection can build
powerful applications and easily deploy them to users globally.

Google App Engine:


Architecture:

The Google App Engine (GAE) is Google's answer to the ongoing trend of Cloud Computing offerings
within the industry. In the traditional sense, GAE is a web application hosting service, allowing for
development and deployment of web-based applications within a predefined runtime environment. Unlike
other cloud-based hosting offerings such as Amazon Web Services that operate on an IaaS level, the GAE
already provides an application infrastructure on the PaaS level. This means that the GAE abstracts from the
underlying hardware and operating system layers by providing the hosted application with a set of
application-oriented services. While this approach is very convenient for developers of such applications, the
rationale behind the GAE is its focus on scalability and usage-based infrastructure as well as payment.

Costs :

Developing and deploying applications for the GAE is generally free of charge but restricted to a certain
amount of traffic generated by the deployed application. Once this limit is reached within a certain time
period, the application stops working. However, this limit can be waived when switching to a billable
quota where the developer can enter a maximum budget that can be spent on an application per day.
Depending on the traffic, once the free quota is reached the application will continue to work until the
maximum budget for that day is reached. Table 1 summarizes some of what are, in our opinion, the most
important quotas and the corresponding amount charged per unit when free resources are depleted and
additional, billable quota is desired.

Features :


The GAE can be divided into three parts: the runtime environment, the Datastore, and the App Engine
services.
Runtime Environment

The GAE runtime environment presents itself as the place where the actual application is executed.
However, the application is only invoked once an HTTP request reaches the GAE via a web
browser or some other interface, meaning that the application is not constantly running if no invocation
or processing has been done. In the case of such an HTTP request, the request handler forwards the request
and the GAE selects one out of many possible Google servers where the application is then instantly
deployed and executed for a certain amount of time (8). The application may then do some computing
and return the result back to the GAE request handler which forwards an HTTP response to the client. It
is important to understand that the application runs completely embedded in this described sandbox
environment but only as long as requests are still coming in or some processing is done within the
application. The reason for this is simple: Applications should only run when they are actually
computing, otherwise they would allocate precious computing power and memory without need. This
paradigm already shows the GAE's potential in terms of scalability. Being able to run multiple instances
of one application independently on different servers guarantees for a decent level of scalability.
However, this highly flexible and stateless application execution paradigm has its limitations. Requests
are processed no longer than 30 seconds after which the response has to be returned to the client and the
application is removed from the runtime environment again (8). Obviously, this method accepts that
deploying and starting an application each time a request is processed requires additional lead time
until the application is finally up and running. The GAE tries to counter this problem by caching the
application in the server memory as long as possible, optimizing for several subsequent requests to the
same application. The type of runtime environment on the Google servers is dependent on the
programming language used. For Java or other languages that have support for Java-based compilers
(such as JRuby, Rhino and Groovy), a Java Virtual Machine (JVM) is provided. Also, GAE
fully supports the Google Web Toolkit (GWT), a framework for rich web applications. For Python and
related frameworks a Python-based environment is used.


Persistence and the Datastore


As previously discussed, the stateless execution of applications creates the need for a datastore that
provides a proper way for persistence. Traditionally, the most popular way of persisting data in web
applications has been the use of relational databases. However, setting the focus on high flexibility and
scalability, the GAE uses a different approach for data persistence, called Bigtable (14). Instead of rows
found in a relational database, in Google's Bigtable data is stored in entities. Entities are always
associated with a certain kind. These entities have properties, resembling columns in relational database
schemas. But in contrast to relational databases, entities are actually schemaless, as two entities of the
same kind do not necessarily have the same properties or even the same type of value for a certain
property. The most important difference from relational databases, however, is the querying of entities
within a Bigtable datastore. In relational databases queries are processed and executed against a database
at application runtime. GAE uses a different approach here. Instead of processing a query at application
runtime, queries are pre-processed during compilation time when a corresponding index is created. This
index is later used at application runtime when the actual query is executed. Thanks to the index, each
query is only a simple table scan where only the exact filter value is searched. This method makes
queries very fast compared to relational databases while updating entities is a lot more expensive.
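The trade-off described above (cheap reads via a pre-built index, more expensive writes) can be sketched in a few lines of plain Python. This is an assumed simplification, not the real Bigtable machinery; the entity dicts and the index structure are illustrative.

```python
# Sketch of index-backed querying: the index over the filtered property is
# built ahead of time ("compile time"), so a query at "runtime" is just a
# lookup plus scan of exact matches -- no full-table walk. The cost moves
# to updates, which must also maintain the index.
entities = {
    1: {"kind": "Product", "category": "book", "title": "Dune"},
    2: {"kind": "Product", "category": "dvd", "title": "Alien"},
    3: {"kind": "Product", "category": "book", "title": "Hyperion"},
}

# "Compile time": build an index mapping property value -> entity keys.
index_by_category = {}
for key, ent in entities.items():
    index_by_category.setdefault(ent["category"], []).append(key)

def query_by_category(value):
    # "Runtime": a simple scan of the pre-built index.
    return [entities[k] for k in index_by_category.get(value, [])]

print([e["title"] for e in query_by_category("book")])  # ['Dune', 'Hyperion']
```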

Transactions are similar to those in relational databases. Each transaction is atomic, meaning that it
either fully succeeds or fails. As described above, one of the advantages of the GAE is its scalability
through concurrent instances of the same application. But what happens when two instances try to start
transactions trying to alter the same entity? The answer to this is quite simple: Only the first instance
gets access to the entity and keeps it until the transaction is completed or eventually failed. In this case
the second instance will receive a concurrency failure exception. The GAE uses a method of handling
such parallel transactions called optimistic concurrency control. It simply denies more than one altering
transaction on an entity and implicates that an application running within the GAE should have a
mechanism trying to get write access to an entity multiple times before finally giving up. Heavily
relying on indexes and optimistic concurrency control, the GAE allows performing queries very fast
even at higher scales while assuring data consistency.
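The optimistic concurrency control mechanism just described, including the retry loop the text says an application should implement, can be sketched as follows. All names here (ConcurrencyFailure, transact, add_with_retry) are hypothetical; the real GAE API differs, but the version-check-then-retry logic is the same idea.

```python
# Minimal sketch of optimistic concurrency control: each entity carries a
# version counter; a transaction only succeeds if the entity was not
# altered since it was read, otherwise the caller retries.
class ConcurrencyFailure(Exception):
    pass

store = {"acct": {"balance": 100, "_version": 0}}

def transact(key, expected_version, update):
    ent = store[key]
    if ent["_version"] != expected_version:
        raise ConcurrencyFailure  # another instance altered the entity first
    update(ent)
    ent["_version"] += 1  # commit bumps the version

def add_with_retry(key, amount, attempts=3):
    for _ in range(attempts):
        version = store[key]["_version"]  # read entity and remember version
        try:
            transact(key, version,
                     lambda e: e.update(balance=e["balance"] + amount))
            return True
        except ConcurrencyFailure:
            continue  # re-read and try again, as the text suggests
    return False  # give up after several attempts

add_with_retry("acct", 50)
print(store["acct"]["balance"])  # 150
```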

Services

As mentioned earlier, the GAE serves as an abstraction of the underlying hardware and operating
system layers. These abstractions are implemented as services that can be directly called from the actual
application. In fact, the datastore itself is as well a service that is controlled by the runtime environment
of the application.

MEMCACHE

The platform's innate memory cache service serves as a short-term storage. As its name suggests, it stores
data in a server's memory, allowing for faster access compared to the datastore. Memcache is a
nonpersistent data store that should only be used to store temporary data within a series of
computations. Probably the most common use case for Memcache is to store session-specific data (15).
Persisting session information in the datastore and executing queries on every page interaction is highly
inefficient over the application lifetime, since session-owner instances are unique per session (16).
Moreover, Memcache is well suited to speed up common datastore queries (8). To interact with
Memcache, the GAE supports JCache, a proposed interface standard for memory caches (17).
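The speed-up pattern described above is the classic cache-aside approach: consult the in-memory cache first and only fall back to the slow datastore on a miss. The sketch below simulates it with a plain dict; the function names are illustrative, not the JCache or GAE Memcache API.

```python
# Cache-aside sketch: a dict stands in for Memcache, and a counter shows
# how many times the (slow) datastore is actually hit.
cache = {}
datastore_reads = {"count": 0}

def slow_datastore_query(key):
    datastore_reads["count"] += 1
    return key.upper()  # stand-in for a real query result

def get(key):
    if key not in cache:
        cache[key] = slow_datastore_query(key)  # populate cache on miss
    return cache[key]

get("session42")
get("session42")
print(datastore_reads["count"])  # 1 -- the second call was served from cache
```

Because Memcache is nonpersistent, real code must tolerate entries vanishing at any time, which is exactly what the miss branch handles.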


URL FETCH

Because the GAE restrictions do not allow opening sockets (18), a URL Fetch service can be used to send
HTTP or HTTPS requests to other servers on the Internet. This service works asynchronously, giving the
remote server some time to respond while the request handler can do other things in the meantime. After the
server has answered, the URL Fetch service returns the response code as well as the header and body.
Using the Google Secure Data Connector, an application can even access servers behind a company's
firewall (8).

MAIL

The GAE also offers a mail service that allows sending and receiving email messages. Mails can be sent
out directly from the application, either on behalf of the application's administrator or on behalf of
users with Google Accounts. Moreover, an application can receive emails in the form of HTTP requests
initiated by the App Engine and posted to the app at multiple addresses. In contrast to incoming emails,
outgoing messages may also have an attachment of up to 1 MB (8).

XMPP

In analogy to the mail service a similar service exists for instant messaging, allowing an application to
send and receive instant messages when deployed to the GAE. The service allows communication to
and from any instant messaging service compatible with XMPP (8), a set of open technologies for instant
messaging and related tasks (19).

IMAGES

Google also integrated a dedicated image manipulation service into the App Engine. Using this service
images can be resized, rotated, flipped or cropped (18). Additionally it is able to combine several
images into a single one, convert between several image formats and enhance photographs. Of course
the API also provides information about format, dimensions and a histogram of color values (8).

USERS

User authentication with the GAE comes in two flavors. Developers can roll their own authentication
service using custom classes, tables and Memcache, or simply plug into Google's Accounts service.
Since for most applications the time and effort of creating a sign-up page and storing user passwords is
not worth the trouble (18), the User service is a very convenient piece of functionality which gives an
easy method for authenticating users within applications. As a byproduct, thousands of Google Accounts
are leveraged. The User service detects whether a user has signed in and otherwise redirects the user to a
sign-in page. Furthermore, it can detect whether the current user is an administrator, which facilitates
implementing admin-only areas within the application (8).


OAUTH

The general idea behind OAuth is to allow a user to grant a third party limited permission to access
protected data without sharing username and password with the third party. The OAuth specification
distinguishes between a consumer, which is the application that seeks permission to access protected
data, and the service provider, who stores protected data on his users' behalf (20). Using Google
Accounts and the GAE API, applications can be an OAuth service provider (8).

SCHEDULED TASKS AND TASK QUEUES

Because background processing is restricted on the GAE platform, Google introduced task queues as
another built-in functionality (18). When a client requests an application to do certain steps, the
application might not be able to process them right away. This is where the task queues come into play.
Requests that cannot be executed right away are saved in a task queue that controls the correct sequence
of execution. This way, the client gets a response to its request right away, possibly with the indication
that the request will be executed later (13). Similar to the concept of task queues are cron jobs.
Borrowed from the UNIX world, a GAE cron job is a scheduled job that can invoke a request handler at
a prespecified time (8).
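The deferred-execution flow described above (acknowledge the client immediately, run the work later in order) can be sketched with a plain FIFO queue. The handler and worker names are illustrative, not the GAE Task Queue API.

```python
from collections import deque

# Task-queue sketch: requests that cannot run immediately are enqueued,
# the client gets an acknowledgement right away, and a worker later
# drains the queue in the correct sequence.
task_queue = deque()
results = []

def handle_request(task):
    task_queue.append(task)  # defer the work
    return "accepted"        # respond to the client immediately

def run_worker():
    while task_queue:
        fn, arg = task_queue.popleft()  # FIFO order is preserved
        results.append(fn(arg))

handle_request((str.upper, "resize photo"))
handle_request((len, "send mail"))
run_worker()
print(results)  # ['RESIZE PHOTO', 9]
```

A cron job differs only in the trigger: instead of a client request enqueuing the task, a scheduler invokes the handler at a prespecified time.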

BLOBSTORE

The general idea behind the blobstore is to allow applications to handle objects that are much larger than
the size allowed for objects in the datastore service. Blob is short for binary large object and is designed
to serve large files, such as video or high-quality images. Although blobs can be up to 2 GB in size, they
have to be processed in portions, one MB at a time. This restriction was introduced to smooth the curve
of datastore traffic. To enable queries for blobs, each blob has a corresponding blob info record which is
persisted in the datastore (8), e.g. for creating an image database.
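Processing a blob in fixed-size portions, as the 1 MB restriction requires, reduces to iterating over offsets. This is a generic sketch (the function is hypothetical, not the Blobstore API):

```python
# Process a large blob in fixed-size portions, echoing the one-MB-at-a-time
# restriction described above.
CHUNK = 1024 * 1024  # 1 MB

def process_in_chunks(blob, chunk_size=CHUNK):
    total = 0
    for offset in range(0, len(blob), chunk_size):
        portion = blob[offset:offset + chunk_size]
        total += len(portion)  # stand-in for real per-chunk processing
    return total

blob = b"x" * (3 * CHUNK + 10)  # a little over 3 MB
print(process_in_chunks(blob) == len(blob))  # True
```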

ADMINISTRATION CONSOLE

The administration console acts as a management cockpit for GAE applications. It gives the developer
real-time data and information about the current performance of the deployed application and is used to
upload new versions of the source code. At this juncture it is possible to test new versions of the
application and switch the versions presented to the user. Furthermore, access data and logfiles can be
viewed. It also enables analysis of traffic so that quota can be adapted when needed. Also, the status of
scheduled tasks can be checked, and the administrator is able to browse the application's datastore and
manage indices (8).

App Engine for Business

While the GAE is more targeted towards independent developers in need of a hosting platform for their
medium-sized applications, Google's recently launched App Engine for Business tries to target the
corporate market. Although technically mostly relying on the described GAE, Google added some
enterprise features and a new pricing scheme to make their cloud computing platform more attractive
for enterprise customers (21). Regarding the features, App Engine for Business includes a central
development manager that allows a central administration of all applications deployed within one
company including access control lists. In addition to that Google now offers a 99.9% service level
agreement as well as premium developer support.


Google also adjusted the pricing scheme for their corporate customers by offering a fixed price of $8 per
user per application, up to a maximum of $1000, per month. Interestingly, unlike the pricing scheme for
the GAE, this offer includes unlimited processing power for a fixed price of $8 per user, application and
month. From a technical point of view, Google tries to accommodate established industry standards by
now offering SQL database support in addition to the existing Bigtable datastore described above (8).
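The pricing scheme quoted above reduces to simple capped multiplication; the following worked example shows the arithmetic (the function name is illustrative):

```python
# App Engine for Business pricing as quoted in the text: $8 per user per
# application per month, capped at $1000 per application per month.
def monthly_cost(users, rate=8, cap=1000):
    return min(users * rate, cap)

print(monthly_cost(40))   # 320
print(monthly_cost(200))  # 1000 -- the cap applies beyond 125 users
```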

APPLICATION DEVELOPMENT USING GOOGLE APP ENGINE

General Idea

In order to evaluate the flexibility and scalability of the GAE we tried to come up with an application
that relies heavily on scalability, i.e. collects large amounts of data from external sources. That way we
hoped to be able to test both persistency and the gathering of data from external sources at large scale.
Therefore, our idea has been to develop an application that connects people's Delicious bookmarks with
their respective Facebook accounts. People using our application should be able to see what their
Facebook friends' Delicious bookmarks are, provided those friends have such a Delicious account. This
way a user can get a visualization of his friends' latest topics by looking at a generated tag cloud, giving
him a clue about the most common and shared interests.

PLATFORM AS A SERVICE: GOOGLE APP ENGINE:--

The Google cloud, called Google App Engine, is a 'platform as a service' (PaaS) offering. In contrast
with the Amazon infrastructure as a service cloud, where users explicitly provision virtual machines and
control them fully, including installing, compiling and running software on them, a PaaS offering hides
the actual execution environment from users. Instead, a software platform is provided along with an
SDK, using which users develop applications and deploy them on the cloud. The PaaS platform is
responsible for executing the applications, including servicing external service requests, as well as
running scheduled jobs included in the application. By making the actual execution servers transparent
to the user, a PaaS platform is able to share application servers across users who need lower capacities,
as well as automatically scale resources allocated to applications that experience heavy loads. Figure 5.2
depicts a user view of Google App Engine. Users upload code, in either Java or Python, along with
related files, which are stored on the Google File System (GFS), a very large-scale, fault-tolerant and
redundant storage system. It is important to note that an application is immediately available on the internet as
soon as it is successfully uploaded (no virtual servers need to be explicitly provisioned as in IaaS).


Resource usage for an application is metered in terms of web requests served and CPU-hours actually
spent executing requests or batch jobs. Note that this is very different from the IaaS model: A PaaS
application can be deployed and made globally available 24×7, but charged only when accessed (or if
batch jobs run); in contrast, in an IaaS model merely making an application continuously available
incurs the full cost of keeping at least some of the servers running all the time. Further, deploying
applications in Google App Engine is free, within usage limits; thus applications can be developed and
tried out free and begin to incur cost only when actually accessed by a sufficient volume of requests.
The PaaS model enables Google to provide such a free service because applications do not run in
dedicated virtual machines; a deployed application that is not accessed merely consumes storage for its
code and data and expends no CPU cycles. GAE applications are served by a large number of web
servers in Google's data centers that execute requests from end-users across the globe. The web servers
load code from the GFS into memory and serve these requests. Each request to a particular application
is served by any one of GAE's web servers; there is no guarantee that the same server will serve any
two requests, even from the same HTTP session. Applications can also specify some
functions to be executed as batch jobs which are run by a scheduler.

Google Datastore:--
Applications persist data in the Google Datastore, which is also (like Amazon SimpleDB) a
nonrelational database. The Datastore allows applications to define structured types (called 'kinds') and
store their instances (called 'entities') in a distributed manner on the GFS file system. While one can
view Datastore 'kinds' as table structures and entities as records, there are important differences
between a relational model and the Datastore, some of which are also illustrated in Figure 5.3.


Unlike a relational schema where all rows in a table have the same set of columns, all entities of a
'kind' need not have the same properties. Instead, additional properties can be added to any entity. This
feature is particularly useful in situations where one cannot foresee all the potential properties in a
model, especially those that occur occasionally for only a small subset of records. For example, a model
storing ‗products‘ of different types (shows, books, etc.) would need to allow each product to have a
different set of features. In a relational model, this would probably be implemented using a separate
FEATURES table, as shown on the bottom left of Figure 5.3. Using the Datastore, this table ('kind') is
not required; instead, each product entity can be assigned a different set of properties at runtime. The
Datastore allows simple queries with conditions, such as the first query shown in Figure 5.3 to retrieve
all customers having names in some lexicographic range.
The query syntax (called GQL) is essentially the same as SQL, but with some restrictions. For example,
all inequality conditions in a query must be on a single property; so a query that also filtered customers
on, say, their 'type' would be illegal in GQL but allowed in SQL. Relationships between tables in a
relational model are modeled using foreign keys. Thus, each account in the ACCTS table has a pointer
ckey to the customer in the CUSTS table that it belongs to. Relationships are traversed via queries using
foreign keys, such as retrieving all accounts for a particular customer, as shown. The Datastore provides
a more object-oriented approach to relationships in persistent data. Model definitions can include
references to other models; thus each entity of the Accts 'kind' includes a reference to its customer,
which is an entity of the Custs 'kind.' Further, relationships defined by such references can be traversed
in both directions, so not only can one directly access the customer of an account, but also all accounts
of a given customer, without executing any query operation, as shown in the figure.
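Both points (schemaless entities and bidirectional reference traversal) can be sketched with plain dicts. This is not the real Datastore API; the Custs/Accts names follow the figure, while the dict layout is an illustrative assumption.

```python
# Dict-based sketch of the Datastore ideas described above: entities of
# one kind need not share properties, and relationships are references
# that can be traversed in both directions without a join.
custs = {"c1": {"name": "Alice"}}
accts = {
    "a1": {"parent": "c1", "kind": "savings", "bonus_rate": 0.5},  # extra property
    "a2": {"parent": "c1", "kind": "checking"},                    # no bonus_rate
}

# Forward traversal: account -> its customer, via the stored reference.
owner = custs[accts["a1"]["parent"]]

# Reverse traversal: customer -> all accounts referencing it.
alice_accts = [k for k, a in accts.items() if a["parent"] == "c1"]

print(owner["name"], sorted(alice_accts))  # Alice ['a1', 'a2']
```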
GQL queries cannot execute joins between models. Joins are critical when using SQL to efficiently
retrieve data from multiple tables. For example, the query shown in the figure retrieves details of all
products bought by a particular customer, for which it needs to join data from the transactions (TXNS),


products (PRODS) and product features (FEATURES) tables. Even though GQL does not allow joins,
its ability to traverse associations between entities often enables joins to be avoided, as shown in the
figure for the above example: By storing references to customers and products in the Txns model, it is
possible to retrieve all transactions for a given customer through a reverse traversal of the customer
reference. The product references in each transaction then yield all products and their features (as
discussed earlier, a separate Features model is not required because of schema flexibility). It is
important to note that while object relationship traversal can be used as an alternative to joins, this is not
always possible, and when required joins may need to be explicitly executed by application code.
The Google Datastore is a distributed object store where objects (entities) of all GAE applications are
maintained using a large number of servers and the GFS distributed file system. From a user perspective, it
is important to ensure that in spite of sharing a distributed storage scheme with many other users, application
data is (a) retrieved efficiently and (b) atomically updated. The Datastore provides a mechanism to group
entities from different ‗kinds‘ in a hierarchy that is used for both these purposes.
Notice that in Figure 5.3entities of the Accts and Txns ‗kinds‘ are instantiated with a parameter ‗parent‘ that
specifies a particular customer entity, thereby linking these three entities in an ‗entity group‘. The Datastore
ensures that all entities belonging to a particular group are stored close together in the distributed file system
(we shall see how in Chapter 10). The Datastore allows processing steps to be grouped into transactions
wherein updates to data are guaranteed to be atomic; however this also requires that each transaction only
manipulates entities belonging to the same entity group. While this transaction model suffices for most
online applications, complex batch updates that update many unrelated entities cannot execute atomically,
unlike in a relational database where there are no such restrictions.

Amazon SimpleDB:
Amazon SimpleDB is also a nonrelational database, in many ways similar to the Google Datastore.
SimpleDB 'domains' correspond to 'kinds', and 'items' to entities; each item can have a number of
attribute-value pairs, and different items in a domain can have different sets of attributes, similar to
Datastore entities. Queries on SimpleDB domains can include conditions, including inequality
conditions, on any number of attributes. Further, just as in the Google Datastore, joins are not permitted.
However, SimpleDB does not support object relationships as in Google Datastore, nor does it support
transactions. It is important to note that all data in SimpleDB is replicated for redundancy, just as in
GFS. Because of replication, SimpleDB features an 'eventual consistency' model, wherein data is
guaranteed to be propagated to at least one replica and will eventually reach all replicas, albeit with
some delay. This can result in perceived inconsistency, since an immediate read following a write may
not always yield the result written. In the case of Google Datastore on the other hand, writes succeed
only when all replicas are updated; this avoids inconsistency but also makes writes slower.
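The difference between eventual consistency and write-all-replicas consistency can be illustrated with a toy replica simulation (names and mechanics are invented for illustration; real SimpleDB/GFS replication is far more involved):

```python
# Toy simulation of eventual consistency: a write lands on one replica
# immediately and reaches the others only after propagation.
class ReplicatedStore:
    def __init__(self, n_replicas=3):
        self.replicas = [dict() for _ in range(n_replicas)]
        self.pending = []            # writes not yet propagated everywhere

    def write(self, key, value):
        # The write is guaranteed to reach at least one replica at once...
        self.replicas[0][key] = value
        self.pending.append((key, value))

    def read(self, key, replica=0):
        return self.replicas[replica].get(key)

    def propagate(self):
        # ...and the remaining replicas only after some delay.
        for key, value in self.pending:
            for r in self.replicas[1:]:
                r[key] = value
        self.pending.clear()

store = ReplicatedStore()
store.write("x", 42)
stale = store.read("x", replica=2)   # read from a not-yet-updated replica
store.propagate()
fresh = store.read("x", replica=2)   # now consistent
```

The stale read models exactly the "immediate read following a write may not yield the result written" behaviour described above; the Google Datastore avoids it by making `write` block until every replica is updated, at the cost of slower writes.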

Enrollment No. 140640116012

Practical-3

Aim - Sketch out and analyze the architecture of Amazon Web Services (AWS).

In 2006, Amazon Web Services (AWS) began offering IT infrastructure services to businesses in the
form of web services, now commonly known as cloud computing. Amazon Web Services (AWS), a
collection of remote computing services, also called web services, makes up a cloud-computing platform
offered by Amazon.com.

Today, Amazon Web Services provides a highly reliable, scalable, low-cost infrastructure platform in
the cloud that powers hundreds of thousands of businesses in 190 countries around the world.

The Amazon Web Services (AWS) cloud provides a highly reliable and scalable infrastructure for
deploying web-scale solutions, with minimal support and administration costs, and more flexibility than
you’ve come to expect from your own infrastructure, either on-premises or at a data centre facility.

AWS offers a variety of infrastructure services today.

Fig: Amazon Web Services


Components:
Amazon Elastic Compute Cloud (Amazon EC2):

It is a web service that provides resizable compute capacity in the cloud. It can bundle the operating
system, application software and associated configuration settings into an Amazon Machine Image
(AMI).

We can purchase On-Demand Instances in which you pay for the instances by the hour or Reserved
Instances in which you pay a low, one-time payment and receive a lower usage rate to run the instance
than with an On-Demand Instance or Spot Instances where you can bid for unused capacity and further
reduce your cost.
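The three purchasing options can be compared with a small cost model. All price figures below are hypothetical, chosen only to illustrate the trade-off, and are not current AWS rates:

```python
# Hypothetical hourly rates for the three EC2 purchasing options.
ON_DEMAND_RATE = 0.10        # $/hour, no commitment
RESERVED_UPFRONT = 300.0     # one-time payment
RESERVED_RATE = 0.04         # lower $/hour after the upfront payment
SPOT_RATE = 0.03             # market price when your bid is accepted

def on_demand_cost(hours):
    return ON_DEMAND_RATE * hours

def reserved_cost(hours):
    return RESERVED_UPFRONT + RESERVED_RATE * hours

def spot_cost(hours):
    return SPOT_RATE * hours

# Break-even point: a Reserved Instance beats On-Demand once the hourly
# saving has paid back the upfront fee.
break_even_hours = RESERVED_UPFRONT / (ON_DEMAND_RATE - RESERVED_RATE)
```

Under these made-up rates the reserved option pays for itself after 5000 instance-hours; Spot is cheapest per hour but offers no availability guarantee.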

Amazon S3 Objects & Buckets:

Amazon S3 is highly durable and distributed data store. With a simple web services interface, you can
store and retrieve large amounts of data as objects in buckets (containers) at any time, from anywhere
on the web using standard HTTP verbs.
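The bucket/object model can be sketched as a toy in-memory store whose method names mirror the HTTP verbs used by the real S3 REST API. This is purely an illustration of the data model, not the AWS SDK:

```python
# Toy model of S3's bucket/object interface: objects are opaque byte blobs
# addressed by a key inside a named bucket.
class Bucket:
    def __init__(self, name):
        self.name = name
        self.objects = {}                # object key -> bytes

    def put(self, key, data: bytes):     # like HTTP PUT /bucket/key
        self.objects[key] = data

    def get(self, key) -> bytes:         # like HTTP GET /bucket/key
        return self.objects[key]

    def delete(self, key):               # like HTTP DELETE /bucket/key
        self.objects.pop(key, None)

photos = Bucket("my-photos")
photos.put("2018/trip.jpg", b"\xff\xd8...jpeg bytes...")
data = photos.get("2018/trip.jpg")
photos.delete("2018/trip.jpg")
```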


Amazon SimpleDB Domains:

Amazon SimpleDB is a web service that provides the core functionality of a database: real-time
lookup and simple querying of structured data, without the operational complexity.

Amazon Elastic Block Storage (Amazon EBS):

Amazon Elastic Block Storage (EBS) volumes provide network-attached persistent storage to Amazon
EC2 instances. Point-in-time consistent snapshots of EBS volumes can be created and stored on
Amazon Simple Storage Service (Amazon S3).

Amazon Virtual Private Cloud(Amazon VPC):

It allows you to extend your corporate network into a private cloud contained within AWS. Amazon
VPC uses IPSec tunnel mode that enables you to create a secure connection between a gateway in your
data center and a gateway in AWS.

Amazon Relational Database Service (Amazon RDS):


It provides an easy way to setup, operate and scale a relational database in the cloud. You can launch a
DB Instance and get access to a full-featured MySQL database and not worry about common database
administration tasks like backups, patch management etc.

Amazon Simple Queue Service (Amazon SQS):

It is a reliable, highly scalable, hosted distributed queue for storing messages as they travel between
computers and application components.
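The decoupling that SQS provides can be sketched with Python's thread-safe `queue.Queue` standing in for the hosted service (component names are invented; real SQS is a distributed, network-accessed queue):

```python
# Sketch of the queue pattern: a producer and a consumer exchange messages
# through a queue without ever calling each other directly.
import queue
import threading

q = queue.Queue()

def producer():
    for i in range(3):
        q.put(f"order-{i}")        # component A enqueues messages...

def consumer(results):
    for _ in range(3):
        results.append(q.get())    # ...component B dequeues and processes
        q.task_done()              # them later, at its own pace.

received = []
t1 = threading.Thread(target=producer)
t2 = threading.Thread(target=consumer, args=(received,))
t1.start(); t2.start()
t1.join(); t2.join()
```

Because the queue buffers messages, the producer and consumer can run at different speeds, or even at different times, which is exactly what makes the pattern useful between distributed application components.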

Amazon CloudFront Service:

Amazon CloudFront is a web service for content delivery (static or streaming content).

Amazon Simple Notifications Service (Amazon SNS):

It provides a simple way to notify applications or people from the cloud by creating Topics and using a
publish-subscribe protocol.
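The topic-based publish-subscribe model can be sketched in a few lines; the `Topic` class and subscriber names below are invented for illustration and do not represent the SNS API:

```python
# Minimal publish-subscribe topic: publishers post to a named Topic and
# every subscriber is notified (fan-out), mirroring the SNS model.
class Topic:
    def __init__(self, name):
        self.name = name
        self.subscribers = []            # callables invoked on publish

    def subscribe(self, callback):
        self.subscribers.append(callback)

    def publish(self, message):
        for notify in self.subscribers:  # one message, N receivers
            notify(message)

alerts = Topic("billing-alerts")
email_log, sms_log = [], []
alerts.subscribe(lambda m: email_log.append(f"email: {m}"))
alerts.subscribe(lambda m: sms_log.append(f"sms: {m}"))
alerts.publish("budget exceeded")
```

Unlike the point-to-point queue of SQS, a topic delivers each message to every subscriber, which is what makes it suitable for notifying applications or people.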

Amazon Services:

• Amazon API Gateway is a service for publishing, maintaining and securing web service APIs.
• Amazon Cloud Search provides basic full-text search and indexing of textual content.
• Amazon DevPay, currently in limited beta version, is a billing and account management system
for applications that developers have built atop Amazon Web Services.
• Amazon Elastic Transcoder (ETS) provides video transcoding of S3 hosted videos, marketed
primarily as a way to convert source files into mobile-ready versions.
• Amazon Flexible Payments Service (FPS) provides an interface for micropayments.
• Amazon Simple Email Service (SES) provides bulk and transactional email sending.
• Amazon Simple Queue Service (SQS) provides a hosted message queue for web applications.
• Amazon Simple Notification Service (SNS) provides a hosted multi-protocol "push" messaging
for applications.
• Amazon Simple Workflow (SWF) is a workflow service for building scalable, resilient
applications.
• Amazon Cognito is a simple user identity and data synchronization service that helps you securely
manage and synchronize app data for your users across their mobile devices.
• Amazon AppStream is a flexible, low-latency service that lets you stream resource-intensive
applications and games from the cloud.


Practical-4

Aim - Working of Google Drive to make spreadsheets and notes.

Requirement: Google account, Internet Connection.

THEORY:
Google Docs is a free cloud-based suite of tools for creating documents, spreadsheets, presentations,
and more. This tutorial will cover the Spreadsheets application in Google Docs, in addition to showing
you how to access and store your Docs from Google Drive.

Google Docs, Sheets, and Slides are productivity apps that let you create different kinds of online
documents, work on them in real time with other people, and store them in your Google Drive online — all
for free. You can access the documents, spreadsheets, and presentations you create from any computer,
anywhere in the world. (There's even some work you can do without an Internet connection!) This guide will
give you a quick overview of the many things that you can do with Google Docs, Sheets, and Slides.

Google Docs

Google Docs is an online word processor that lets you create and format text documents and collaborate
with other people in real time. Here's what you can do with Google Docs:

• Upload a Word document and convert it to a Google document


• Add flair and formatting to your documents by adjusting margins, spacing, fonts, and colors —
all that fun stuff
• Invite other people to collaborate on a document with you, giving them edit, comment or view
access
• Collaborate online in real time and chat with other collaborators, right from inside the document
• View your document's revision history and roll back to any previous version
• Download a Google document to your desktop as a Word, OpenOffice, RTF, PDF, HTML or
zip file
• Translate a document to a different language
• Email your documents to other people as attachments

Google Sheets

Google Sheets is an online spreadsheet app that lets you create and format spreadsheets and
simultaneously work with other people. Here's what you can do with Google Sheets:

• Import and convert Excel, .csv, .txt and .ods formatted data to a Google spreadsheet
• Export Excel, .csv, .txt and .ods formatted data, as well as PDF and HTML files


• Use formula editing to perform calculations on your data, and use formatting to make it look the
way you'd like
• Chat in real time with others who are editing your spreadsheet
• Create charts with your data
• Embed a spreadsheet — or individual sheets of your spreadsheet — on your blog or website

Google Slides

Google Slides is an online presentations app that allows you to show off your work in a visual way.
Here's what you can do with Google Slides:

• Create and edit presentations


• Edit a presentation with friends or coworkers, and share it with others effortlessly
• Import .pptx and .pps files and convert them to Google presentations
• Download your presentations as a PDF, a PPT, or a .txt file
• Insert images and videos into your presentation
• Publish and embed your presentations in a website
Create, name or delete a Google document

Create a Google document

To create a new document, go to your Drive, click the Create button, and select Document.

A window with a new Google document will open, and you'll be able to edit the document, share it with
other people, and collaborate on it in real-time. Google Docs saves your document automatically, and
you can always access it from your Drive.

Name a document

When you create a new document, Google Docs will name it Untitled by default.

To choose a name other than Untitled, click the File menu, and select Rename. From here you can choose
and confirm your document's title. You can also edit the name by clicking the title displayed at the top of the
page, and making your changes in the dialog that appears. Titles can be up to 255 characters long.

Delete a document

Delete an item that you own from your Drive

1. From your Drive, select the item(s) you want to delete.


2. From the More menu, choose Move to trash.
3. If you're deleting a shared document that you own, you'll see an option to change the ownership
of the document.
4. The item will be moved to the Trash.
5. To purge individual items from Trash, select them and choose Delete forever. To purge all your
items click Empty Trash in the upper left.

Create and save a document

There are different ways of getting started using Google documents: you can create a new online
document, you can upload an existing one, or you can use a template from our templates gallery.


To create a new document, go to your Drive, click the red Create button, and select Document from the
drop-down menu.

As soon as you name the document or start typing, Google Docs will automatically save your work
every few seconds. At the top of the document, you'll see text that indicates when your document was
last saved. You can access your document at any time by opening your Drive at http://drive.google.com.

To save a copy of a document to your computer, you can download it. In your document, go to the File menu
and point your mouse to the Download as option. Select one of the following file types: HTML (zipped),
RTF, Word, Open Office, PDF, and plain text. Your document will download to your computer.

Upload a document

You can upload existing documents to Google documents at any time. When you're uploading, you can
either keep your document in its original file type or convert it to Google Docs format. Converting your
document to Google Docs format allows you to edit and collaborate online from any computer.
Note: When uploaded, images within a document are left as images (rather than being converted to text
by Optical Character Recognition technology).

You can upload the following file types:

• .html
• .txt
• .odt
• .rtf
• .doc and .docx
• .pdf

Follow these steps to upload a document:

1. Click the Upload icon in the top left of your Documents List.
2. Click Files..., and select the document you'd like to upload.
3. Click Open.
4. Check the box next to 'Convert documents, presentations, spreadsheets, and drawings to the
corresponding Google Docs format' if you'd like to be able to edit and collaborate on the
document online. Uploaded document files that are converted to Google documents format can't
be larger than 1 MB.
5. Click Start upload. The uploaded file will appear in your Documents List.


Practical-5

Aim - Write Literature review on Virtual Machine Migration Techniques in Cloud Computing.

Cloud computing is an emerging area of the IT industry, and today much of the IT business has
migrated to the cloud. Cloud computing provides access to computing resources for a fee, on a
pay-per-use model. Client applications and services can be hosted in the cloud. As the number of users
of cloud services grows, Virtual Machine Migration becomes necessary to keep resources available and
to satisfy user demand for resources. In cloud computing, Virtual Machine Migration is a useful tool for
migrating operating system instances across multiple physical machines. It is used for load balancing,
fault management, low-level system maintenance and reducing energy consumption. There are various
techniques for Virtual Machine Migration; this paper surveys them.

1. Introduction
Cloud computing distributes the computing tasks to the resource pool made from a large number of
computers. Virtualization assigns a logical name for a physical resource and then provides a pointer to
that physical resource when a request is made. Virtualization can also be defined as the abstraction of
the four computing resources (storage, processing power, memory, and network or I/O). The
virtualization technology introduces software abstraction layer which is called Virtual Machine Monitor
(VMM) or hypervisor. Basically, two virtualization approaches were used. In hosted architecture, the
virtualization layer was installed as an application on top of operating system and it supports broader
range of hardware. Hypervisor (bare-metal) architecture installs virtualization layer directly on standard
x86 hardware. There are three virtualization techniques:
Full Virtualization: All operating systems in full virtualization communicate directly with the VM
hypervisor, so guest operating systems do not require any modification. Guest operating systems in full
virtualization systems are generally faster than in other virtualization schemes.
Para Virtualization: Para virtualization requires that the host operating system provide a virtual
machine interface for the guest operating system and that the guest access hardware through that host
VM. An operating system running as a guest on a para virtualization system must be ported to work
with the host interface.
Emulation: The virtual machine simulates hardware, so it can be independent of the underlying system
hardware. A guest operating system using emulation does not need to be modified in any way.


2. VIRTUAL MACHINE MIGRATION


VMs refer to one instance of an operating system along with one or more applications running in an
isolated partition within the computer. There will be multiple virtual machines running on top of a
single physical machine. When one physical host gets overloaded, it may be required to dynamically
transfer certain amount of its load to another machine with minimal interruption to the users. This
process of moving a virtual machine from one physical host to another is termed as migration. In the
past, to move a VM between two physical hosts, it was necessary to shut down the VM, allocate the
needed resources to the new physical host, move the VM files and start the VM in the new host. Virtual
machine migration has two types of techniques:

Live Migration: Live migration can be defined as the movement of a virtual machine from one
physical host to another while being powered on. When it is properly carried out, this process takes
place without any noticeable effect from the end user’s point of view.
Regular Migration: Cold migration is the migration of a powered-off virtual machine. With cold
migration, you have the option of moving the associated disks from one data store to another. The
virtual machines are not required to be on a shared storage.

3. Summary of Literature Review


In the Cloud Computing Virtual Machine Migration is major issue for manages load balancing, fault
management, low-level system management and reduce energy consumption. So here we discuss some
Virtual Machine Migration techniques.

3.1 Application-aware Virtual Machine Migration in Data Centers


Part of the challenge is due to the inherent dependencies between VMs comprising a multi-tier application,
which introduce complex load interactions between the underlying physical servers. This paper introduces
AppAware, a novel, computationally efficient scheme for incorporating inter-VM dependencies and the
underlying network topology into VM migration decisions. AppAware accepts as input the dependency
graph, whose weights are obtained from measuring the volume of traffic transferred between any two VMs.
The algorithm also takes as input the network diameter Distance of the network topology of physical
machines, and an existing mapping of virtual machines to physical machines. Also, the migration set
indicates which VM should be migrated to which physical machine. For each overloaded virtual machine,
the total communication weight of all its incoming edges is computed. The overloaded virtual

machines are then sorted in descending order of their total weight. The migration decision procedure is
repeated until a mapping has been identified for all overloaded virtual machines or no other mappings
can be found. Using simulations, the authors show that the proposed method decreases network traffic by
up to 81% compared to an alternative VM migration method that is not application-aware.
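The sorting step described above can be sketched in a few lines. The traffic graph, VM names and weights below are made up for illustration; they are not the paper's data:

```python
# Compute each overloaded VM's total incoming communication weight and
# order the VMs by it (heaviest communicators considered first).
traffic = {                      # directed edges: (src VM, dst VM) -> weight
    ("web1", "app1"): 40,
    ("web2", "app1"): 25,
    ("app1", "db1"): 60,
    ("web1", "db1"): 10,
}
overloaded = ["app1", "db1"]

def total_incoming_weight(vm):
    return sum(w for (src, dst), w in traffic.items() if dst == vm)

# Descending order of total weight, as in the migration decision procedure.
migration_order = sorted(overloaded, key=total_incoming_weight, reverse=True)
```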

3.2 Minimizing Communication Traffic in Data Centers with Power-aware VM Placement


This paper shows that one can save the cost of resource usage and improve the performance of applications
at the same time by optimizing the placement of VMs. The objective is to minimize the total traffic in a data
center. The paper consolidates VMs with high inter-VM traffic on the same PM, because VMs on the same
PM can communicate using only memory copy. The number of active PMs also needs to be reduced to save
power cost. The paper first formulates VM placement as an optimization problem, and then proposes a
heuristic algorithm based on clustering to deploy VMs on PMs. A greedy algorithm is used for the
online scenario. For VM consolidation, two algorithms are proposed in this paper:
1) K-means Clustering
2) K-means Clustering for VM Consolidation.
The experiments use data sets collected from a data center. Results show that the algorithm can
efficiently reduce the overall traffic and power cost in the data center.
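To make the clustering idea concrete, here is a minimal one-dimensional k-means that groups VMs by their traffic volume, so high-traffic VMs become consolidation candidates for the same PM. This is a generic k-means sketch with invented numbers, not the paper's actual algorithm, and the fixed initial centroids keep the run deterministic:

```python
# Minimal 1-D k-means: assign each value to its nearest centroid, then move
# each centroid to the mean of its cluster, and repeat.
def kmeans_1d(values, centroids, iters=10):
    clusters = [[] for _ in centroids]
    for _ in range(iters):
        # Assignment step: each value joins its nearest centroid.
        clusters = [[] for _ in centroids]
        for v in values:
            nearest = min(range(len(centroids)),
                          key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # Update step: move each centroid to the mean of its cluster.
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return centroids, clusters

# Hypothetical traffic volume (MB/s) each VM exchanges with an app tier.
vm_traffic = [1, 2, 2, 50, 55, 60]
centroids, clusters = kmeans_1d(vm_traffic, centroids=[0.0, 90.0])
# Low-traffic VMs form one group; the chatty VMs in the other group are
# candidates for placement on the same PM.
```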

3.3 Policy-based Agents for Virtual Machine Migration in Cloud Data Centers
In this paper, an agent-based distributed approach capable of balancing different types of workloads like
memory workload by using virtual machine live migration is proposed. Agents acting as server managers are
equipped with 1) a collaborative workload balancing protocol, and 2) a set of workload balancing policies
like resource usage migration thresholds and virtual machine migration heuristics to simultaneously consider
both server heterogeneity and virtual machine heterogeneity. The agent-based framework for Cloud data
center workload balancing consists of virtual machine agents, server manager agents, front-end agents, and
user agents. Virtual machine agents are in charge of monitoring VM resource usages. VMAs send
monitoring reports to server manager agents. VMAs are deployed on VMs. Server manager agents are in
charge of: 1) handling VM request allocations, 2) allocating and removing VMs,
3) triggering VM migrations 4) migrating VMs 5) collecting and summarizing VM monitoring
information 6) setting the administrator-defined workload balancing policies. The experimental results
show that policy-based workload balancing is effectively achieved despite dealing with server
heterogeneity and heterogeneous workloads.
3.4 Network Aware VM Migration in Cloud Data Centers
This paper evaluates the performance of VM Patrol in an experimental GENI test bed characterized by
wide-area network dynamics and realistic traffic scenarios. It deploys OpenFlow end-to-end QoS
policies to reserve the minimum bandwidths required for successful VM migration. Migration of VMs
generates variable amount of network traffic between the source and the destination hosts. The volume
of the network traffic depends on the VM’s image size, its page dirty rate, the migration completion
deadline and the available bandwidth along the migration path. The cost of migration model is based on
a pre-copy live migration technique. The results indicate that time taken to complete VM Migration
depends on VMs memory size, VM page dirty rate and the available bandwidth. The results also
indicate that length of stop copy phase and minimum required progress amount are critical parameters in
estimating the VM migration cost.
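The dependence of migration time on memory size, page dirty rate and bandwidth can be seen in a back-of-the-envelope pre-copy model. The numbers and the stop condition below are illustrative assumptions, not the paper's cost model:

```python
# Pre-copy live migration: round 1 pushes the whole memory image; each
# later round pushes only the pages dirtied during the previous round,
# until the remainder is small enough for the final stop-and-copy phase.
def precopy_migration_time(mem_mb, dirty_rate_mbps, bw_mbps,
                           stop_threshold_mb=8, max_rounds=30):
    """Return (total seconds, rounds) for a simple pre-copy migration."""
    to_send = mem_mb                       # round 1 sends the whole image
    total_time = 0.0
    for rounds in range(1, max_rounds + 1):
        t = to_send / bw_mbps              # time to push this round's data
        total_time += t
        to_send = dirty_rate_mbps * t      # pages dirtied meanwhile
        if to_send <= stop_threshold_mb:   # small enough: stop-and-copy
            total_time += to_send / bw_mbps
            return total_time, rounds
    return total_time, max_rounds          # deadline-style cutoff

time_s, rounds = precopy_migration_time(mem_mb=4096, dirty_rate_mbps=50,
                                        bw_mbps=1000)
```

Increasing the dirty rate or shrinking the bandwidth makes each round leave more residue behind, lengthening (or preventing) convergence, which is exactly why these parameters dominate the migration cost.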

3.5 Geography Aware Virtual Machine Migration for Distributed Cloud Data Centers
This paper focuses on decreasing the average distance between clients and the VMs they use, by
periodically using VM migration to move VMs such that the average distance is reduced. This paper
introduces a framework for a system that identifies potential candidate VMs to migrate to target data centers
chosen at run-time. The approach uses the geographical distribution of requests at the end of a period of
time, with ti representing the end time of the ith time period, to place virtual machines for the

next time period. This may require multiple VMs to be migrated. Each request is classified into a region
based on the origin of the request. There are mainly four modules in this paper. The output of the
classification module is used as input to an algorithm that determines if a VM should be migrated to a
different data center. The selector module uses the load-distance and distance metrics. The migration
module receives the set of candidate VMs for migration. For each candidate VM, the migration module
first determines the migration cost. The migration module then initiates the migration through the use of
the SDN controller, which programmatically removes the matching forwarding entries of the switches for
migrating VMs and redirects the flows to the appropriate data center.
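A simple distance-based selection, in the spirit of the metric described above, can be sketched as follows. The data-center coordinates, regions and request counts are invented for illustration:

```python
# Pick the data center that minimizes the request-weighted mean distance
# to the client regions observed in the last time period.
import math

data_centers = {"us-east": (40.0, -75.0), "eu-west": (53.0, -6.0)}

# (lat, lon) of each client region and its request count in the period.
requests = [((48.0, 2.0), 120),     # Paris
            ((52.0, 13.0), 80),     # Berlin
            ((41.0, -74.0), 10)]    # New York

def mean_request_distance(dc_pos):
    total = sum(n for _, n in requests)
    return sum(n * math.dist(dc_pos, pos) for pos, n in requests) / total

best_dc = min(data_centers,
              key=lambda dc: mean_request_distance(data_centers[dc]))
```

With most requests coming from Europe, the European data center wins; in the paper this geometric score would be combined with load and migration cost before a VM is actually moved.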

4. CONCLUSION
Cloud computing is the new paradigm where computing is an on-demand service, and virtual machine
migration plays an important role in it. With the increase in the popularity of cloud computing systems,
virtual machine migrations across data centers and diverse resource pools will be greatly beneficial to
data center administrators. In this survey paper, we reviewed Network-Aware, Power-Aware,
Application-Aware and Geography-Aware virtual machine migration techniques. We also studied cloud
computing, virtualization, types of virtualization, virtual machines, virtual machine migration and its
various techniques.
Practical-6
Aim - Installation and Configuration of Justcloud.

Requirement: Justcloud exe File

THEORY:
Professional cloud storage from JustCloud is simple, fast and secure. JustCloud will automatically
back up the documents, photos, music and videos stored on your computer to the cloud, so you are never
without your files again.

Installation :

1. Download the software from this link: http://www.justcloud.com/download/


2. By following these steps you will download and install the JustCloud software
application on this computer. This software will automatically start backing up
files from your computer and saving them securely in an online cloud user
account. Your free account gives you 15MB storage space or 50 files for 14
days. Once installed a sync folder will be added to your desktop for you to easily
drag and drop files you wish to backup.

25

Enrollment No. 150640116021


th
CLOUD INFRASTRUCTURE AND SERVICES (2180712) IT 8 Sem.

Practical-7
Aim - Sketch out and analyse architecture of Microsoft Azure.

Microsoft Azure:
Microsoft’s Windows Azure Platform provides a familiar and flexible environment to drive and support
specific needs and services of the development team, customers and users.

Microsoft Azure Services:

Fig: 1 Microsoft Azure Services


Azure includes following services:

i. Compute Services
ii. Data Services
iii. Application Services
iv. Network Services

Compute Service:
This includes Microsoft Azure Cloud Services, Azure Virtual Machines, Azure Websites, and Azure
Mobile Services.
Windows Azure compute services provide the processing power required for cloud applications to be
able to run. Windows Azure currently offers four different compute services:

Virtual Machines
This service provides you with a general-purpose computing environment that lets you create, deploy,
and manage virtual machines running in the Windows Azure cloud.
Web Sites
This service provides you with a managed web environment you can use to create new websites or
migrate your existing business website into the cloud.
Cloud Services
This service allows you to build and deploy highly available and almost infinitely scalable applications
with low administration costs using almost any programming language.
Mobile Services
This service provides a turnkey solution for building and deploying apps and storing data for mobile
devices.

Data Services:
Windows Azure data services provide you with different ways of storing, managing, safeguarding,
analyzing, and reporting business data. Windows Azure currently offers five different data services:
Data Management
This service lets you store your business data in SQL databases, either with dedicated Microsoft SQL
Server virtual machines, using Windows Azure SQL Database, using NoSQL Tables via REST, or
using BLOB storage.
Business Analytics
This service enables ease of discovery and data enrichment using Microsoft SQL Server Reporting and
Analysis Services or Microsoft SharePoint Server running in a virtual machine, Windows Azure SQL
Reporting, the Windows Azure Marketplace, or HDInsight, a Hadoop implementation for Big Data.
HDInsight
This is Microsoft’s Hadoop-based service which brings a 100 percent Apache Hadoop solution to the
cloud.
Cache
This service provides a distributed caching solution that can help speed up your cloud-based
applications and reduce database load.

Backup
This service helps you protect your server data offsite by using automated and manual backups to
Windows Azure.
Recovery Manager
Windows Azure Hyper-V Recovery Manager helps you protect business critical services by
coordinating the replication and recovery of System Center 2012 private clouds at a secondary
location.
App services
Windows Azure app services provide you with ways of enhancing the performance, security,
discoverability, and integration of your cloud apps that are running. Windows Azure currently offers
seven different app services:

Media Services
This service allows you to build workflows for the creation, management, and distribution of media
using the Windows Azure public cloud.
Messaging
This consists of two services (Windows Azure Service Bus and Windows Azure Queue) that allow you to
keep your apps connected across your private cloud environment and the Windows Azure public cloud.
Notification Hubs
This service provides a highly scalable, cross-platform push notification infrastructure for applications
running on mobile devices.

BizTalk Services
This service provides Business-to-Business (B2B) and Enterprise Application Integration (EAI)
capabilities for delivering cloud and hybrid integration solutions.
Active Directory
This service provides you with identity management and access control capabilities for your cloud
applications.
Multifactor Authentication
This service provides an extra layer of authentication, in addition to the user’s account credentials, in
order to better secure access for both on-premises and cloud applications.


Practical-8
Aim - Sketch out and analyze the architecture of Aneka / Eucalyptus / KVM and identify the different
entities to understand its structure.

ANEKA
Aneka is a platform and a framework for developing distributed applications on the Cloud. It harnesses
the spare CPU cycles of a heterogeneous network of desktop PCs and servers or datacenters on demand.
Aneka provides developers with a rich set of APIs for transparently exploiting such resources and
expressing the business logic of applications by using the preferred programming abstractions.
Aneka is based on the .NET framework and this is what makes it unique from a technology point
of view as opposed to the widely available Java based solutions. While mostly designed to exploit
the computing power of Windows based machines, which are most common within an enterprise
environment, Aneka is portable over different platforms and operating systems.

Fig:1 Aneka Architecture


The Aneka based computing cloud is a collection of physical and virtualized resources connected
through a network, which could be the Internet or a private intranet. Each of these resources hosts an
instance of the Aneka Container representing the runtime environment in which the distributed
applications are executed.

The architecture and the implementation of the Container play a key role in supporting these three
features: the Aneka cloud is flexible because the collection of services available on the container can be
customized and deployed according to the specific needs of the application.

Anatomy of the Aneka Container

The Container represents the basic deployment unit of Aneka based Clouds. The network of containers
defining the middleware of Aneka constitutes the runtime environment hosting the execution of
distributed applications. Aneka strongly relies on a Service Oriented Architecture and the Container is a
lightweight component providing basic node management features.

The stack of services that can be found in a common deployment of the Container. It is possible to
identify four major groups of services:
• Fabric Services
• Foundation Services
• Execution Services
• Transversal Services

Fabric Services

Fabric services define the lowest level of the software stack representing the Aneka Container. They
provide access to the resource provisioning subsystem and to the hardware of the hosting machine.
Resource provisioning services are in charge of dynamically providing new nodes on demand by relying
on virtualization technologies, while hardware profile services provide a platform independent interface
for collecting performance information and querying the properties of the host operating system and
hardware.

Foundation Services
Together with the fabric services the foundation services represent the core of the Aneka middleware on
top of which Container customization takes place. Foundation services constitute the pillars of the
Aneka middleware and are mostly concerned with providing runtime support for execution services and
applications.

Execution Services
Execution services identify the set of services that are directly involved in the execution of distributed
applications in the Aneka Cloud. The application model enforced by Aneka represents a distributed
application as a collection of jobs. For any specific programming model implemented in Aneka, at least
two components are required to provide execution support: the Scheduling Service and the Execution Service.
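The split between scheduling and execution can be illustrated with a short sketch in plain Python (this is not Aneka's actual .NET API, just an illustration of the idea): a scheduling component assigns the jobs of an application to worker nodes, and each worker's execution component runs the jobs it receives.

```python
from collections import deque

# Toy sketch (not Aneka's API): a scheduling service distributing the
# jobs of a distributed application across worker nodes in round-robin
# order; each worker would then execute its assigned jobs.
def schedule(jobs, workers):
    assignments = {w: [] for w in workers}
    queue = deque(jobs)
    i = 0
    while queue:
        worker = workers[i % len(workers)]  # next worker in turn
        assignments[worker].append(queue.popleft())
        i += 1
    return assignments

print(schedule(["job1", "job2", "job3"], ["worker-A", "worker-B"]))
# → {'worker-A': ['job1', 'job3'], 'worker-B': ['job2']}
```

Real schedulers weigh load, data locality, and SLA constraints rather than simple round-robin, but the division of labour between the two services is the same.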

Transversal Services
Aneka provides additional services that affect all the layers of the software stack implemented in the
Container. For this reason they are called transversal services, such as the persistence layer and the
security infrastructure.


Portability and Interoperability


Aneka is a Platform as a Service implementation of the Cloud Computing model and necessarily relies
on the existing virtual and physical infrastructure for providing its services. More specifically, being
developed on top of the Common Language Infrastructure, it requires an implementation of the ECMA
335 specification such as the .NET framework or Mono.

EUCALYPTUS
Eucalyptus (Elastic Utility Computing Architecture for Linking Your Programs To Useful Systems) is an
open-source software infrastructure for implementing on-premise clouds on existing IT and service-provider
infrastructure.

Fig: Eucalyptus

Eucalyptus has six components:

The Cloud Controller


It is a Java program that offers EC2-compatible interfaces, as well as a web interface to the outside
world. In addition to handling incoming requests, the CLC acts as the administrative interface for cloud
management and performs high-level resource scheduling and system accounting. The CLC accepts
user API requests from command-line interfaces like euca2ools or GUI-based tools like the Eucalyptus
User Console and manages the underlying compute, storage, and network resources.
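As a rough illustration of what "EC2-compatible interface" means, the sketch below builds the query string of an EC2-style API call using only the Python standard library. The endpoint address is a placeholder (Eucalyptus conventionally serves this API on port 8773 under /services/Eucalyptus), and real requests must additionally carry an authentication signature.

```python
from urllib.parse import urlencode

# Placeholder CLC address; the host name is hypothetical.
ENDPOINT = "http://clc.example.com:8773/services/Eucalyptus"

def build_query(action, **extra):
    # An EC2 Query API call is an HTTP request whose parameters name
    # the action to perform; real clients also sign the request.
    params = {"Action": action, "Version": "2010-08-31", **extra}
    return ENDPOINT + "?" + urlencode(sorted(params.items()))

print(build_query("DescribeInstances"))
# → http://clc.example.com:8773/services/Eucalyptus?Action=DescribeInstances&Version=2010-08-31
```

Tools such as euca2ools wrap exactly this kind of request behind command-line verbs, which is why AWS-compatible clients can talk to the CLC unchanged.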

The Cluster Controller


It is written in C and acts as the front end for a cluster within a Eucalyptus cloud and communicates
with the Storage Controller and Node Controller. It manages instance (i.e., virtual machines) execution
and Service Level Agreements (SLAs) per cluster.


The Storage Controller


It is written in Java and is the Eucalyptus equivalent to AWS EBS. It communicates with the Cluster
Controller and Node Controller and manages Eucalyptus block volumes and snapshots to the instances
within its specific cluster. If an instance requires writing persistent data to storage outside of the
cluster, it would need to write to Walrus, which is available to any instance in any cluster.

Fig: Eucalyptus component

The VMware Broker

It is an optional component that provides an AWS-compatible interface for VMware environments and
physically runs on the Cluster Controller. The VMware Broker overlays existing ESX/ESXi hosts and
transforms Eucalyptus Machine Images (EMIs) to VMware virtual disks. The VMware Broker mediates
interactions between the Cluster Controller and VMware and can connect directly to either ESX/ESXi
hosts or to vCenter Server.

The Node Controller

It is written in C and hosts the virtual machine instances and manages the virtual network endpoints. It
downloads and caches images from Walrus as well as creates and caches instances. While there is no
theoretical limit to the number of Node Controllers per cluster, performance limits do exist.

Walrus

Walrus, also written in Java, is the Eucalyptus equivalent to AWS Simple Storage Service (S3). It
offers persistent storage to all of the virtual machines in the Eucalyptus cloud and can be used as a
simple HTTP put/get storage as a service solution. There are no data type restrictions for Walrus, and it
can contain images.
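The put/get addressing model can be sketched as follows; the host name is a placeholder, and the port and path follow the conventional Eucalyptus layout. Every object is identified by a bucket and a key, and is uploaded or downloaded with plain HTTP verbs.

```python
# Sketch of Walrus's S3-style object addressing. The endpoint below is
# hypothetical; Walrus conventionally listens on port 8773 under
# /services/Walrus.
BASE = "http://clc.example.com:8773/services/Walrus"

def object_url(bucket, key):
    # Each object is addressed by bucket and key within the bucket.
    return f"{BASE}/{bucket}/{key}"

url = object_url("my-bucket", "images/disk.img")
print("PUT", url)   # upload: HTTP PUT of the object bytes
print("GET", url)   # download: HTTP GET of the same URL
```

This is why Walrus can serve both as an image store for EMIs and as a general put/get storage service for applications.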


KVM (Kernel-based Virtual Machine)

KVM is a system-virtualization solution that uses full virtualization to run VMs. It has a small code base,
since it was designed to leverage the facilities provided by hardware support for virtualization.

Architecture

In a normal Linux environment each process runs either in user-mode or in kernel-mode. KVM introduces a
third mode, the guest-mode. A process in guest-mode has its own kernel-mode and user-mode; thus, it is
able to run an operating system. Such processes represent the VMs running on a KVM host. From the
host's point of view, the modes are used as follows:

• user-mode: perform I/O when the guest needs to access devices

• kernel-mode: switch into guest-mode and handle exits due to I/O operations

• guest-mode: execute guest code, which is the guest OS except I/O

Resource management

The KVM developers aimed to reuse as much code as possible. For that reason they mainly modified the
Linux memory management, to allow mapping physical memory into the VM's address space. In modern
operating systems there are many more processes than CPUs available to run them. The scheduler of an
operating system computes an order in which each process is assigned to one of the available CPUs.

The KVM control interface

Once the KVM kernel module has been loaded, the /dev/kvm device node appears in the filesystem. This is a
special device node that represents the interface of KVM. It allows the hypervisor to be controlled through a
set of ioctls. These are commonly used in certain operating systems as an interface for processes running
in user-mode to communicate with a driver. The ioctl() system call allows the execution of several


operations to create new virtual machines, assign memory to a virtual machine, and assign and start
virtual CPUs.
KVM is designed as a kernel module; once loaded, it turns Linux into a VMM. Since the developers
didn't want to reinvent the wheel, KVM relies on the mechanisms of the kernel to schedule computing
power and benefits from the out-of-the-box driver support. But the memory management has been extended
to be capable of managing memory that is assigned to the address space of a VM.
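As an illustration, the ioctl-based interface can be probed from user-mode with a few lines of Python. The ioctl number below is the value of _IO(0xAE, 0x00) from the Linux UAPI headers; the sketch simply reports the KVM API version when /dev/kvm is present, and degrades gracefully when it is not.

```python
import fcntl
import os

# KVM_GET_API_VERSION is _IO(0xAE, 0x00) in the Linux UAPI headers:
# a "system" ioctl issued on /dev/kvm itself, taking no argument.
KVM_GET_API_VERSION = 0xAE00

def kvm_api_version(dev="/dev/kvm"):
    """Return the KVM API version, or None if KVM is unavailable."""
    if not os.path.exists(dev):
        return None
    try:
        fd = os.open(dev, os.O_RDWR)
    except OSError:
        # Device exists but we lack permission to open it.
        return None
    try:
        # The return value is the API version (12 on modern kernels).
        return fcntl.ioctl(fd, KVM_GET_API_VERSION, 0)
    finally:
        os.close(fd)

print("KVM API version:", kvm_api_version())
```

Further ioctls on /dev/kvm (e.g. creating a VM) return new file descriptors on which VM- and vCPU-level ioctls are then issued, which is exactly the layered control interface described above.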

Practical-9
Aim: Create a scenario in Aneka to configure a Master node and a Worker node, and run an
application on it.

Step:1 Start Aneka


Step: 2 Add new machine: File→Add Machine


Now create Aneka Cloud


Step: 3 Install Master Container: Installed Machine→Install Container

Step: 4 Security Configurations


Step: 5 Persistence Configurations

Step: 6 Cost and Software Appliances


Step:7 Failover Configuration

Step: 8 Service Configurations


Step: 9 Summary


Installing Worker Container

Step: 10 Install Worker Container


Step: 11 Add the repositories of the Worker container into the Master container

Step: 12 Container management: right-click on the installed Container→Log→Monitor


Step: 13 Start any Application


Aneka→Sample Applications→Mandelbrot→Run Application


Step: 14 View→Statistics


Practical-10

Aim: Installation and Configuration of Hadoop.

Theory:
• Hadoop-1.2.1 Installation Steps for Single-Node Cluster (On Ubuntu 12.04)

• Download and install VMware Player depending on your host OS (32-bit or 64-bit):
https://my.vmware.com/web/vmware/free#desktop_end_user_computing/vmware_player/6_0

• Download the .iso image file of Ubuntu 12.04 LTS (32-bit or 64-bit depending on
your requirements) http://www.ubuntu.com/download/desktop

• Install Ubuntu from the image in VMware. (For efficient use, configure the virtual machine to
have at least 2GB (4GB preferred) of RAM and at least 2 cores of processor.)

----------------JAVA INSTALLATION---------------

• sudo mkdir -p /usr/local/java

• cd ~/Downloads

• sudo cp -r jdk-8-linux-i586.tar.gz /usr/local/java

• sudo cp -r jre-8-linux-i586.tar.gz /usr/local/java

• cd /usr/local/java

• sudo tar xvzf jdk-8-linux-i586.tar.gz

• sudo tar xvzf jre-8-linux-i586.tar.gz

• ls -a

jdk1.8.0 jre1.8.0 jdk-8-linux-i586.tar.gz jre-8-linux-i586.tar.gz

• sudo gedit /etc/profile

• Add the following lines to /etc/profile:

JAVA_HOME=/usr/local/java/jdk1.8.0
PATH=$PATH:$HOME/bin:$JAVA_HOME/bin
JRE_HOME=/usr/local/java/jdk1.8.0/jre
PATH=$PATH:$HOME/bin:$JRE_HOME/bin
HADOOP_HOME=/home/hadoop/hadoop-1.2.1
PATH=$PATH:$HADOOP_HOME/bin
export JAVA_HOME
export JRE_HOME
export PATH

• sudo update-alternatives --install "/usr/bin/java" "java" "/usr/local/java/jdk1.8.0/jre/bin/java" 1


• sudo update-alternatives --install "/usr/bin/javac" "javac" "/usr/local/java/jdk1.8.0/bin/javac"

• sudo update-alternatives --install "/usr/bin/javaws" "javaws" "/usr/local/java/jdk1.8.0/bin/javaws" 1

• sudo update-alternatives --set java /usr/local/java/jdk1.8.0/jre/bin/java

• sudo update-alternatives --set javac /usr/local/java/jdk1.8.0/bin/javac

• sudo update-alternatives --set javaws /usr/local/java/jdk1.8.0/bin/javaws

• . /etc/profile

• java -version

java version "1.8.0"
Java(TM) SE Runtime Environment (build 1.8.0-b132)
Java HotSpot(TM) Client VM (build 25.0-b70, mixed mode)

---------------------HADOOP INSTALLATION------------------

• open Home

• create a folder hadoop

• copy from downloads hadoop-1.2.1.tar.gz to hadoop

• right click on hadoop-1.2.1.tar.gz and Extract Here

• cd hadoop/

• ls -a

. .. hadoop-1.2.1 hadoop-1.2.1.tar.gz

• edit the file conf/hadoop-env.sh and set:

# The java implementation to use. Required.
export JAVA_HOME=/usr/local/java/jdk1.8.0

• cd hadoop-1.2.1

------------------STANDALONE OPERATION----------------


• mkdir input

• cp conf/*.xml input

• bin/hadoop jar hadoop-examples-*.jar grep input output 'dfs[a-z.]+'

• cat output/*
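The example job above searches the copied configuration files for matches of the regular expression dfs[a-z.]+ ("dfs" followed by lowercase letters or dots, i.e. HDFS property names). The pattern can be tried locally before running the job:

```python
import re

# Same pattern the Hadoop grep example uses on the input files.
pattern = re.compile(r"dfs[a-z.]+")

sample = "<name>dfs.replication</name><value>1</value>"
print(pattern.findall(sample))  # → ['dfs.replication']
```

The output directory produced by the job then contains each matched string with its occurrence count.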

----------------PSEUDO DISTRIBUTED OPERATION (WORDCOUNT)----------------

• conf/core-site.xml:

<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>

• conf/hdfs-site.xml:

<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>

• conf/mapred-site.xml:

<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>localhost:9001</value>
  </property>
</configuration>
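Before submitting a WordCount job to the pseudo-distributed cluster, the map and reduce logic it performs can be sanity-checked locally with a small sketch (plain Python, not the Hadoop Java API): the map phase emits (word, 1) pairs and the reduce phase sums the counts per word.

```python
from collections import Counter

# Local sketch of WordCount's two phases (not Hadoop's API).
def map_phase(lines):
    # Map: emit a (word, 1) pair for every word in every input line.
    for line in lines:
        for word in line.split():
            yield word, 1

def reduce_phase(pairs):
    # Reduce: sum the counts emitted for each word.
    counts = Counter()
    for word, one in pairs:
        counts[word] += one
    return dict(counts)

text = ["hello hadoop", "hello cloud"]
print(reduce_phase(map_phase(text)))
# → {'hello': 2, 'hadoop': 1, 'cloud': 1}
```

On the cluster, Hadoop runs the same two phases in parallel across nodes, shuffling all pairs with the same key to one reducer.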

• ssh localhost

• ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa

• cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys

• bin/hadoop namenode -format


• bin/start-all.sh

Run the following command to verify that the Hadoop services are running:

$ jps

If everything was successful, you should see the following services running:

2583 DataNode
2970 JobTracker
3461 Jps
3177 TaskTracker
2361 NameNode
2840 SecondaryNameNode
