
Chapter 7. Monitoring, Provisioning and Administration Patterns
Usage Monitoring
Pay-as-You-Go
Realtime Resource Availability
Rapid Provisioning
Platform Provisioning
Bare-Metal Provisioning
Automated Administration
Centralized Remote Administration
Resource Management
Self-Provisioning
Power Consumption Reduction
A primary goal for cloud providers is to deliver resources that are not only affordable but also easy to use for organizational computing requirements. The patterns in this chapter are primarily provided in support of that goal.
Monitoring and situational awareness are enabled by patterns such as Usage Monitoring (285), Pay-as-You-Go (288), and Realtime Resource Availability (292), which support cloud consumers with critical SLA assessment and verification capabilities.
Automated Administration (310) exemplifies how the cloud provider can automate the fulfillment of provisioning requirements on-demand. This pattern can be combined with Centralized Remote Administration (315) and Rapid Provisioning (295) in support of Platform Provisioning (301) to relieve cloud consumers of the burden of implementing the underlying infrastructure of their cloud environments.

USAGE MONITORING

How can IT resource usage be measured?

Problem
When making IT resources available for access and shared usage by multiple
cloud consumers, the manner in which actual usage occurs can be highly
unpredictable. IT resources may be subject to high usage volumes by individual
cloud consumers performing a large amount of runtime processing or high
volumes of cloud service consumers concurrently accessing the virtualized
instances of the IT resources. Either way, an effectively unlimited range of runtime usage scenarios can develop, leading to possible runtime exception conditions, security breaches, and other types of runtime failure.
Furthermore, for IT resources and cloud services to be commercialized in
support of the Pay-as-You-Go (288) pattern, the cloud architecture needs to
support the ability for runtime usage to be accurately measured.

Solution
IT artifacts and systems capable of monitoring, collecting and processing usage
data and metrics are incorporated into the cloud architecture to enable the
inherent measured usage characteristic of cloud environments, and to further
offer a range of specialized usage monitoring and data collection functions
(Figure 7.1).

Figure 7.1 A usage monitor measures IT resource use and collects corresponding usage data that
is stored and made available for reporting purposes.

Application
This pattern is fundamentally applied via the use of the cloud usage monitor
mechanism. This broad, infrastructure-level mechanism encompasses a variety
of specialized monitoring-based mechanisms that fulfill different forms of usage
monitoring requirements and can be implemented as a monitoring agent,
resource agent, or polling agent.
Regardless of which type of cloud usage monitor is used, there are common
components that can accompany the implementation of a monitoring IT
resource:
Usage Monitoring Station A system that the cloud usage monitor directly
communicates with and to which it may transmit collected usage data.
Usage Database A repository used to store usage data received by usage
monitoring stations or directly by cloud usage monitors.

Data Saver A middleware component used to save and update collected usage data.
Usage Reporter A middleware component used to retrieve usage data from
the usage database and present it in human-readable reports. The usage reporter
is generally integrated with a usage and administration portal.
Custom Reporter A tool used to design custom usage reports.
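To make the interplay of these components concrete, the following sketch wires a usage monitor, monitoring station, usage database, and usage reporter together. The class names, method signatures, and data layout are illustrative assumptions, not an actual cloud platform API.

```python
import time

class UsageDatabase:
    """Repository that stores usage records received from monitoring stations."""
    def __init__(self):
        self.records = []

    def save(self, record):
        self.records.append(record)

class UsageMonitoringStation:
    """Receives collected usage data from monitors and persists it."""
    def __init__(self, database):
        self.database = database

    def receive(self, record):
        self.database.save(record)

class UsageMonitor:
    """Measures IT resource use and transmits usage data to a station."""
    def __init__(self, resource_id, station):
        self.resource_id = resource_id
        self.station = station

    def record_usage(self, metric, value):
        self.station.receive({
            "resource": self.resource_id,
            "metric": metric,
            "value": value,
            "timestamp": time.time(),
        })

class UsageReporter:
    """Retrieves usage data and presents it as a human-readable report."""
    def __init__(self, database):
        self.database = database

    def report(self, resource_id):
        lines = [f"Usage report for {resource_id}:"]
        for r in self.database.records:
            if r["resource"] == resource_id:
                lines.append(f"  {r['metric']} = {r['value']}")
        return "\n".join(lines)

# Example: a monitor attached to a virtual server records CPU usage.
db = UsageDatabase()
station = UsageMonitoringStation(db)
monitor = UsageMonitor("vm-01", station)
monitor.record_usage("cpu_seconds", 120)
monitor.record_usage("cpu_seconds", 95)
print(UsageReporter(db).report("vm-01"))
```

In practice the station and database would be separate infrastructure services, but the division of responsibilities follows the component list above.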

Mechanisms
Audit Monitor This mechanism is used when auditing-related usage monitoring is required.
Automated Scaling Listener This mechanism is used when monitoring pertaining to dynamic scaling is required.
Cloud Usage Monitor This mechanism represents a range of specialized
monitoring programs and agents that can fulfill specialized applications of this
pattern.
Load Balancer This monitor appraises runtime workload usage prior to
carrying out load balancing algorithms.
Pay-Per-Use Monitor This mechanism is used when billing-related usage
monitoring is required.
SLA Monitor This mechanism is used when quality-of-service and other
SLA-related usage monitoring is required.

PAY-AS-YOU-GO

How can a cloud consumer be billed accurately for the actual amount of its IT
resource usage?

Problem
When purchasing an IT resource, such as a physical server, the total cost of
purchase and subsequent ownership may not correspond with the return on
investment (ROI) of the server's actual runtime usage. Similarly, when leasing
an IT resource for a fixed fee (or when leasing coarse portions of an IT resource
at fixed fees), the amount of actual usage of the IT resource may not correspond
to the capacity for which fees were charged (Figure 7.2).

Figure 7.2 A cloud consumer that leases and pays fixed fees for two entire virtual servers may
not actually use their entire processing capacity.

Solution
A cloud architecture is established that is capable of collecting actual cloud
consumer usage data and providing it to a management system used to process
and report actual cloud consumer usage data for billing and chargeback
purposes.

Application
This pattern is applied together with the Usage Monitoring (285) pattern to
establish the use of the pay-per-use monitor mechanism as the primary
component responsible for collecting and storing billing-related usage data at
runtime. Also implemented by this pattern is the billing management system
mechanism that processes, reports on, and generates billing information and
documents based on the collected usage data.
The following steps are shown in Figure 7.3:
1. The cloud consumer accesses the cloud service via the usage and
administration portal.
2. The pay-per-use monitor logs the usage data.

3. The pay-per-use monitor sends usage data to a usage monitoring station.
4. The data from the pay-per-use monitor is normalized and saved to a usage database.

Figure 7.3 A basic cloud architecture resulting from the application of the Pay-as-You-Go
pattern.

A human-readable report of realtime usage is published on the usage and administration portal for the cloud consumer to view.
Related components that can also comprise this cloud architecture include:
Data Source Loader A program that collects data from a usage database and
delivers it to the chargeback calculation engine for processing.
Chargeback Calculation Engine After retrieving the required data, this engine generates chargeback or billing documents based on the cloud provider's pricing metrics.
Chargeback Database After determining the charges for usage, this
information is stored in a database for future use and reporting.
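The billing flow through these components can be illustrated as follows. The data structures, component names, and pricing values are hypothetical, intended only to show how usage records and pricing metrics combine into a chargeback document.

```python
# Illustrative chargeback calculation: usage records are loaded from a
# usage database, priced against the provider's metrics, and the results
# are stored in a chargeback database. All names and prices are invented.

usage_database = [
    {"consumer": "org-a", "metric": "vm_hours", "quantity": 150},
    {"consumer": "org-a", "metric": "gb_stored", "quantity": 40},
    {"consumer": "org-b", "metric": "vm_hours", "quantity": 30},
]

pricing_metrics = {"vm_hours": 0.10, "gb_stored": 0.05}  # price per unit

def data_source_loader(consumer):
    """Collect a consumer's records from the usage database."""
    return [r for r in usage_database if r["consumer"] == consumer]

def chargeback_calculation_engine(records):
    """Generate a billing document from usage records and pricing metrics."""
    lines = [{"metric": r["metric"],
              "quantity": r["quantity"],
              "charge": round(r["quantity"] * pricing_metrics[r["metric"]], 2)}
             for r in records]
    return {"lines": lines, "total": round(sum(l["charge"] for l in lines), 2)}

chargeback_database = {}  # stores generated billing documents for reporting

for consumer in ("org-a", "org-b"):
    chargeback_database[consumer] = chargeback_calculation_engine(
        data_source_loader(consumer))

print(chargeback_database["org-a"]["total"])  # 150*0.10 + 40*0.05 = 17.0
```

A real billing management system would add billing periods, currencies, and tiered pricing, but the loader-engine-store pipeline matches the component roles above.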

Note that a given billing management system may include some or all of these components. Figure 7.4 shows a cloud architecture comprised of pay-per-usage and billing components.

Figure 7.4 An example of a cloud architecture comprised of pay-per-usage and billing components.

Mechanisms
Billing Management System This fundamental mechanism is responsible for
calculating recurring service costs in accordance with the goals of this pattern.
Cloud Usage Monitor Specialized cloud usage monitors may collect IT
resource usage information that is stored in a usage database that may be used
by the billing management system for some of its calculations.
Pay-Per-Use Monitor This monitor is core to the application of the Pay-as-You-Go pattern in that it is responsible for collecting runtime IT resource usage data used by the billing management system.

REALTIME RESOURCE AVAILABILITY

How can cloud consumers access current availability status information for IT
resources?

Problem
An SLA includes various metrics to define service quality guarantees, a primary
one of which is service availability. For a cloud consumer to be able to check on
and assess the availability of a cloud service or IT resource, it needs to be able
to receive up-to-date availability information on-demand. Most management
systems used in clouds provide tools for generating usage and status reports
after data is collected and stored and then subsequently requested by cloud
consumers or cloud providers. The final step is for the data to be presented and
rendered in a report. Because of the time it takes to complete these steps, the
cloud consumer is given only historical availability data.

Solution
The system established by this pattern is similar to conventional usage data
collecting and reporting architectures in that it consists of a usage monitor that
collects the availability data and sends it to a monitoring station for storage.
What distinguishes this system is the use of an availability reporter component

that is capable of instantly retrieving and streaming the availability data so that
it can be sent, on an on-going basis, to a front-end for viewing.

Application
This pattern is commonly applied together with the Centralized Remote
Administration (315) pattern, as the usage and administration portal is generally
the most convenient location for the streamed availability data to be displayed.
The following steps are shown in Figure 7.5:
1. A specialized monitor (not shown) collects and stores availability data in a
dedicated database as part of a monitoring station.
2. The availability reporter instantly extracts the availability data from the monitoring station and streams it to the usage and administration portal.
3. The cloud consumer can view the realtime stream of availability report data
via the usage and administration portal.

Figure 7.5 A cloud architecture resulting from the application of the Realtime Resource
Availability pattern.

In the absence of a usage and administration portal, a separate service availability portal can be created, dedicated to the display of IT resource availability and status data. This type of system can also involve the use of a
availability and status data. This type of system can also involve the use of a
report access manager to manage the list of authorized IT resources for which a
given cloud consumer can view availability and status data.
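A minimal sketch of the distinguishing behavior, streaming current availability data rather than compiling historical reports, might look like this. All names are illustrative assumptions; a production system would push updates to the portal on an on-going basis rather than snapshot on request.

```python
# Hypothetical sketch of the Realtime Resource Availability components:
# a monitoring station holds the latest availability sample per resource,
# and the availability reporter streams that current data to a front-end.

class MonitoringStation:
    """Stores the latest availability sample for each IT resource."""
    def __init__(self):
        self.latest = {}

    def store(self, resource_id, available):
        self.latest[resource_id] = available

class AvailabilityReporter:
    """Retrieves current availability data and streams it for viewing."""
    def __init__(self, station):
        self.station = station

    def stream(self):
        # Yields the current snapshot; contrast with a conventional
        # reporter, which would only render historical data on request.
        for resource_id, available in self.station.latest.items():
            yield resource_id, "UP" if available else "DOWN"

station = MonitoringStation()
station.store("vm-01", True)
station.store("vm-02", False)

portal_view = dict(AvailabilityReporter(station).stream())
print(portal_view)  # {'vm-01': 'UP', 'vm-02': 'DOWN'}
```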

Mechanisms
Audit Monitor The audit monitor mechanism is related to this pattern in how
it audits realtime IT resource availability data and the publication of the
availability reports themselves.

Cloud Usage Monitor Specialized cloud usage monitors may track runtime
usage data relevant to the reporting of IT resource availability information.
SLA Management System The SLA availability guarantees inputted and
managed by this system directly relate to the availability reporting produced by
the application of this pattern. The SLA availability metrics and values may be
displayed on a usage and administration portal alongside the realtime
availability data.
SLA Monitor The SLA monitor mechanism is responsible for collecting the
uptime and availability information of IT resources.

RAPID PROVISIONING

How can the provisioning of IT resources be automated and made available to cloud consumers on-demand?

Problem
A conventional provisioning process can involve a number of tasks that are
traditionally completed manually by administrators and technology experts that
prepare the requested IT resources as per pre-packaged specifications or as per
custom client requests. In cloud environments, where higher volumes of
customers are serviced and where the average customer requests higher volumes of IT resources, manual provisioning processes are inadequate and can even lead to unreasonable risk due to uncompetitive response times and human error.
For example, consider a cloud consumer that requests 25 Windows servers be
installed, configured and updated, along with some applications. Half of the
applications are to be identical installations while the other half need to be
customized. In this scenario, each deployment of the operating system can take
30 minutes, followed by additional time and effort to apply necessary security
patches and operating system updates (several of which may require server
reboots). Finally, the applications need to be deployed and configured. A
manual or semi-automated approach to this project will require an extended
amount of time and will introduce a reasonable chance of human error
contributing to mistakes in one or more of the new server installations.

Solution
A sophisticated system is introduced to enable the automation of the
provisioning of a wide range of IT resources, individually or together. The
system relies on an automated provisioning program and a rapid provisioning engine, along with scripts and templates, to allow for IT resources to be provisioned on-demand at the time the cloud consumer requests them via a self-service portal.

Application
The application of this pattern can vary, depending on the types of IT resources
that need to be rapidly provisioned. A multitude of individual components are
available to coordinate and automate various aspects of IT resource
provisioning. The assembly of these components comprises a large part of the
resulting cloud architecture.
Components that can comprise the system include:
Server Templates Templates of virtual image files used for automating the
instantiation of new virtual servers.
Server Images Similar to server templates, but used for provisioning
physical servers instead.
Application Packages Collections of applications and other software packaged for automated deployment.
Application Packager The software used to create application packages.
Custom Scripts Scripts that automate administrative tasks, as part of an
intelligent automation engine.
Sequence Manager A program used to organize sequences of automated
provisioning tasks.

Sequence Logger A component that logs the execution of automated provisioning task sequences.
Operating System Baseline A configuration template applied after the
operating system is installed to quickly prepare it for usage.
Application Configuration Baseline A configuration template with settings
and environment parameters needed to prepare new applications for usage.
Deployment Data Store The repository that stores virtual images, templates,
scripts, baseline configurations, and other related data.
The system produced by the application of this pattern is typically further
integrated with the self-service portal resulting from the Self-Provisioning (324)
pattern as well as various scripts and the use of the intelligent automation
engine, as part of the application of the Automated Administration (310)
pattern.
The various artifacts used to establish the provisioning systems are typically
stored within a deployment repository supplied by the cloud provider, as shown
in Figure 7.6. Figure 7.7 provides a sample cloud architecture resulting from the
application of Rapid Provisioning.

Figure 7.6 The cloud provider creates a deployment repository that stores system components.

Figure 7.7 A sample cloud architecture resulting from the application of the Rapid Provisioning
pattern.

The preceding example is significantly simplified. The following step-by-step descriptions provide better insight into the mechanics behind a typical rapid provisioning engine. This scenario involves a number of the previously listed system components.
1. A cloud consumer requests a new server through the self-service portal.
2. The sequence manager forwards the request to the deployment engine for an
operating system to be prepared.
3. If the request is for building a virtual server, then the deployment engine uses
the virtual server templates for provisioning. Otherwise, the deployment engine
sends the request to provision a physical server.
4. If a pre-defined image already exists for the requested type of operating system, it is used to provision the operating system. Otherwise, the regular deployment process is followed to install the operating system.
5. When the operating system is ready, the deployment engine informs the
sequence manager.
6. The sequence manager updates the logs and sends them to the sequence
logger for storage.
7. The sequence manager requests that the deployment engine apply the
operating system baseline to the provisioned operating system.

8. The deployment engine applies the requested operating system baseline.
9. The deployment engine informs the sequence manager that the operating
system baseline is applied.
10. The sequence manager updates and sends the logs of past steps to the
sequence logger for storage.
11. The sequence manager requests that the deployment engine install the applications. (There may be more than one application that the sequence manager provides in its list.)
12. The deployment engine deploys the applications on the provisioned server.
13. The deployment engine informs the sequence manager that the applications
have been installed.
14. The sequence manager updates and sends the logs of past steps to the
sequence logger for storage.
15. The sequence manager requests that the deployment engine apply the application configuration baseline.
16. The deployment engine applies the application configuration baseline.
17. The deployment engine informs the sequence manager that the application
configuration has been applied.
18. The sequence manager updates and sends the logs of past steps to the
sequence logger for storage.
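The exchange between the sequence manager, deployment engine, and sequence logger in steps 1 through 18 can be condensed into a short sketch. The component names mirror the pattern's vocabulary, but the implementation details are hypothetical.

```python
class SequenceLogger:
    """Stores the logs of completed provisioning task sequences."""
    def __init__(self):
        self.entries = []

    def store(self, message):
        self.entries.append(message)

class DeploymentEngine:
    """Performs the individual provisioning tasks on request."""
    def provision_os(self, server_type, os_name):
        # Virtual servers use server templates; physical servers use images.
        template = ("virtual server template" if server_type == "virtual"
                    else "physical server image")
        return f"{os_name} provisioned from {template}"

    def apply_baseline(self, baseline):
        return f"{baseline} applied"

    def deploy_applications(self, apps):
        return f"installed: {', '.join(apps)}"

class SequenceManager:
    """Organizes the automated provisioning tasks into a sequence."""
    def __init__(self, engine, logger):
        self.engine = engine
        self.logger = logger

    def provision_server(self, server_type, os_name, apps):
        for step in (
            self.engine.provision_os(server_type, os_name),
            self.engine.apply_baseline("operating system baseline"),
            self.engine.deploy_applications(apps),
            self.engine.apply_baseline("application configuration baseline"),
        ):
            self.logger.store(step)  # logs are updated after each task
        return "server ready"

logger = SequenceLogger()
manager = SequenceManager(DeploymentEngine(), logger)
result = manager.provision_server("virtual", "Windows Server",
                                  ["app-a", "app-b"])
print(result)               # server ready
print(len(logger.entries))  # 4 logged task completions
```

The loop compresses the repeated request/confirm/log exchanges of the step-by-step description; a real engine would run each task asynchronously and handle failures between steps.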

Mechanisms
Cloud Storage Device The cloud storage device provides the storage space
that is needed to host and provision IT resources, in addition to application
baseline information, templates, and scripts.
Hypervisor This mechanism is used to rapidly create, deploy, and host the
virtual servers.
Resource Replication The resource replication mechanism is related to this
pattern in how it is used to generate replicated instances of IT resources in
response to rapid provisioning requirements.
Virtual Server The virtual server may be provisioned or may host
provisioned IT resources.

PLATFORM PROVISIONING

How can cloud consumers build and deploy cloud solutions without the burden
of having to create and manage the underlying infrastructure?

Problem
Even though leasing IT resources offers economic benefits over purchasing and owning the same IT resources on-premise, organizations often see little benefit in maintaining the on-staff administrative expertise and assuming the overall responsibilities that come with setting up, configuring, and carrying out the on-going maintenance of raw, leased IT resources, such as those provided by IaaS platforms.

Solution
A provisioning system is established to deliver ready-made environment
instances (stored as virtual machines) on-demand. Different packages of IT
resources can be bundled into individual ready-made environments, enabling
cloud providers to offer pre-defined and customized PaaS products.

Application
This pattern focuses specifically on the automated provisioning of the ready-made environment mechanism, and typically relies on the application of the Automated Administration (310) and Rapid Provisioning (295) patterns to establish a system capable of dynamically provisioning auto-deployment packages on-demand.
Each package is prepared with a ready-made environment that includes a base
operating system and can be further equipped with pre-configured applications,
databases, development tools, and other IT resources. The intelligent
automation engine is utilized to carry out the auto-deployment via customized
scripts. Each variation of offered PaaS services can be published in a service
catalog accessible via the self-service portal implemented as a result of applying
the Self-Provisioning (324) pattern.
The following steps are shown in Figure 7.8:
1. A cloud consumer logs into a self-service portal and requests the creation of a
new ready-made environment.
2. The self-service portal forwards the request to the automated service provisioning program.
3. The requested platform is located.
3.1. The cloud consumer requests customization to the platform.
3.2. The platform is customized.
4. After several minutes, the platform is provisioned and is made available for
the cloud consumer on the usage and administration portal.
4.1. The customized platform is provisioned and made available on the usage
and administration portal for the cloud consumer.

Figure 7.8 An example of the cloud architecture resulting from the application of the Platform
Provisioning pattern.
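The catalog-driven provisioning described above can be sketched as follows. The package names, bundled resources, and function signature are invented for illustration; each catalog entry stands for a ready-made environment published as a PaaS product.

```python
# Hypothetical PaaS service catalog: each entry bundles a ready-made
# environment (base operating system plus pre-configured IT resources)
# that the provisioning system can instantiate on-demand.

service_catalog = {
    "web-dev": {"os": "Linux", "resources": ["web server", "IDE"]},
    "data-platform": {"os": "Linux",
                      "resources": ["database", "analytics tools"]},
}

def provision_platform(package_name, customizations=None):
    """Instantiate a ready-made environment from a catalog package,
    optionally applying consumer-requested customizations (steps 3.1/3.2)."""
    package = service_catalog[package_name]
    environment = {"os": package["os"],
                   "resources": list(package["resources"]),
                   "status": "available"}   # made available on the portal
    if customizations:
        environment["resources"].extend(customizations)
    return environment

env = provision_platform("web-dev", customizations=["message queue"])
print(env["resources"])  # ['web server', 'IDE', 'message queue']
```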

Mechanisms
Hypervisor The hypervisor is responsible for hosting the virtual server,
which hosts the development environments or platforms that are provided to
cloud consumers.

Ready-Made Environment Ready-made environments are the primary platforms provisioned by the system established by this pattern.
Resource Management System This mechanism supplies cloud consumers
with the tools and options they need to manage provisioned platforms.
Resource Replication The resource replication mechanism is used to
replicate the requested platforms (usually from the pre-defined platform
templates).
Virtual Server This mechanism is used to host the provisioned platforms.

BARE-METAL PROVISIONING

How can operating systems be remotely deployed on bare-metal servers?

Problem
The remote provisioning of servers is common because remote management software is generally a native component of a server's operating system. However, bare-metal servers do not have pre-installed operating systems (or any other software), meaning access to conventional remote management programs is unavailable.

Solution
Most contemporary servers provide the option for remote management support
to be pre-installed in the server's ROM. Some vendors offer this feature only
through an expansion card, while others have the required components already
integrated into the chipset. A bare-metal provisioning system can be designed to
utilize this feature with specialized service agents that can be used to discover
and effectively provision entire operating systems remotely.

Application
The remote management software that is integrated with the server's ROM
becomes available upon server start-up. A Web-based or proprietary user
interface, like the portal provided by the remote administration system
mechanism, is usually used to connect to the server's native remote
management interface. The IP address of the remote management interface can
be configured manually, through the default IP, or alternatively set through the
configuration of a DHCP service. IP addresses in IaaS platforms can be
forwarded directly to cloud consumers so that they can perform bare-metal
operating system installations independently.
Although remote management software is used to enable connections to server
consoles and for the deployment of operating systems, it raises two concerns:
Manual deployment on multiple servers can be vulnerable to inadvertent
human and configuration errors.
Remote management software can be time-intensive and require significant
runtime IT resource processing.
The bare-metal provisioning system addresses these issues via the use of the
following components:
Discovery Agent A type of monitoring agent that searches and finds
available servers that are then assigned to cloud consumers.
Deployment Agent A management agent that is installed into a physical server's memory to be positioned as a client for the bare-metal provisioning deployment engine.
Discovery Section A software component that scans the network and locates
available servers with which to connect.
Management Loader The component responsible for connecting to the
server and loading the management options for the cloud consumer.
Deployment Component The feature responsible for installing the operating
system on the selected servers.

The bare-metal provisioning system further provides an auto-deployment feature that allows cloud consumers to connect to the deployment software and provision more than one server or operating system at the same time.
The deployment software connects to the servers via their management interfaces, and uses the same protocol to upload and operate as an agent in the physical server's RAM, after which the bare-metal server becomes a raw client with a management agent installed. The deployment software then uploads the required setup files to deploy the operating system.
Deployment images, operating system deployment automation, or unattended
deployment and post installation configuration scripts can be used via the
intelligent automation engine mechanism and the self-service portal to further
extend this functionality.
The following steps are shown in Figures 7.9 and 7.10:
1. The cloud consumer connects to the deployment solution.
2. The cloud consumer uses the deployment solution to perform a search by
using the discovery agent.
3. The available physical servers are shown to the cloud consumer, who selects
the target server for usage.
4. The deployment agent is loaded to the physical server's RAM via the remote management system mechanism.
5. The cloud consumer selects an operating system and method of configuration
via the deployment solution.
6. The operating system is installed and the server is operational.
7. The status of the new server is reported to the VIM.

Figure 7.9 A sample cloud architecture resulting from the application of the Bare-Metal
Provisioning pattern (Part I).

Figure 7.10 A sample cloud architecture resulting from the application of the Bare-Metal
Provisioning pattern (Part II).
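The seven steps can be condensed into a sketch like the one below. The class names and the simulated remote management interaction are assumptions made for illustration; real systems rely on firmware-level protocols such as IPMI for the out-of-band access.

```python
# Hypothetical sketch of bare-metal provisioning: a discovery agent finds
# raw servers, the deployment agent is loaded into a server's RAM via its
# ROM-based remote management interface, and the OS is then installed.

class BareMetalServer:
    def __init__(self, name):
        self.name = name
        self.ram = []   # the deployment agent is loaded here
        self.os = None  # bare-metal: no pre-installed operating system

class DiscoveryAgent:
    """Scans the network and locates available bare-metal servers."""
    def discover(self, network):
        return [s for s in network if s.os is None]

class DeploymentSolution:
    def load_deployment_agent(self, server):
        # Uploaded via the server's remote management interface (step 4).
        server.ram.append("deployment agent")

    def install_os(self, server, os_name):
        if "deployment agent" not in server.ram:
            raise RuntimeError("deployment agent not loaded")
        server.os = os_name
        return f"{server.name}: {os_name} operational"

network = [BareMetalServer("bm-01"), BareMetalServer("bm-02")]
solution = DeploymentSolution()

available = DiscoveryAgent().discover(network)  # steps 2-3
target = available[0]                           # consumer selects a server
solution.load_deployment_agent(target)          # step 4
status = solution.install_os(target, "Linux")   # steps 5-6
print(status)  # bm-01: Linux operational
```

Step 7, reporting the new server's status to the VIM, is omitted here; it would be a final notification call once `install_os` succeeds.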

Mechanisms
Cloud Storage Device This mechanism is used to store operating system
templates and installation files, as well as deployment agents and deployment
packages for the provisioning system.
Hypervisor This pattern can be used to deploy a hypervisor on a physical
server as part of operating system deployments.
Logical Network Perimeter This mechanism is used by the provisioning
system to ensure that raw physical servers can only be accessed by the
appropriate cloud consumers.
Resource Management System This mechanism is pivotal to the application of this pattern in that it interacts with the deployment agent to load the physical server's RAM.
Resource Replication This mechanism is implemented to replicate IT
resources by deploying a new hypervisor on a physical box in order to balance
the hypervisor workload during or subsequent to provisioning.

SLA Management System The SLA management system mechanism ensures that the availability of physical bare-metal servers is in accordance with pre-defined SLA stipulations.

AUTOMATED ADMINISTRATION

How can common administrative tasks be carried out consistently and automatically in response to pre-defined events?

Problem
There are numerous administrative and maintenance tasks that need to be
performed on physical servers, virtual servers, and other IT resources. By
default, many of these tasks are performed manually by humans.
Frequently recurring circumstances at times necessitate that these tasks be executed immediately and on-demand. However, performing certain types of administrative tasks manually is impractical and inefficient due to the potential for human error and the synchronization required to carry out the same task simultaneously across different platforms.

Solution
An automation system that supports multiple connectivity options is created to
run commands and scripts on diverse platforms (Figure 7.11). Different scripts need to be integrated together to run in a common workflow that uses extra extensions. This engine may also generate reports on each separate step of the workflow.

Figure 7.11 The cloud resource administrator defines the workflow logic (1) and expresses it in a
series of scripts that is incorporated into an intelligent automation engine repository (2). The
cloud resource administrator then selects the workflow, the systems it will run on, and its
execution schedule (3). The intelligent automation engine runs the workflow and reports the
results (4).

Application
An automation system, referred to as an intelligent automation engine, is
implemented as a workflow management application that is capable of
executing various scripts. The workflow logic is expressed in scripts via
sequenced steps that are in a pre-determined order with conditional logic.
Conditions pertaining to environmental factors can be defined so that additional
scripts and logic can be automatically triggered when environmental parameters
change.
The intelligent automation engine includes a repository that is used to store
artifacts, such as workflow scripts, log files, and connectivity configurations, as
well as a user interface that allows for the creation and editing of scripting
templates. The engine may further support connections to other system monitors
to integrate monitoring data with script execution.
Intelligent automation engines support a range of common connection methods,
such as SSH, RDP, and RCMD, in addition to various authentication methods.
Other templates are supplied so that different connection methods can be more
easily used.

The following steps are shown in Figure 7.12:
1. The cloud resource administrator defines the workflow logic.
2. Script execution schedule times can be added while the workflow logic is
being created or at a later point.
3. Existing scripts can be reused and added to the current workflow.
4. Access to the scripts is protected to ensure that they can only be run by
authorized clients.
5. The scripts are ready for use.
6. The intelligent automation engine saves the scripts in its repository.
7. Security credentials for accessing and executing each script can be added.
8. The scripts can be used by the automated service provisioning programs.
9. The scripts are published via the self-service portal and the usage and
administration portal for access and usage by cloud consumers.

Figure 7.12 An overview of how the components can be assembled as a result of the application
of this pattern.

Figure 7.13 depicts sample workflow logic that can be programmed in a script.

Figure 7.13 This scenario depicts a physical server that needs patching, which is a routine task
and a prime candidate for automation. The physical server is part of a cluster, so the script needs
to ensure that the physical server is properly taken offline and monitoring is disabled before
initiating the patching process.

There are circumstances in the patching workflow shown in Figure 7.13 that
will test the ability of the intelligent automation engine to make logical
decisions. For example, the script will need to be programmed with responses to
the following scenarios:
the patch is installed successfully or unsuccessfully
a reboot is required (if the reboot is successful, the engine must have a way to
detect this, and if the reboot is unsuccessful, the engine must log the error)
after the patch is completed, the physical server's status needs to be changed to online and the server brought back into the cluster
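These decision points can be expressed as script logic. The following is a hypothetical rendering of the Figure 7.13 patching workflow, with the patch and reboot operations stubbed out as callables supplied by the engine.

```python
# Hypothetical patching workflow script: take the clustered server offline,
# disable monitoring, apply the patch, handle the reboot outcome, and
# return the server to the cluster.

def apply_patch_workflow(server, install_patch, reboot):
    """install_patch and reboot are callables returning True on success."""
    log = []
    server["in_cluster"] = False   # take the server out of the cluster
    server["monitoring"] = False   # disable monitoring before patching
    if not install_patch(server):
        log.append("patch failed")  # engine logs the unsuccessful install
        return server, log
    log.append("patch installed")
    if server.get("reboot_required"):
        if reboot(server):
            log.append("reboot successful")
        else:
            log.append("reboot failed")  # engine must log the error
            return server, log
    server["status"] = "online"    # change status back to online
    server["in_cluster"] = True    # bring the server back into the cluster
    server["monitoring"] = True
    log.append("server returned to cluster")
    return server, log

server = {"status": "online", "in_cluster": True, "monitoring": True,
          "reboot_required": True}
server, log = apply_patch_workflow(
    server,
    install_patch=lambda s: True,  # simulate a successful patch
    reboot=lambda s: True)         # simulate a successful reboot
print(log)
```

Each early `return` corresponds to an error branch the intelligent automation engine must detect and log rather than leaving the server stranded outside the cluster.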
Scripted workflows can at times require an extended period of time to complete,
which makes handling error conditions more difficult. Additional challenges
that arise when applying this pattern pertain to integrating scripts across
different platforms and systems.

Mechanisms
Automated Scaling Listener The automated scaling listener notifies the
intelligent automation engine when scaling of an IT resource is required.
Cloud Storage Device This mechanism can be used to store data related to
the intelligent automation engine, such as workflow logic and custom scripts.
Cloud Usage Monitor This mechanism is associated with the Automated
Administration pattern for two reasons, the first being that the automated
scaling listener is a variant of the broader infrastructure-level cloud usage
monitor mechanism. A second reason is that the intelligent automation engine
runs workflows that can scale and release current IT resources according to
cloud consumer usage demand.
Hypervisor The intelligent automation engine can pass commands and
workflow logic to the hypervisor to be executed.
Resource Replication Whenever a virtualized instance of an IT resource is
required, the resource replication mechanism may be initiated by the intelligent
automation engine to generate the instance.
Virtual Server The intelligent automation engine either runs a workflow that
sends commands directly to the virtual server for processing, or sends the
commands or workflows to be run by the hypervisor to manage or modify the
virtual server.

CENTRALIZED REMOTE ADMINISTRATION

How can diverse administrative tasks and controls be consolidated for central
remote access by cloud consumers?

Problem
Cloud platforms commonly provide cloud consumers with access to proprietary
administration front-ends and portals for individual IT resources, meaning cloud
providers essentially make out-of-the-box features externally available. Prebuilt administration user interfaces can be sufficient for simpler cloud platforms
and any cloud consumers that only require access to a modest number of IT
resources. However, these user interfaces become inadequate once a greater
number of IT resources need administering, especially by larger cloud consumer
organizations that employ a number of cloud resource administrators.
Inconsistencies in the presentation of administrative controls and features and
the processes they require can lead to human error and recurring inefficiencies
as cloud resource administrators are required to learn how to perform the same
tasks using different tools.

In the example illustrated in Figure 7.14, the cloud consumer wants to monitor
the usage of IT resources that are allocated to each branch of its organization.
The cloud consumer also requires the option of providing each branch manager
with control over the IT resources at its own branch. Security and administrative
risks would be introduced if branch managers were provided with the same level
of access as the cloud consumer that established the IT environment.

Figure 7.14 Cloud Consumer A leases an IaaS platform from a cloud provider (1) with the
intention of offering its own PaaS platform to other cloud consumers (thereby assuming the role

of a cloud provider). After the new PaaS platform is made available by Cloud Consumer A,
Cloud Consumers B and C lease instances of the platform (2). Cloud Consumer A (acting as a
cloud provider) needs a means of offering management features and usage tracking and reporting
of the various IT resources that are available via the PaaS platform, while ensuring that each
cloud consumer is granted an appropriate level of control.

Solution
A custom usage and administration portal can be created to support different
levels of security access, while consolidating the administrative functions of a
range of IT resources for consistent and standardized presentation (Figure 7.15).

Figure 7.15 Cloud Consumers B and C can access and manage their provisioned IT resources
using the usage and administration portal.

Application
The usage and administration portal generally provides two broad sets of
features: management controls and reporting. Management controls consolidate
similar IT resource management functions into standardized front-end controls
presented to the cloud resource administrator. Reporting features can also
consolidate usage data from multiple IT resources into summarized analysis
reports and realtime dashboard statistics. Single sign-on technology is
commonly used to enable cloud resource administrator credentials to propagate
across the authentication and authorization systems of all affected, underlying
IT resources.
Unless the cloud provider chooses to build the usage and administration portal
from scratch, the remote administration system mechanism is most commonly
used as the main component around which the portal's architecture is built. The
mechanism is then further integrated with various back-end management
systems and API-enabled IT resources.
This pattern is commonly combined with the Self-Provisioning (324) pattern to
further extend the feature-set of the centralized portal, as well as the Broad
Access (93) pattern to enable the portal to support access from multiple devices
and protocols.
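As a rough illustration of how such a portal standardizes heterogeneous back-end controls, the facade below maps a uniform operation name onto per-resource adapter calls. The class and adapter names are assumptions made for this sketch, not part of any remote administration system product.

```python
class AdminPortalFacade:
    """Consolidates per-resource administration APIs behind uniform controls."""

    def __init__(self):
        self._adapters = {}   # resource type -> {operation name: callable}

    def register(self, resource_type, operations):
        self._adapters[resource_type] = operations

    def invoke(self, resource_type, operation, **kwargs):
        ops = self._adapters.get(resource_type, {})
        if operation not in ops:
            raise ValueError(f"'{operation}' is not supported for {resource_type}")
        return ops[operation](**kwargs)

# Two back-ends with different native calls, exposed as one "restart" control:
portal = AdminPortalFacade()
portal.register("virtual_server", {"restart": lambda name: f"hypervisor reset: {name}"})
portal.register("cloud_service", {"restart": lambda name: f"service recycle: {name}"})
```

The cloud resource administrator sees one standardized control per task; the facade hides which back-end management system or API actually carries it out.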

Mechanisms
Audit Monitor The audit monitor is associated with this pattern in how it
monitors cloud consumer usage to log IT resource access, as well as
information about the cloud consumers themselves (such as their geographic
locations).
Billing Management System The billing management system produces and
generates the IT resource usage cost and chargeback information, which may be
streamed or published on the usage and administration portal for cloud
consumer viewing.
Cloud Usage Monitor The cloud usage monitor collects usage information
about cloud services and IT resources managed via the usage and administration
portal and may also monitor the usage of the administration portal itself.
Logical Network Perimeter This mechanism creates a logical isolation that
separates each cloud consumer's management and usage tools and reports, to
prevent viewing and access by other unauthorized cloud consumers.
Multi-Device Broker The application of the multi-device broker provides the
features and tools that allow cloud consumers to use different devices running
different operating systems to connect to the usage and administration portal.
Pay-Per-Use Monitor The pay-per-use monitor gathers IT resource usage
information to be used by the billing management system. This billing
information may be provided in a realtime report on the usage and
administration portal.
Remote Administration System This mechanism provides fundamental
technologies, APIs, and templates used for the creation and configuration of
usage and administration portals.
Resource Management System The resource management system provides
tools and management options necessary for cloud consumers to manage IT
resources and is generally integrated with and abstracted by the usage and
administration portal.
SLA Monitor The SLA monitor relates to this pattern by supplying the
runtime usage data relevant to SLA-based reports that may be published on the
usage and administration portal for cloud consumer viewing, as per the
Realtime Resource Availability (292) pattern.

RESOURCE MANAGEMENT

How can a cloud consumer safely manage an IT resource without impacting
neighboring IT resources?

Problem

When a cloud consumer carries out management tasks on an IT resource,
neighboring IT resources (belonging to the same or a different cloud consumer)
can be inadvertently impacted.
For example, the logical network perimeter established for one cloud consumer
may encompass IT resources that are shared by other cloud consumers. This
means the same physical server may be hosting virtual servers that belong in
different logical network perimeters. In Figure 7.16, all IT resources belong to
the same cloud consumer.

Figure 7.16 In this example, the cloud consumer makes a remote management change to a
physical server, which accidentally affects a virtual server hosting a database in another part of
the cloud environment. In this scenario, all IT resources belong to the same cloud consumer.

Solution
A set of tools and back-end controls are provided by the cloud provider to
specifically limit the access levels and management options of each cloud
consumer to the IT resources for which it is granted access.

Application

This pattern is applied via front-end portal controls and corresponding back-end
scripts and logic, and is therefore typically combined with the Centralized
Remote Administration (315) pattern. The controls established by this pattern
essentially confine each cloud consumer's access to within its designated logical
network perimeter and further enforce the levels of access the cloud consumer
has to IT resources within the perimeter.
The tools established by this pattern can further include a sandbox environment
that allows cloud consumers to safely test and execute management changes
before committing the changes to the production environment. The sandbox
environment limits the amount of access cloud consumers have to physical
resources, and also allows for the monitoring of commands and configuration
requests (Figure 7.17). It provides two key features:
1. An auditing system is put in place to audit commands and requests prior to
passing them to actual IT resources. This way, any conflicts or
misconfigurations can be detected and the cloud consumer notified before the
changes are applied to the production environment.
2. Log files are maintained to keep a record of all commands and requests made.
This can aid troubleshooting.
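A minimal sketch of the two sandbox features follows, assuming hypothetical validator callables and a production executor; the class and field names are illustrative, not a real resource management system API.

```python
import logging

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("sandbox")

class SandboxGateway:
    """Audits and logs management commands before they reach production."""

    def __init__(self, validators, executor):
        self.validators = validators   # checks run before execution (feature 1)
        self.executor = executor       # forwards approved commands to production
        self.command_log = []          # record of all requests (feature 2)

    def submit(self, command):
        self.command_log.append(command)
        problems = []
        for check in self.validators:
            message = check(command)   # returns None when the command is clean
            if message:
                problems.append(message)
        if problems:
            log.warning("rejected %s: %s", command, problems)
            return {"applied": False, "problems": problems}
        return {"applied": True, "result": self.executor(command)}
```

A validator might, for example, reject any command that targets an IT resource outside the submitting consumer's logical network perimeter.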

Figure 7.17 Cross-IT resource management tools and logic are used to check (and optionally
audit and log) commands before allowing them to be executed.

Mechanisms
Audit Monitor This mechanism is responsible for auditing resource
management activity for security and legal reasons.
Cloud Usage Monitor Cloud usage monitors may be used to track usage
information relevant to the system created by the application of this pattern.
Logical Network Perimeter This mechanism isolates the IT resource access
and management paths for cloud consumers, in order to provide a level of
isolation that prevents cloud consumers from accessing the IT resources of
others.
Remote Administration System To enable remote access to resource
management features, the remote administration system enables the creation of
custom portals and front-ends.

Resource Management System This mechanism provides cloud consumers
with the options, tools, and access permissions that they require to manage the
provisioned IT resources.

SELF-PROVISIONING

How can cloud consumers be empowered to have IT resources provisioned on-demand?

Problem
A cloud provider may require that a cloud consumer interact with sales staff to
have new IT resources provisioned or, subsequent to receiving the provisioning
request, an approval process may be required and cloud resource administrators
may further have to manually perform the provisioning. These types of
processes can unreasonably prolong the time it takes for a cloud consumer to
gain access to the required IT resources and can further demand extra effort and
communication from the cloud consumer organization.
A burdensome provisioning experience can make cloud consumers wary of
further transactions with the cloud provider and can inhibit the cloud consumer
organization's overall ability to be responsive in fulfilling its own business
automation requirements.

Solution
The cloud provider makes a self-service portal available that provides cloud
consumers with a live, up-to-date list of available cloud services and IT
resources that can be automatically provisioned after the cloud consumer
submits the request online.
Some cloud providers will still require a human-driven approval process that is
carried out upon receiving a provisioning request via a self-service portal.
However, this process is often expedited so that approved requests are fulfilled
within hours instead of days.

Application
The Self-Provisioning pattern can be applied together with the Centralized
Remote Administration (315) pattern to establish a sophisticated consumer-facing
front-end composed of a combination of the features of the usage and
administration portal and the self-service portal. The respective portals can still
be displayed independently but by standardizing both, they can be integrated as
part of the same overall Web application to ensure a consistent experience for
consumer-side cloud resource administrators.
The following steps are shown in Figure 7.18:
1. The cloud consumer connects to the self-service portal, established by the
Self-Provisioning pattern, via a multi-device broker that provides accessible
connectivity to this cloud consumer and others that may need to connect with
different devices.
2. The cloud consumer selects the desired cloud service from an inventory of
services listed and described in a service catalog published on the self-service
portal.
3. The selected cloud service is provisioned.
4. The provisioned cloud service is published to the usage and administration
portal, established by the Centralized Remote Administration (315) pattern,
making it available for management by the cloud consumer.
5. The cloud consumer can use tools published on the usage and administration
portal to manage the cloud service implementation.
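The five steps can be condensed into a short sketch; the catalog, provisioner, and portal structures below are hypothetical simplifications of the real systems involved, not an actual self-service portal API.

```python
def self_provision(catalog, request, provisioner, usage_portal):
    """Condenses steps 2-5 of Figure 7.18 for a single provisioning request."""
    service = catalog.get(request["service_id"])     # step 2: select from catalog
    if service is None:
        raise KeyError(f"unknown service: {request['service_id']}")
    instance = provisioner(service, request["consumer"])   # step 3: provision
    # step 4: publish the new instance to the usage and administration portal
    usage_portal.setdefault(request["consumer"], []).append(instance)
    return instance    # step 5: the consumer can now manage the instance
```

In a real deployment, the provisioner call would be asynchronous and might pause for the cloud provider's approval process before the instance is published.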

Figure 7.18 A simple cloud architecture in which both the self-service portal and usage and
administration portal play roles in relation to how cloud services are provisioned online.

The self-service portal needs to be integrated with whatever separate approval
process a cloud provider requires, along with the security system used to grant
different levels of access and control. Cloud consumers are typically organized
into access groups and granted service provisioning permissions based on the
outcome of the approval process or prior profile information. Users who then
log into the self-service portal on behalf of a cloud consumer organization will
only be able to view and request from a list of IT resources that corresponds to
their permission level.
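Filtering the visible catalog by permission level can be as simple as the following sketch; the access-group fields are assumptions made for illustration.

```python
def visible_catalog(catalog, user):
    """Returns only the IT resources the user's access group may request."""
    return [service for service in catalog
            if user["access_group"] in service["allowed_groups"]]

# Example catalog with per-service access groups (illustrative values):
catalog = [
    {"name": "virtual server", "allowed_groups": {"basic", "admin"}},
    {"name": "bare-metal server", "allowed_groups": {"admin"}},
]
```

A user in the "basic" group would see only the virtual server entry, while an "admin" user would see both.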
Figures 7.19 and 7.20 illustrate the common steps that are required in order to
navigate the permission approval process of a self-service portal.

Figure 7.19 Common steps required to navigate the permission approval process of a self-service
portal (Part I).

Figure 7.20 Common steps required to navigate the permission approval process of a self-service
portal (Part II).

Mechanisms
Audit Monitor Auditing of self-service portal usage is required when
information about the cloud consumers and their geographical locations or
access points needs to be collected.
Cloud Usage Monitor Specialized cloud usage monitors may be employed to
collect data of how self-service portal features are used.
Logical Network Perimeter The logical network perimeter isolates the
options made available via a given instance of a self-service portal as they are
offered to each cloud consumer.
Multi-Device Broker This mechanism is primarily utilized to broaden the
access to the self-service portal via different types of cloud service consumer
devices.
Remote Administration System The self-service portal that results from the
application of this pattern may rely heavily on the tools and back-end interfaces
provided by this mechanism.

POWER CONSUMPTION REDUCTION

How can a hypervisor's resources be used efficiently to minimize data center
power and cooling costs?

Problem
Figure 7.21 shows a cloud environment with four hypervisors participating in a
hypervisor cluster. For simplicity, all of the virtual servers have the same virtual
memory and virtual CPU configuration specifications. Each hypervisor can run
eight virtual servers using 70% of its capacity, while the remaining 30%
capacity is kept available for administration tasks like backups. Only two
hypervisors are required to fully run the environment. The remaining two
hypervisors are powered on unnecessarily, consuming extra power, creating
excess heat, and requiring UPS and cooling. A solution that can place the
unnecessary hypervisors into a shutdown or standby mode and bring them back
into operation on demand is required.
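The consolidation arithmetic is straightforward. Assuming the cluster in Figure 7.21 is running 16 virtual servers in total (an inference from the scenario, since only two of the four hosts are needed at eight servers each), the minimum host count can be computed as:

```python
import math

def hosts_required(total_vms, vms_per_host_at_ceiling):
    """Minimum number of hosts needed when each host may run a fixed number
    of virtual servers at its utilization ceiling (70% in this scenario)."""
    return math.ceil(total_vms / vms_per_host_at_ceiling)

# hosts_required(16, 8) evaluates to 2, so two of the four clustered
# hypervisors can be placed into standby mode.
```

The remaining hosts then stay in standby until rising demand requires them again.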

Figure 7.21 The hypervisors in this hypervisor cluster can each host eight virtual servers.

Solution
The capacity of the hypervisors is first evaluated before selecting the
hypervisors that are to remain powered on to host the virtual servers. The virtual
servers are moved to the selected hosts and the remaining hypervisors enter into
standby mode. When the capacity of the operational hosts is close to being exceeded,
more hosts are called to leave standby mode and become operational as well.
Applying this solution can bring substantial cost savings, depending on the size
and design of the cloud environment.

Application
The maximum utilization of each host is 70%, and the two selected hosts are capable of
running all of the virtual servers without reaching their capacity limit of 70%. In
some situations, an extra host may be kept powered on to meet the level of
availability that is required by cloud consumers or applications. Bringing the
host back into operation from standby mode may take several minutes, which
certain SLAs may not be able to accommodate. Figure 7.21 illustrates the
scenario prior to applying the pattern, while Figures 7.22 to 7.24 illustrate the
steps involved in the application of this pattern.

Figure 7.22 The Power Consumption Reduction pattern is applied (Part I).

Figure 7.23 The Power Consumption Reduction pattern is applied (Part II).

Figure 7.24 The Power Consumption Reduction pattern is applied (Part III).

If the Load Balanced Virtual Server Instances (51) pattern has been applied, the
workload will be balanced between the hosts that are currently operational. An
outage may result if there is no hot standby host to remain powered on,
especially if demand is increasing or a host has abruptly failed. Any given
virtual server may become shut down or remain powered off after a hypervisor
failure, before the standby host becomes operational.
The following steps are shown in Figures 7.22 to 7.24:
1. A specialized capacity monitoring service agent monitors the capacity and
workload of the hosts.
2. The monitoring results are sent to the VIM server, or sent to a capacity
advisor application that forwards the results to the VIM server.
3. The VIM server initiates the workload movement via the application of the
Non-Disruptive Service Relocation pattern.
4. The virtual servers are moved to the hosts that have been selected to stay
operational.
5. The other hosts go into standby mode upon being signaled by the VIM server.
6. The capacity monitoring agent continues to monitor the workload on the
hosts.
7. The capacity monitor signals the VIM server whenever the hosts' utilization
nears 70%.
8. The VIM server signals one of the hosts to come out of standby mode via
wake-on-LAN (WOL).
9. After the host becomes operational, the system established by the application
of the Non-Disruptive Service Relocation pattern moves some of the virtual
servers to the host that has been powered back on.
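Steps 6 through 8 amount to a threshold check that the capacity monitoring agent and VIM server repeat continuously. A minimal sketch follows; the host records and field names are assumptions for illustration, not a real VIM interface.

```python
WAKE_THRESHOLD = 0.70   # utilization at which a standby host is brought online

def evaluate_capacity(hosts):
    """One pass of the monitoring loop: wake a standby host when the active
    hosts' average utilization nears the 70% ceiling (steps 6-8)."""
    active = [h for h in hosts if h["state"] == "active"]
    standby = [h for h in hosts if h["state"] == "standby"]
    average = sum(h["utilization"] for h in active) / len(active)
    if average >= WAKE_THRESHOLD and standby:
        woken = standby[0]
        woken["state"] = "active"     # step 8: the VIM issues wake-on-LAN
        woken["utilization"] = 0.0
        return woken                  # step 9 then migrates servers onto it
    return None
```

A production implementation would also handle the reverse transition, moving lightly loaded hosts back into standby, and would hold back a hot spare where SLAs cannot tolerate the minutes a host needs to leave standby mode.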

Mechanisms
Hypervisor This mechanism is used to host virtual servers.
Live VM Migration If virtual servers need to be evacuated from their host
hypervisor so the host can be placed into standby mode, this mechanism is used
to migrate the virtual servers to a different host.
Virtual Infrastructure Manager (VIM) This mechanism is used to configure
power consumption policies and thresholds, identify which hosts can be placed
into standby mode, and bring hosts out of standby mode when required.
Virtualization Monitor This mechanism actively monitors resource and
power utilization, and can be used to notify system administrators if certain
resource or power utilization thresholds are met.
