
Virtualization

What is it?

Virtualization is the pooling and abstraction of resources and services in a way
that masks the physical nature and boundaries of those resources and services
from their users.
http://www.gartner.com/DisplayDocument?id=399577
Virtualization is … well, not exactly new
 Nothing new! The concept was already in use on mainframes back in the ’70s
 Mainframes of the ’70s were underutilized and over-engineered

http://www-07.ibm.com/systems/my/z/about/timeline/1970/
Mainframe Virtualization:

Concept: split the computer into multiple virtual machines so different “tasks”
can be run separately and independently on the same mainframe.
If one virtual machine or “task” fails, other virtual machines are unaffected.

[Diagram: seven logical VMs (VM #1–VM #7) on one mainframe, each running its own task (Task A–Task G)]


Fast Forward to the 1990s
 Computers in the 1990s
Intel/AMD servers now very popular (known as “x86” servers)
Each server runs one Operating System, such as Windows, Linux, etc.
Typical: one OS and one application per server
Server sprawl inevitable
Power, cooling, rackspace become problematic

[Diagram: a row of x86 servers – File Server, Web Server, Domain Server, DNS Server, App Server – each server running one application]
IT Challenges
 Server Sprawl
Power, space and cooling: one of the largest IT budget line items
One-application-per-server: high costs (equipment and
administration)
 Low Server and Infrastructure Utilization Rates
Result in excessive acquisition and maintenance costs
 High business continuity costs
HA & DR solutions built around hardware are very expensive
 Ability to respond to business needs is hampered
Provisioning new applications often a tedious process
 Securing environments
Security often accomplished through physical isolation: costly
Virtualization is the Key
Apply Mainframe Virtualization Concepts to x86 Servers:
 Use virtualization software to partition an Intel / AMD server to work with
several operating system and application “instances”
 Deploy several “virtual machines” on one server using virtualization software

[Diagram: one x86 server hosting virtual machines for Database, Web, Application Servers, Email, File, Print, DNS and LDAP workloads]
Four Drivers Behind Virtualization

 Hardware Resources Underutilized
• CPU utilization ~ 10% – 25%
• One server – one application
• Multi-core even more under-utilized
 Data Centers are Running Out of Space
• Last 10+ years of major server sprawl
• Exponential data growth
• Server consolidation projects just a start
 Rising Energy Costs
• As much as 50% of the IT budget
• In the realm of the CFO and Facilities Mgr. now!
 Administration Costs are Increasing
• Number of operators going up
• Number of Management Applications going up

Operational Flexibility
Other Significant Virtualization Benefits
 Some key benefits:
Ability to quickly spawn test and development environments
Provides failover capabilities to applications that can’t do it natively
Maximizes utilization of resources (compute & I/O capacity)
Server portability (migrate a server from one host to the other)

 Virtualization is not limited to servers and OS
Network virtualization
Storage virtualization
Application virtualization
Desktop virtualization
Compute Resources Virtualization
Virtual Machines

So what exactly is a virtual machine?


 A virtual machine is defined as a representation of a physical machine by
software that has its own set of virtual hardware upon which an operating
system and applications can be loaded. With virtualization each virtual
machine is provided with consistent virtual hardware regardless of the
underlying physical hardware that the host server is running. When you
create a VM a default set of virtual hardware is given to it. You can further
customize a VM by adding or removing additional virtual hardware as
needed by editing its configuration.
Virtual Machines can provide you …

 Hardware independence – VM sees the same hardware instantiation regardless of
the host hardware underneath

 Isolation – VM’s operating system is isolated and independent from the host
operating system and also from the adjacent Virtual Machines.
Hypervisor

What is a hypervisor?
 A hypervisor, also called a virtual machine monitor (VMM), is a
program that allows multiple operating systems to share a single
hardware host. Each operating system appears to have the host's
processor, memory, and other resources all to itself. However, the
hypervisor is actually controlling the host processor and resources,
allocating what is needed to each operating system in turn and
making sure that the guest operating systems (called virtual
machines) cannot disrupt each other.
It's all about Rings
 x86 CPUs provide a range of protection levels also known as rings in
which code can execute. Ring 0 has the highest level privilege and is
where the operating system kernel normally runs. Code executing in
Ring 0 is said to be running in system space, kernel mode or supervisor
mode. All other code, such as applications running on the operating
system, operates in less privileged rings, typically Ring 3.
Rings in virtualization
Traditional systems
Operating system runs in privileged mode in Ring 0 and
owns the hardware
Applications run in Ring 3 with less privileges

Virtualized systems
VMM runs in privileged mode in Ring 0
Guest OS inside VMs are fooled into thinking they are
running in Ring 0, privileged instructions are trapped and
emulated by the VMM
Newer CPUs (AMD-V/Intel-VT) add a new privilege level, often described as
Ring -1, for the VMM to reside in, allowing for better performance because
the VMM no longer needs to fool the Guest OS into thinking it is running in
Ring 0.
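To see the Ring 3 restriction in practice, here is a minimal C sketch (an illustrative example added here, not part of the original slides): it tries to read the privileged CR0 control register from user space, and on Linux the resulting general-protection fault is delivered to the process as SIGSEGV.

/* ring3_demo.c - attempt a privileged (Ring 0) instruction from Ring 3.
 * Build: gcc -o ring3_demo ring3_demo.c
 * Expected result when run as an ordinary process: the CPU raises a
 * general-protection fault (#GP) and the kernel kills the process with
 * SIGSEGV, because "mov %cr0, %reg" may only execute at CPL 0.
 */
#include <stdio.h>

int main(void)
{
    unsigned long cr0;

    printf("About to execute a Ring 0 instruction from Ring 3...\n");

    /* Privileged instruction: only the kernel (or a VMM) may read CR0. */
    __asm__ volatile("mov %%cr0, %0" : "=r"(cr0));

    /* Never reached when running as an ordinary user-space process. */
    printf("CR0 = %#lx\n", cr0);
    return 0;
}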
Typical Virtualization Architectures
Hardware Partitioning
 Server is subdivided into fractions, each of which can run an OS
 Physical partitioning: IBM S/370, Sun Domains, HP nPartitions
 Logical partitioning: System p LPAR, HP vPartitions, Sun Logical Domains, IBM System z LPAR

Dedicated Hypervisor
 Hypervisor provides fine-grained timesharing of all resources
 Hypervisor software/firmware runs directly on the server
 Examples: VMware ESX Server, Xen Hypervisor, KVM, Microsoft Hyper-V, Oracle VM

Hosted Hypervisor
 Hypervisor uses OS services to do timesharing of all resources
 Hypervisor software runs on a host operating system
 Examples: VMware Server, Microsoft Virtual Server, HP Integrity VM, QEMU
Server virtualization architecture
examples
VMware ESX Architecture
 CPU is controlled by the scheduler and virtualized by the monitor
 The monitor supports BT (Binary Translation), HW (Hardware Assist) and PV (Paravirtualization)
 Memory is allocated by the VMkernel and virtualized by the monitor
 Network and I/O devices are emulated and proxied through native device drivers

[Diagram: guest VMs (with their own TCP/IP and file system stacks) run on per-VM monitors that expose a virtual NIC and virtual SCSI; the VMkernel provides the scheduler, memory allocator, virtual switch, file system, NIC drivers and I/O drivers on top of the physical hardware]
http://www.vmware.com/products/vsphere/
Xen 3.0 Architecture

http://www.citrix.com/English/ps2/products/feature.asp?contentID=1686939
Evolution of Virtualization

Going from Here…
[Diagram: four separate x86 servers – Windows XP, Windows 2003, SUSE and Red Hat – each running its own applications at only 10%–18% hardware utilization]

… to There
[Diagram: the same workloads (App. A–D on Windows XP, Windows 2003, SUSE Linux and Red Hat Linux) run as virtual machines on one multi-core, multi-processor x86 server at ~70% hardware utilization, each guest OS running on top of the virtual machine monitor on the host]
Virtualization Requirements

 From Popek and Goldberg’s 1974 paper: “For any computer a virtual machine
monitor may be constructed if the set of sensitive instructions for that
computer is a subset of the set of privileged instructions”
 This is a complicated way of saying that the virtual machine monitor needs a
way of determining when a guest executes sensitive instructions.

http://portal.acm.org/citation.cfm?doid=361011.361073
x86 Virtualization Challenges

 The IA-32 (x86) instruction set contains 17 sensitive, unprivileged
instructions that do not trap
 Sensitive register instructions read or change sensitive registers and/or
memory locations such as a clock register or interrupt registers:
SGDT, SIDT, SLDT, SMSW, PUSHF, POPF, etc.
 The x86 fails the Popek-Goldberg test!
 Keep in mind that x86 OSes are designed to have full control over the
entire system
 However, massive economic interest in making it work
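As an illustrative sketch (not from the original slides), the following user-space C program executes SMSW, one of the 17 sensitive-but-unprivileged instructions: it reads the low bits of CR0 from Ring 3 without trapping, so a classic trap-based VMM never gets a chance to intervene. (On recent CPUs with the UMIP feature enabled, SMSW does fault in user mode; assume an older machine or UMIP off.)

/* smsw_demo.c - a sensitive x86 instruction that does NOT trap in Ring 3.
 * Build: gcc -o smsw_demo smsw_demo.c
 *
 * SMSW (Store Machine Status Word) copies the low bits of CR0 into a
 * register. Unlike "mov %cr0, %reg" it is allowed at CPL 3 (unless
 * CR4.UMIP is enabled), so it executes silently instead of trapping to
 * the kernel or to a VMM - exactly the behaviour that breaks the
 * Popek-Goldberg requirement on x86.
 */
#include <stdio.h>

int main(void)
{
    unsigned long msw = 0;

    /* Sensitive but unprivileged: no #GP, no trap, no VMM involvement. */
    __asm__ volatile("smsw %0" : "=r"(msw));

    printf("Machine status word (low bits of CR0): %#lx\n", msw);
    printf("PE (protected mode enable) bit = %lu\n", msw & 1UL);
    return 0;
}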
Virtualizing the x86 Processor: possible!

Recipe for x86 virtualization:


 Non-sensitive, non-protected instructions
can be executed directly
 Sensitive privileged instructions must trap
 Sensitive non-privileged instructions
must be detected
Several Ways to Virtualize an OS
 Container-based
OpenVZ, Linux VServer

 Host-based (Type-2 Hypervisors)
Microsoft Virtual Server, VMware Server and Workstation

 Paravirtualization
Xen, [Microsoft Hyper-V], some VMware
ESX device drivers

 Full virtualization (Type-1 Hypervisors)
VMware ESX, Linux KVM, Microsoft Hyper-V, Xen 3.0
Container-Based Virtualization
 Virtual Machine Monitor (VMM) inside a patched Host OS (kernel)
 VMs fully isolated
 Host OS modified to isolate different VMs
Example: kernel data structures changed to add a context ID that
differentiates between identical uids in different VMs
Thus VMs are isolated from each other in the kernel
 No full guest OS inside a container
 Fault isolation not possible (OS crash)
 Applications/users see the container VM as a virtual host/server
 VMs can be booted/shut down like a regular OS
 Systems
Linux VServer, OpenVZ

[Diagram: a privileged host VM (VM admin) and container VMs (VM 1 … VM n) share a single host OS image running on the VMM and the hardware (CPU, memory, IO, disk)]
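OpenVZ and Linux VServer rely on their own kernel patches, but the same container idea can be sketched with the namespace primitives that later entered the mainline Linux kernel. The following illustrative C program (mainline namespaces are used here as a stand-in, not OpenVZ code; the "vm1" name is invented) gives a child process its own hostname and PID namespace, so it behaves like a lightweight virtual host sharing the host kernel.

/* container_sketch.c - kernel-level isolation in the spirit of container
 * virtualization, using mainline Linux namespaces (NOT the OpenVZ/VServer
 * patches themselves). Build: gcc -o container_sketch container_sketch.c
 * Run as root: ./container_sketch
 */
#define _GNU_SOURCE
#include <sched.h>
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <sys/wait.h>
#include <unistd.h>

static char child_stack[1024 * 1024];   /* stack for the "container" init */

static int container_main(void *arg)
{
    (void)arg;
    /* Inside the new UTS namespace: changing the hostname does not
     * affect the host or any other container. */
    sethostname("vm1", 3);

    /* Inside the new PID namespace this process is PID 1, just as a
     * container's init appears to own the whole "machine". */
    printf("container: hostname=vm1, my pid=%d\n", (int)getpid());

    execl("/bin/sh", "sh", (char *)NULL);   /* drop into a shell */
    perror("execl");
    return 1;
}

int main(void)
{
    /* New UTS (hostname) and PID namespaces for the child - the shared
     * host kernel keeps the two environments apart, much as the slide's
     * "context ID" example describes for uids. */
    pid_t pid = clone(container_main, child_stack + sizeof(child_stack),
                      CLONE_NEWUTS | CLONE_NEWPID | SIGCHLD, NULL);
    if (pid < 0) {
        perror("clone (are you root?)");
        return 1;
    }
    waitpid(pid, NULL, 0);
    return 0;
}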
Host-Based Virtualization (Type-2)

 Host OS (Windows, Linux)
 VMM inside the Host OS
Kernel-mode driver
 Multiple Guest OS support
 VMM emulates hardware for guest OSs
 Systems
Microsoft Virtual Server, VMware Workstation & Server
Host OS: XP, 2003, Linux
Guest OS: NT, 2000, 2003, Linux

[Diagram: applications run on Guest OS 1 and OS 2, which run on the VMM inside the Host OS, on top of the hardware]
Para-Virtualization

 VMM runs on “bare metal”
 Guest OS modified to make calls (“hypercalls”) to or receive events from the VMM
 Support of arbitrary guest OS not possible
Either modified OS or modified device drivers
 Few thousand lines of code change
 Open-source OS modification easy
 Systems
Xen
Guest OS: XenoLinux, NetBSD, FreeBSD, Solaris 10, Windows (in progress)

[Diagram: applications run inside VM 1 … VM n on modified guest OSes, which run directly on the VMM on top of the hardware (CPU, memory, IO, disk)]
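To make the hypercall idea concrete, here is a purely illustrative C sketch of the VMM side of a paravirtual interface. Everything in it (the hypercall numbers, vcpu_t, hc names) is invented for this example and does not reflect Xen's real ABI: the modified guest traps into the VMM with a call number and arguments, and the VMM services the request directly instead of emulating hardware.

/* hypercall_dispatch.c - illustrative-only sketch of a VMM's hypercall
 * dispatcher. All names (HC_CONSOLE_WRITE, vcpu_t, ...) are hypothetical;
 * Xen's real hypercall ABI is different.
 */
#include <stdint.h>
#include <stdio.h>

typedef struct {
    int      id;          /* which VM issued the hypercall            */
    uint64_t regs[4];     /* argument registers captured at the trap  */
} vcpu_t;

enum { HC_CONSOLE_WRITE = 1, HC_SET_TIMER = 2, HC_YIELD = 3 };

/* Called by the VMM's trap handler when a modified guest kernel issues a
 * hypercall instead of touching (non-existent) physical hardware. */
static int64_t hypercall_dispatch(vcpu_t *vcpu, uint32_t nr)
{
    switch (nr) {
    case HC_CONSOLE_WRITE:
        /* regs[0] = guest buffer address, regs[1] = length (hypothetical).
         * A real VMM would validate and copy from guest memory here. */
        printf("[vm%d] console write, %llu bytes\n",
               vcpu->id, (unsigned long long)vcpu->regs[1]);
        return (int64_t)vcpu->regs[1];

    case HC_SET_TIMER:
        printf("[vm%d] one-shot timer at %llu ns\n",
               vcpu->id, (unsigned long long)vcpu->regs[0]);
        return 0;

    case HC_YIELD:
        printf("[vm%d] yields the CPU to the scheduler\n", vcpu->id);
        return 0;

    default:
        return -1;        /* unknown hypercall: fail it, don't crash */
    }
}

int main(void)
{
    /* Simulate a guest that asks to print 42 bytes, then yields. */
    vcpu_t vcpu = { .id = 1, .regs = { 0x1000, 42, 0, 0 } };
    hypercall_dispatch(&vcpu, HC_CONSOLE_WRITE);
    hypercall_dispatch(&vcpu, HC_YIELD);
    return 0;
}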
Native/Full Virtualization (Type-1)

 VMM runs on ‘bare metal’


 VMM virtualizes (emulates) hardware
Virtualizes x86 ISA, for example

 Guest OS unmodified
 VMs: Guest OS+Applications run
under the control of VMM
 Examples
VMware ESX, Microsoft Hyper-V
IBM z/VM
Linux KVM (Kernel VM)
A Closer Look at VMware’s ESX™

 Full virtualization
Runs on bare metal
Referred to as ‘Type-1 Hypervisor’

 ESX is the OS (and of course the VMM)


 ESX handles privileged executions from Guest kernels
Emulates hardware when appropriate

 Uses ‘Trap and Emulate’ and ‘Binary Translation’


 Guest OSes run as if it were business as usual
Except they really run in user mode (including their kernels)
ESX Architecture

[Diagram: each VM runs on its own VMM (virtual hardware); the VMMs run on the ESX kernel]

© http://www.vmware.com
Privileged Instruction Execution

 ESX employs trap-and-emulate to execute privileged instructions on behalf of the Guest OS
 Keep shadow copies of the Guest OS’s data structures (states)
 Guest OS traps to the VMM
 VMM emulates the instruction
 VMM updates or copies the required states of the Guest OS
 Emulation works like an exception handler

[Diagram: applications run in Ring 3 and guest OSes in Ring 1 or 3; when Guest OS 1 executes LGDT 0x00007002, the instruction traps to the VMM in Ring 0, which updates its shadow GDT for that guest and emulates the instruction on the hardware (CPU, memory, IO, disk)]
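A minimal C sketch of the shadow-structure idea (the types, the handler and the sample descriptor are invented for illustration, not VMware's actual code): when the guest's LGDT traps, the VMM copies the guest's descriptor table into a shadow table, demotes descriptors that claim Ring 0 so the guest kernel really runs at a lower privilege level, and would then load the shadow table into the real hardware.

/* shadow_gdt.c - illustrative trap-and-emulate handling of the guest's
 * LGDT instruction via a shadow GDT. Build: gcc -o shadow_gdt shadow_gdt.c
 */
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define GDT_MAX_ENTRIES 256

typedef struct {
    uint64_t entries[GDT_MAX_ENTRIES];  /* raw 8-byte segment descriptors    */
    uint16_t limit;                     /* what the guest believes is loaded */
} shadow_gdt_t;

/* DPL lives in bits 45-46 of a raw descriptor. Descriptors that claim
 * Ring 0 are demoted to Ring 1, so the guest kernel never really runs
 * with full privilege even though it believes it does. */
static uint64_t demote_dpl(uint64_t desc)
{
    if (((desc >> 45) & 0x3) == 0)
        desc |= (uint64_t)1 << 45;      /* DPL 0 -> DPL 1 */
    return desc;
}

/* Called after the guest's LGDT has trapped and the VMM has copied the
 * guest's descriptor table out of guest memory (the copy step is omitted). */
static void lgdt_trap_handler(shadow_gdt_t *shadow,
                              const uint64_t *guest_table, uint16_t limit)
{
    uint32_t nbytes = (uint32_t)limit + 1;          /* GDT limit is inclusive */
    if (nbytes > sizeof(shadow->entries))
        nbytes = sizeof(shadow->entries);

    shadow->limit = limit;                          /* remember guest's view  */
    memcpy(shadow->entries, guest_table, nbytes);   /* shadow the structure   */

    for (uint32_t i = 0; i < nbytes / 8; i++)       /* sanitize privileges    */
        shadow->entries[i] = demote_dpl(shadow->entries[i]);

    /* A real VMM would now load the *shadow* table into the hardware GDTR
     * and keep it in sync if the guest later edits its own table. */
}

int main(void)
{
    /* A flat 4 GiB Ring 0 code segment, as a guest kernel might define. */
    uint64_t guest_gdt[2] = { 0, 0x00CF9A000000FFFFULL };
    shadow_gdt_t shadow;

    lgdt_trap_handler(&shadow, guest_gdt, sizeof(guest_gdt) - 1);
    printf("guest descriptor:  %016llx\n", (unsigned long long)guest_gdt[1]);
    printf("shadow descriptor: %016llx (DPL raised to 1)\n",
           (unsigned long long)shadow.entries[1]);
    return 0;
}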
Sensitive Instruction Execution
 Sensitive instructions (SI) do not trap
 ESX intercepts the execution of SI
Binary Translation (BT): rewrite Guest OS instructions
The binary code (hex stream of x86 instructions) of the guest OS is rewritten to insert proper code
No modification to the Guest OS – ESX does it on the fly!
E.g.: rewrite POPF (it modifies the interrupt flag) so it traps

[Diagram: applications run in Ring 3 and guest OSes in Ring 1 or 3; a POPF in Guest OS 1 is rewritten by the VMM so that the write to IF traps to Ring 0, where the VMM writes the virtual IF on the guest’s behalf]
popf  int $99


; invoke syscall handler
void popf_handler(int vm_num, regs_t *regs)
{
regs->eflags = regs->esp;
regs->esp++;
}
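To illustrate the rewriting step itself, here is a toy C sketch (illustrative only, not VMware's translator): it copies a guest code fragment into a translation cache and, whenever it meets the one-byte POPF opcode (0x9D), emits `int $99` (0xCD 0x63) in its place so that execution traps into a handler like the one above.

/* bt_sketch.c - toy binary-translation pass: copy guest code into a
 * translation cache, replacing POPF (0x9D) with "int $99" (0xCD 0x63).
 * Illustrative only: a real translator decodes full variable-length
 * instructions, handles branches, and chains translated blocks.
 * Build: gcc -o bt_sketch bt_sketch.c
 */
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#define OP_POPF   0x9D
#define OP_INT    0xCD
#define TRAP_VEC  99      /* vector whose handler emulates POPF */

/* Translate 'len' bytes of guest code into 'cache' (size 'cache_len').
 * Returns the number of bytes emitted. Assumes, for simplicity, that
 * every input byte is a one-byte instruction. */
static size_t translate(const uint8_t *guest, size_t len,
                        uint8_t *cache, size_t cache_len)
{
    size_t out = 0;
    for (size_t i = 0; i < len; i++) {
        if (guest[i] == OP_POPF) {
            if (out + 2 > cache_len) break;
            cache[out++] = OP_INT;       /* int imm8 ...            */
            cache[out++] = TRAP_VEC;     /* ... with vector 99      */
        } else {
            if (out + 1 > cache_len) break;
            cache[out++] = guest[i];     /* safe instruction: copy as-is */
        }
    }
    return out;
}

int main(void)
{
    /* 0x90 = NOP, 0x9D = POPF */
    const uint8_t guest_code[] = { 0x90, 0x9D, 0x90 };
    uint8_t cache[16];

    size_t n = translate(guest_code, sizeof(guest_code), cache, sizeof(cache));
    for (size_t i = 0; i < n; i++)
        printf("%02X ", cache[i]);
    printf("\n");   /* prints: 90 CD 63 90 */
    return 0;
}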
Today: Virtualization-Friendly x86!

 Recent processors include virtualization extensions to circumvent the
original x86 virtualization unfriendliness
Intel’s VT technology (VT-x, VT-i, VT-d)
AMD-V or Pacifica

 Extensions give the VMM many conditions under which the actions attempted
by a Guest VM get trapped
 Note that, due to performance issues (traps are CPU-expensive), some Type-1
hypervisors actually do not leverage all VT extensions and prefer another
software mechanism called Binary Translation
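For contrast, Linux KVM (listed earlier as a Type-1 example) is built entirely on these hardware extensions. The following condensed C sketch, based on the publicly documented /dev/kvm ioctl interface, creates a VM with one vCPU, loads a few bytes of guest code, and lets the hardware run it until the guest executes HLT; error handling is trimmed for brevity, so treat it as a sketch rather than production code.

/* kvm_mini.c - minimal hardware-assisted VM using the Linux KVM API
 * (requires Intel VT-x or AMD-V and /dev/kvm). Error handling trimmed.
 * Build: gcc -o kvm_mini kvm_mini.c
 */
#include <fcntl.h>
#include <linux/kvm.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/mman.h>

int main(void)
{
    /* Guest code (16-bit real mode): put 42 in AX, then HLT. */
    const uint8_t code[] = {
        0xB8, 0x2A, 0x00,   /* mov ax, 42 */
        0xF4,               /* hlt        */
    };

    int kvm  = open("/dev/kvm", O_RDWR);
    int vmfd = ioctl(kvm, KVM_CREATE_VM, 0);

    /* One page of guest "physical" memory, mapped at guest address 0x1000. */
    void *mem = mmap(NULL, 0x1000, PROT_READ | PROT_WRITE,
                     MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    memcpy(mem, code, sizeof(code));
    struct kvm_userspace_memory_region region = {
        .slot = 0, .guest_phys_addr = 0x1000,
        .memory_size = 0x1000, .userspace_addr = (uint64_t)(uintptr_t)mem,
    };
    ioctl(vmfd, KVM_SET_USER_MEMORY_REGION, &region);

    int vcpufd = ioctl(vmfd, KVM_CREATE_VCPU, 0);
    int mmap_size = ioctl(kvm, KVM_GET_VCPU_MMAP_SIZE, NULL);
    struct kvm_run *run = mmap(NULL, mmap_size, PROT_READ | PROT_WRITE,
                               MAP_SHARED, vcpufd, 0);

    /* Point CS:IP at the guest code and clear the other registers. */
    struct kvm_sregs sregs;
    ioctl(vcpufd, KVM_GET_SREGS, &sregs);
    sregs.cs.base = 0; sregs.cs.selector = 0;
    ioctl(vcpufd, KVM_SET_SREGS, &sregs);

    struct kvm_regs regs;
    memset(&regs, 0, sizeof(regs));
    regs.rip = 0x1000;      /* start of the guest code      */
    regs.rflags = 0x2;      /* reserved flag bit must be set */
    ioctl(vcpufd, KVM_SET_REGS, &regs);

    /* The CPU runs the guest directly; we only regain control on a
     * "VM exit", e.g. when the guest executes HLT. */
    for (;;) {
        ioctl(vcpufd, KVM_RUN, NULL);
        if (run->exit_reason == KVM_EXIT_HLT) {
            ioctl(vcpufd, KVM_GET_REGS, &regs);
            printf("guest halted, ax = %llu\n",
                   (unsigned long long)(regs.rax & 0xFFFF));
            break;
        }
    }
    return 0;
}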
A quick word on VT extensions usage

 ESX does not leverage all VT extensions by default


 VMware spent years fine tuning binary translation
 ESX requires Intel-VT to support 64-bit guests: in 64-bit mode Intel removed
some of the memory-protection logic (segment limit checks) that binary
translation relied on, so ESX needs some Intel-VT features to achieve the
same result for 64-bit guests.
 vSphere can leverage VT extensions on a per VM basis

http://communities.vmware.com/docs/DOC-9150
http://www.vmware.com/files/pdf/vsphere_performance_wp.pdf
What About Networking?
 Users naturally expect VMs to have access to network
 VMs don’t directly control networking hardware
x86 hw designed to be handled by only one device driver!

 When a VM communicates with the outside world, it:
… passes the packet to its local device driver …
… which in turn hands it to the virtual I/O stack …
… which in turn passes it to the physical NIC

 ESX gives VMs several device driver options:
Strict emulation of Intel’s e1000
Strict emulation of AMD’s PCnet 32 Lance
VMware vmxnet: paravirtualized!

 VMs have MAC addresses that appear on the wire
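As a hedged illustration of the "virtual I/O stack" step (not how ESX itself is implemented), hosted hypervisors on Linux commonly back a VM's virtual NIC with a TAP device. The sketch below creates one (the interface name "vmnic0" and the sample frame are invented for the example); every Ethernet frame the guest "transmits" is written to it so the host's bridging/switching layer can put it on the wire.

/* tap_backend.c - how a hosted VMM on Linux might back a guest's virtual
 * NIC with a TAP device (illustrative; ESX uses its own vmkernel vSwitch).
 * Build: gcc -o tap_backend tap_backend.c   (run as root)
 */
#include <fcntl.h>
#include <linux/if.h>
#include <linux/if_tun.h>
#include <stdio.h>
#include <string.h>
#include <sys/ioctl.h>
#include <unistd.h>

/* Create a TAP interface named 'name'; returns an fd the VMM can use to
 * exchange raw Ethernet frames with the host network stack. */
static int tap_open(const char *name)
{
    int fd = open("/dev/net/tun", O_RDWR);
    if (fd < 0) { perror("open /dev/net/tun"); return -1; }

    struct ifreq ifr;
    memset(&ifr, 0, sizeof(ifr));
    ifr.ifr_flags = IFF_TAP | IFF_NO_PI;        /* layer-2 frames, no header */
    strncpy(ifr.ifr_name, name, IFNAMSIZ - 1);
    if (ioctl(fd, TUNSETIFF, &ifr) < 0) { perror("TUNSETIFF"); close(fd); return -1; }
    return fd;
}

int main(void)
{
    int tap = tap_open("vmnic0");   /* hypothetical interface name; bring it
                                     * up with "ip link set vmnic0 up" before
                                     * expecting frames to go anywhere */
    if (tap < 0) return 1;

    /* A frame the guest's virtual NIC "transmitted" (normally taken from
     * the emulated e1000/vmxnet TX ring). Here: a dummy broadcast frame. */
    unsigned char frame[60] = {
        0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,          /* dst MAC: broadcast */
        0x00,0x50,0x56,0x00,0x00,0x01,          /* src MAC: the VM's  */
        0x08,0x06                               /* EtherType: ARP     */
    };

    /* VM -> host: write the frame; the host bridge forwards it onward.
     * Host -> VM would be read(tap, ...) fed back into the guest's RX ring. */
    if (write(tap, frame, sizeof(frame)) < 0)
        perror("write");

    close(tap);
    return 0;
}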


LAN Switching Challenge!

 Suppose VM_A and VM_B need to communicate


 They are on the same VLAN and subnet

[Diagram: VM A (MAC address A) and VM B (MAC address B) run on the same hypervisor, which is connected by a single physical link to a physical switch]
