
VMware Performance Troubleshooting

Presented by Chris Kranz

Topics Covered
Introduction
Root Cause Analysis
Performance Characteristics
CPU
Networking
Memory
Disk
Virtual Machine Optimisation
esxtop
vm-support
Service Console
Resource Groups
Design Guidelines
Capacity Planner Limitations and Cautions
Conclusion
Reference Articles

Introduction
Multiple layers of virtualisation are used to increase service levels, availability and manageability. However, those same layers often mask performance and configuration issues, making them harder to troubleshoot and correct. The worst outcome is that performance issues after a virtualisation project lead to the perception that VMware means reduced performance, undermining future confidence in VMware.

Performance Basics

Virtual Machine Resources


CPU, Memory, Disk, Networking

Resource Maximums

Resource                  Guest    Host
Host Logical Processors   N/A      64
Virtual CPUs              8        N/A
Virtual CPUs per Core     N/A      20
Memory                    256GB    1TB

http://www.vmware.com/pdf/vsphere4/r40/vsp_40_config_max.pdf

Typical Host
vSphere 1U host: 2 x quad-core CPUs, 32-64GB RAM

Typical: 3 VMs per core, 24 VMs per host
Each VM with 2GB of RAM = 48GB of RAM

Root Cause Analysis

http://www.vmware.com/resources/techresources/10066

Root Cause ...

Monitoring performance: do not rely on guest tools alone. They can show high CPU and memory utilisation and can measure the latency and throughput of disk and network interfaces, but:

The guest is unaware of the virtualisation workload
Guest OSs account for time differently when virtualised
The guest has no visibility of the resources actually available

Use the virtualisation layer to diagnose the cause.

Performance Analysis Tools
esxtop (service console only)
resxtop (remote command line utilities)
Performance graphs in vCenter

esxtop
esxtop can be run:

Interactively
In batch mode (e.g. esxtop -a -b > analysis.csv); load the batch output into Windows perfmon or MS Excel (a remote sketch follows this slide)

Two keys to remember:

h : help
f : fields to display
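A minimal sketch of the same capture taken remotely with resxtop from the remote command line utilities (the host name and user are placeholders):

    resxtop --server esx01.example.com --username root -b -a -n 60 > analysis.csv
    # prompts for the password, then writes 60 samples of all counters to a CSV file

The CSV can then be imported into Windows perfmon or opened in Excel for offline analysis.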

esxtop basics

The default screen shows the host resources at the top, then one row per resource pool, virtual machine or world, together with the number of worlds in each group.

Performance Characteristics

CPU: slow processing, high CPU wait
Memory: slow processing, disk swapping
Networking: packet loss, slow network
Disk: log stalls, disk queues

All of these manifest as slow application performance, reduced user experience, and potentially data loss and corruption.

CPU

ESX Scheduler
Basic world states: Ready / Run / Wait

The Service Console and each virtual machine are scheduled as worlds

CPU states reported per VM: Ready / Usage / Wait

Limits / Shares / Reservations control how CPU time is allocated

esxtop

CPU

High %RDY combined with high %USED can imply CPU over-commitment

PCPU(%): physical CPU utilisation
%USED: utilisation
%RDY: ready time
%RUN: run time
%WAIT: wait and idle time
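As an illustrative reading with hypothetical numbers: in the rolled-up view %RDY is summed across a VM's worlds, so a 2-vCPU VM reporting %RDY = 40 is averaging roughly 20% ready time per vCPU and is clearly queuing for physical CPU, while the same figure on an 8-vCPU VM (about 5% per vCPU) is far less of a concern.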

VI Client

CPU

Compare Used Time with Ready Time on the CPU chart: when Ready Time climbs towards Used Time, CPU over-commitment is the likely cause.

Further investigation:
%MLMTD in esxtop shows whether the VM has been held back by a limit

Memory

VMware Memory Management

Transparent Page Sharing
VMware Tools balloon driver (forces the VM to swap to disk)
Virtual machine page file

Memory
Ballooning vs. Swapping
The balloon driver pressures the guest OS into paging out the pages the guest chooses to its own page file; ESX swapping will swap any of the VM's pages to the host-level swap file, regardless of what the guest is doing.

Memory ballooning can be disabled (value 0) or capped on a per-virtual-machine basis using sched.mem.maxmemctl (see the sketch below).

The default is 65% and can be controlled at host level. This only becomes an issue under resource contention (or for latency-sensitive VMs, e.g. Citrix).
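A minimal sketch of the per-VM setting, assuming a 512MB cap is wanted (the value is in MB and is only an example; "0" disables ballooning entirely):

    sched.mem.maxmemctl = "512"   # .vmx advanced parameter: maximum balloon size in MB

The parameter can be added via the VM's Edit Settings > Options > Advanced > Configuration Parameters dialog, or by editing the .vmx file while the VM is powered off.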

Memory - Host
The VI Client shows the memory usage of the host, calculated as consumed memory + overhead memory + Service Console memory.
The performance charts are a very good way of showing the virtual machine memory breakdown: consumed memory, ballooned memory, shared memory and swapped memory.

Memory - Guest
Host memory = consumed + overhead memory
Guest memory = active memory for the guest OS

Memory - Guest Overhead

Memory
Virtual Machine Memory Metrics - VI Client

Memory Active (KB): physical pages touched recently by the VM
Memory Usage (%): active memory / configured memory
Memory Consumed (KB): machine memory mapped to the virtual machine, including its portion of shared pages; doesn't include overhead memory
Memory Granted (KB): physical pages allocated to the virtual machine; may be less than configured memory; includes shared pages; doesn't include overhead memory
Memory Shared (KB): physical pages shared with other virtual machines
Memory Balloon (KB): physical memory ballooned from the virtual machine
Memory Swapped (KB): physical memory in the swap file (approx. swap out - swap in); swap out and swap in are cumulative
Overhead Memory (KB): machine pages used for virtualisation

Memory
Host Memory Metrics - VI Client

Memory Active (KB): physical pages touched recently by the host
Memory Usage (%): active memory / configured memory
Memory Consumed (KB): total host physical memory minus free memory on the host; includes overhead and Service Console memory
Memory Granted (KB): sum of physical pages allocated to all virtual machines; doesn't include overhead memory
Memory Shared (KB): physical pages shared by virtual machines on the host
Shared Common (KB): total machine pages used by shared pages
Memory Balloon (KB): machine pages ballooned from virtual machines
Memory Swap Used (KB): physical memory in the swap file (approx. swap out - swap in); swap out and swap in are cumulative
Overhead Memory (KB): machine pages used for virtualisation

esxtop

Memory

PMEM: total physical memory breakdown
VMKMEM: memory managed by the VMkernel
COSMEM: Service Console memory breakdown
PSHARE: page-sharing statistics
SWAP: swap statistics
MEMCTL: balloon driver data

Memory

esxtop / VI Client metrics: Virtual Machines

Active Memory: TCHD
Memory Usage: %ACTV
Consumed Memory: N/A
Memory Granted: N/A (SZTGT and CMTTGT represent memory scheduler targets)
Memory Shared: SHRD (plus SHRDSVD per VM; COW stats must be enabled in esxtop)
Memory Balloon: MCTLSZ
Memory Swapped: SWCUR (SWR/s and SWW/s are rates)
Overhead Memory: OVHD and OVHDMAX

Memory

esxtop / VI Client metrics: Host Usage

Memory Active: N/A (try /proc/vmware/sched/mem-verbose)
Memory Usage: N/A (try /proc/vmware/sched/mem-verbose)
Memory Consumed: PMEM total minus PMEM free
Memory Granted: N/A (SZTGT and CMTTGT represent memory scheduler targets)
Memory Shared: PSHARE (shared)
Memory Shared Common: PSHARE (common)
Memory Balloon: MEMCTL
Memory Swap Used: SWAP (r/s and w/s are rates)
Overhead Memory: OVHD and OVHDMAX

Memory

VI Client memory usage graph

Memory

Troubleshooting Memory usage issues

Networking
Switch-assisted teaming (IP hash)
VLAN trunking
Flow control (full)
Speed and duplex (1000Mb / full)
PortFast enabled
BPDU disabled
STP disabled
Link state tracking
Jumbo frames

Network configuration is more likely to be to blame than resource contention (a quick check from the service console follows).
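A minimal sketch for verifying speed and duplex from the ESX service console (vmnic0 is only an example; leave auto-negotiation in place where the switch supports it properly):

    esxcfg-nics -l                      # list physical NICs with their current speed and duplex
    esxcfg-nics -s 1000 -d full vmnic0  # force 1000Mb/full on vmnic0 if auto-negotiation misbehaves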

esxtop

Networking
Transmit and Receive in Mb/s Transmit and Receive in Packets

esxtop

Networking

Dropped Packets Transmitted

Dropped Packets Received

Disk
Varying factors:
File system performance
Disk subsystem configuration (SAN, NAS, iSCSI, local disk)
Disk caching
Disk formats (thick, sparse, thin)

ESX storage stack:
Different latencies for different disks
Queuing within the kernel
K: kernel, D: device, G: guest

Disk

VI Client statistics

Quite coarse statistics:
Disk read / write rate (KB/s)
Disk usage: sum of read and write bandwidth (KB/s)
Disk read / write requests (per 20-second interval)
Bus resets / command aborts (per 20-second interval)
Per-LUN or aggregated stats

Disk

esxtop statistics

Aggregated stats similar to the VI Client: disk reads / writes per second (READS/s, WRITES/s), MB read / written per second (MBREAD/s, MBWRTN/s)
Latency statistics: kernel average per command (KAVG/cmd), device average per command (DAVG/cmd), guest average per command (GAVG/cmd)
Queuing information: adapter queue length (AQLEN), LUN queue length (LQLEN), VMkernel queued commands (QUED), active commands (ACTV), queue utilisation (%USD = ACTV / LQLEN)
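A quick worked reading with hypothetical numbers: 30 active commands (ACTV) against a LUN queue depth (LQLEN) of 32 gives %USD = 30 / 32, roughly 94%, so the LUN queue is close to saturation and further commands will start to accumulate in QUED; at that point spread the workload across more LUNs or paths, or review the queue depth with the storage vendor.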

Disk
SAN rough estimates
Looking purely at a single ESX host, roughly:
Throughput (MBps) = (Outstanding IOs * Block size in KB) / latency in msec

FC, rough maximums:

Effective link bandwidth = ~80-90% of real bandwidth
Effective (2Gbps) = 200-230 MBps
Effective (4Gbps) = 410-460 MBps
Effective (8Gbps) = 820-920 MBps

iSCSI / NFS / FCoE, rough maximums:

Effective link bandwidth = ~70-80% of real bandwidth
Effective (1GigE) = 90-100 MBps
Effective (10GigE) = 900-1000 MBps
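A worked example with illustrative numbers: 32 outstanding I/Os of 32KB completing at an average latency of 10 msec gives (32 * 32) / 10, roughly 102 MBps, which is already at the ceiling of a single 1GigE iSCSI or NFS link; at 5 msec the same queue would drive around 205 MBps and would need at least an effective 2Gbps FC link before the fabric becomes the bottleneck.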

Disk
Desired Latency Calculations
Desired latency in msec <= (Outstanding IOs * Block size in KB) / Throughput per host

Example:
Number of hosts: 16
Effective link bandwidth: 90 MBps
Throughput per host: 90 / 16 = 5.6 MBps
Desired latency: (32 * 32) / 5.6 = 182.86 msec

Workload                  Desired Latency (msec)   Observed Latency (msec)   Throughput Drop?   Throughput (MBps)
Cached Sequential Read    182.86                   ~350                      Yes                ~45
Cached Sequential Write   182.86                   ~180                      No                 ~90

Disk
VI Client

SAN cache enabled: high throughput

SAN cache disabled: poor throughput

Disk
esxtop

Latency is quite high

After enabling the cache, latency is reduced

Virtual Machine Optimisation


Deploy all machines from an optimised template!

VMware Tools MUST be installed
The disks MUST be block-aligned to the storage (even when using NFS and SAN)
Where possible, always separate data disks from OS disks
Windows performance settings should be optimised for application performance
Guest operating system timeouts should be set as defined by the SAN vendor
The page file should be separated where appropriate (this can impact VMware SRM, however)
Unused Windows services should be disabled (wireless configuration, print spooler, audio, etc.)
Last-access time updates should be disabled (unless required)
Logging of the VM should be disabled (enable it only for troubleshooting)
Remove any unused virtual hardware (floppy drives, USB, etc.)
Disable screen savers and power-saving features, including the logon screen saver
Enable Remote Desktop and avoid using the VI Client console for remote administration
Install standard applications into the template (BGInfo, anti-virus, any host agents, etc.)
Multiple vCPUs should be allocated sparingly

Virtual Machine Optimisation


Block alignment is vital to good disk performance!
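A minimal sketch for aligning a new data disk inside an older Windows guest (Windows Server 2003-era; Windows Server 2008 and later align at 1MB by default). The 1024KB offset and disk number are examples only; use the offset recommended by the storage vendor:

    diskpart
    DISKPART> select disk 1
    DISKPART> create partition primary align=1024
    DISKPART> assign letter=D
    DISKPART> exit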

esxtop
Command Options when inside esxtop

Command   Action
space     Update the display
?         Show the help page
q         Quit
f/F       Add or remove columns (fields) from the display
o/O       Change the order in which fields are displayed
s         Change the update interval
#         Change the number of instances to display
W         Write the configuration to a file
e         Expand / roll up CPU stats
V         View only VM instances
L         Change the length of the NAME field
m         Display memory statistics
n         Display network statistics
i         Display interrupt statistics
d         Display disk adapter statistics
u         Display disk device statistics
v         Display disk VM statistics

esxtop
Command Line Options from the console
Command   Action
-b        Batch mode
-l        Locks the objects available in the first snapshot
-s        Enables secure mode
-a        Show all statistics
-c        Sets the configuration file
-R        Enables replay mode (used with vm-support -S)
-d        Sets the update interval
-n        Runs esxtop for n iterations
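Putting the options together, an illustrative one-hour capture (the interval, iteration count and file name are examples):

    esxtop -b -a -d 10 -n 360 > /tmp/esxtop-capture.csv   # 360 samples at 10-second intervals, all counters, CSV output

The resulting file can be post-processed in Windows perfmon or Excel as described earlier.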

esxtop

Expand the default window size for your session to get all statistics

vm-support
Creates a packaged zip file containing the following sections:
boot: contains the GRUB configuration
etc: contains the Console OS configuration files (cron, tcpwrappers, syslog, etc.)
proc: contains much of the hardware configuration, modules and variables
tmp: contains a lot of the ESX-specific configuration output
var: contains log files and any core dumps
vmfs: contains the structure of the VMFS datastores
esx3-installation (where appropriate): contains a copy of the previous ESX 3 configuration variables

vm-support
Using vm-support to extract performance information:
vm-support -S -d <duration> -i <interval>
<duration> and <interval> are in seconds
The output from this can then be replayed in esxtop for review after it has been extracted:
esxtop -R <path_to_vm-support_output>
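An illustrative run (the duration, interval and path are placeholders): capture five minutes of snapshots at ten-second intervals, then replay the extracted bundle:

    vm-support -S -d 300 -i 10
    # extract the resulting archive, then:
    esxtop -R /tmp/esx-hostname-bundle/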

Service Console Performance


Multiple Service Console networks for network resiliency
Increase Service Console memory up to 800MB
Use the host agents supplied by your vendors
Apply the storage vendor's recommended tweaks, such as HBA queue depth and I/O timeouts
Minimise use of the VI Client console; use RDP or SSH instead
Properly sized vCenter server, on a 64-bit OS where possible

Resource Groups
Dynamically reallocate resource shares

When an additional VM is added, shares allow you to overcommit resources with a graceful re-allocation

When a VM is removed, the freed resources are spread across all remaining VMs

Design Guidelines
Full resilience / multiple paths
Standard configuration across all aspects (ESX, storage, networking, etc.)
Standard naming conventions
Learn from others' mistakes
Follow the guidelines in vendors' best practices
Rule out the basics before requesting support

Capacity Planner & P2V Cautions and Limitations


Peak CPU usage can sometimes be misleading
Back-end storage system performance
P2V machines will require block-aligning to the storage
P2V machines will still require guest OS optimisation

Conclusion
Performance issues can often be traced with simple root cause analysis using basic tools (VI Client / esxtop)
Performance tools help diagnose issues and help rule out non-issues
Performance tools are useful in different contexts, not always either/or:
Real-time data and troubleshooting: esxtop
Historical data: VI Client
Coarse resource / cluster usage: VI Client
Detailed resource usage: esxtop
Combine information from various tools to get a complete picture
Always benchmark your systems first so you know what optimal performance you can expect

Reference Articles
http://www.vmware.com/pdf/esx3_memory.pdf
http://www.vmworld.com/docs/DOC-2370
http://blogs.vmware.com/performance/
http://communities.vmware.com/docs/DOC-5420
http://kb.vmware.com/kb/1008205
http://communities.vmware.com/community/vmtn/general/performance
http://www.vmware.com/products/vmmark/
http://www.vmware.com/pdf/vsphere4/r40/vsp_40_san_cfg.pdf
http://www.vmware.com/pdf/vsphere4/r40/vsp_40_iscsi_san_cfg.pdf
http://www.vmware.com/pdf/vsphere4/r40/vsp_40_resource_mgmt.pdf
http://www.vmware.com/pdf/GuestOS_guide.pdf
http://www.vmware.com/resources/techresources/10066
http://www.vmware.com/resources/techresources/10059
http://www.vmware.com/resources/techresources/10062
