
Data Collection Tools

Module 17 Data ONTAP 8.0 7-Mode Administration

Module Objectives
By the end of this module, you should be able to:
Use the sysstat, stats, statit, and options commands
Describe the factors that affect RAID performance
Execute commands to collect data about write throughput
Execute commands to verify the operation of hardware, software, and network components
Identify commands and options used to obtain configuration and status
2009 NetApp. All rights reserved.

System Health
Performance problems can originate from multiple sources. To avoid some of these problems, check or monitor the following:
Disk configuration
Disk status
Write performance
Read performance
RAID configuration
Connectivity configuration
Performance measures


Disk Status
Monitor disks:
shelfchk
led_on diskid and led_off diskid (priv set advanced command)

Storage Health Monitor:
Simple storage system management service
Automatically initiates during system boot
Provides background monitoring of individual disk performance
Detects impending disk problems before they actually occur
disk shm_stats (priv set advanced command)

Syslog Messages
shm: disk has reported a predicted failure (PFA) event: disk XX, serial_number XXXX
shm: link failure detected, upstream from disk: id XX, serial_number XXXXX
shm: disk I/O completion times too long: disk XX, serial number XXXXX
shm: possible link errors on disk: id XX, serial number XXXXX
shm: disk returns excessive recovered errors: disk XX, serial number XXXXX
shm: intermittent instability on the loop that is attached to Fibre Channel adapter: id XXX, name XXXXX

Write Performance


Write Performance Commands


Use the following commands to research write performance:

Command   Function
sysstat   Displays current statistics
statit    Displays disk utilization
stats     Displays performance data

Write Performance: sysstat Command


system> sysstat -c 10 -s 5
 CPU   NFS  CIFS  HTTP    Net kB/s    Disk kB/s    Tape kB/s  Cache
                          in   out   read write   read write    age
  2%     0     0     0     0     0      9    23      0     0    >60
  0%     0     0     0     0     0      0     0      0     0    >60
  5%     0     0     0     0     0     21    27      0     0    >60
  1%     0     0     0     0     0      0     0      0     0    >60
  5%     0     0     0     0     0     20    28      0     0    >60
  1%     0     0     0     0     0      0     0      0     0    >60
  4%     0     0     0     0     0     21    26      0     0    >60
  1%     0     0     0     0     0      0     0      0     0    >60
  5%     0     0     0     0     0     22    27      0     0    >60
  0%     0     0     0     0     0      0     0      0     0    >60
--
Summary Statistics (10 samples 5.0 secs/sample)
     CPU   NFS  CIFS  HTTP    Net kB/s    Disk kB/s    Tape kB/s  Cache
                              in   out   read write   read write    age
Min   0%     0     0     0     0     0      0     0      0     0    >60
Avg   2%     0     0     0     0     0      9    13      0     0    >60
Max   5%     0     0     0     0     0     22    28      0     0    >60
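
The summary block can be cross-checked against the ten sample rows. The Python sketch below is illustrative only: the column names and helper functions are assumptions introduced here (they are not part of Data ONTAP), and it assumes the averages in the summary are rounded to the nearest whole number.

```python
from statistics import mean

# The ten sample rows from the sysstat output above. Field order:
# CPU, NFS, CIFS, HTTP, Net in, Net out, Disk read, Disk write,
# Tape read, Tape write, Cache age.
SAMPLE = """\
2% 0 0 0 0 0 9 23 0 0 >60
0% 0 0 0 0 0 0 0 0 0 >60
5% 0 0 0 0 0 21 27 0 0 >60
1% 0 0 0 0 0 0 0 0 0 >60
5% 0 0 0 0 0 20 28 0 0 >60
1% 0 0 0 0 0 0 0 0 0 >60
4% 0 0 0 0 0 21 26 0 0 >60
1% 0 0 0 0 0 0 0 0 0 >60
5% 0 0 0 0 0 22 27 0 0 >60
0% 0 0 0 0 0 0 0 0 0 >60
"""

# Hypothetical column names chosen for this sketch.
COLUMNS = ["cpu", "nfs", "cifs", "http", "net_in", "net_out",
           "disk_read", "disk_write", "tape_read", "tape_write"]


def parse_rows(text):
    """Turn sysstat sample rows into dictionaries of integer values."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        # Skip anything that is not an 11-field data row starting with "N%".
        if len(fields) != 11 or not fields[0].endswith("%"):
            continue
        values = [int(fields[0].rstrip("%"))] + [int(f) for f in fields[1:10]]
        rows.append(dict(zip(COLUMNS, values)))
    return rows


def summarize(rows, column):
    """Return (min, rounded average, max) for one column."""
    values = [row[column] for row in rows]
    return min(values), round(mean(values)), max(values)


rows = parse_rows(SAMPLE)
for name in ("cpu", "disk_read", "disk_write"):
    print(name, summarize(rows, name))
```

For the sample above, the CPU, disk read, and disk write columns reduce to the same Min, Avg, and Max values that sysstat prints in its summary.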


stats: System Performance


The stats command displays statistical data about every aspect of the storage system.
Statistics returned by the stats command are organized in the following hierarchy:
Objects: Any entity in the system, physical or logical (including volumes, aggregates, qtrees, disks, and NICs)
Instances: A specific occurrence of an object, such as a volume called nfsflex, an aggregate called aggr1, or a disk identified as 0b.17
Counters: The statistics associated with particular objects and instances

stats: Examples of Objects and Instances


Examples of objects:
Aggregate, Volume, Qtree, Disk, CIFS, NFS, LUN

Examples of instances:
/vol/vol0, /vol/nfstree, 0b.18, /vol/flex1/lun_test

Examples of counters:
cifs_ops, cifs_latency, cifs_read_ops

The stats Command


The stats command can be executed in one of three ways, based on the frequency of displays:
Once: Current counter values are displayed
stats show
Repeating: Counter values are displayed at a fixed interval
stats show -i 1
Period: Counter values are gathered over a single period of time and then displayed
stats start, then stats stop

The stats Command (Cont.)


Use stats list counters to see what is available
The statistics available through the stats infrastructure can also be reached with other tools such as perfmon, perfstat, and Operations Manager
The following are examples of stats commands:
system> stats show cifs:cifs:cifs_latency
cifs:cifs:cifs_latency:1.92m
system> stats show volume:vol0:write_latency
volume:vol0:write_latency:171.50us
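
Each line of stats show output follows the object:instance:counter:value hierarchy, so it can be split mechanically. The Python sketch below is a minimal illustration, not a NetApp tool: StatSample, parse_stat_line, and latency_to_us are hypothetical names, it assumes object and instance names contain no colon, and handling of an "ms" suffix is an assumption (only "us" appears verbatim in the sample output).

```python
from typing import NamedTuple


class StatSample(NamedTuple):
    """One line of stats output, split along the object hierarchy."""
    obj: str
    instance: str
    counter: str
    value: str


def parse_stat_line(line: str) -> StatSample:
    # Split on the first three colons only, so the value stays intact.
    # Assumption: object and instance names contain no colon themselves.
    obj, instance, counter, value = line.strip().split(":", 3)
    return StatSample(obj, instance, counter, value)


def latency_to_us(value: str) -> float:
    """Convert a latency string to microseconds."""
    # "us" appears in the sample output; "ms" handling is an assumption.
    if value.endswith("us"):
        return float(value[:-2])
    if value.endswith("ms"):
        return float(value[:-2]) * 1000.0
    raise ValueError(f"unrecognized latency unit in {value!r}")


sample = parse_stat_line("volume:vol0:write_latency:171.50us")
print(sample.obj, sample.instance, sample.counter, latency_to_us(sample.value))
```

Keeping the raw value string in the tuple and converting it separately means counters that are not latencies can reuse the same parser.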


Client-Side Tools: Windows Command


The Windows perfmon utility:
Connects to the storage system from Windows
Requires CIFS to be licensed and running on the storage system
Receives output from the stats command and graphs the data
NOTE: To view the Add Counters screen, in the Performance window, click the plus sign (+).


Read Performance
Data ONTAP is optimized for write performance
Read performance could decrease over time
NOTE: Efficient use of cache can offset some disk performance issues.

Optimization:
To measure how well the data layout is optimized:
reallocate measure [vol | file]

To improve the layout when the measurement indicates a problem:
reallocate start <pathname>

RAID Configuration


RAID Groups

[Figure: RAID groups by volume: /vol0 (rg0), /vol1 (rg0), /vol2 (rg0, rg1)]

RAID Group Size and Composition


The following are some examples of poor RAID configuration choices:
Unnecessarily using multiple RAID groups
Using mixed disk sizes
Configuring RAID groups with wide variations in capacity
Configuring RAID groups with only one or two data disks
Configuring RAID groups with a number of disks larger than the default

Initial RAID Group Configuration


Limit the number of disks in a RAID group to the recommended numbers
Ensure that each RAID group in an aggregate has approximately the same capacity
Ensure that each RAID group in an aggregate has at least three data disks
Use disks of the same size within a RAID group to optimize write performance
Use RAID-DP to protect against disk failures
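
Several of these guidelines can be checked mechanically against a planned layout. The Python sketch below is only an illustration under stated assumptions: RaidGroup and check_raid_groups are hypothetical names, the data disk sizes are supplied by hand rather than read from a storage system, and the 10% capacity tolerance is an arbitrary stand-in for "approximately the same capacity".

```python
from dataclasses import dataclass


@dataclass
class RaidGroup:
    """Hypothetical description of one RAID group in an aggregate."""
    name: str
    data_disk_sizes_gb: list  # sizes of the data disks only (parity excluded)


def check_raid_groups(groups, capacity_tolerance=0.10):
    """Return warnings for one aggregate's RAID groups.

    capacity_tolerance is an assumed threshold, not a NetApp figure.
    """
    warnings = []
    for group in groups:
        if len(group.data_disk_sizes_gb) < 3:
            warnings.append(f"{group.name}: fewer than three data disks")
        if len(set(group.data_disk_sizes_gb)) > 1:
            warnings.append(f"{group.name}: mixed disk sizes")

    # Compare the largest and smallest group capacity in the aggregate.
    capacities = [sum(group.data_disk_sizes_gb) for group in groups]
    if capacities and max(capacities) > 0:
        spread = (max(capacities) - min(capacities)) / max(capacities)
        if spread > capacity_tolerance:
            warnings.append("aggregate: RAID group capacities vary widely")
    return warnings


layout = [RaidGroup("rg0", [300] * 6), RaidGroup("rg1", [300, 300])]
for warning in check_raid_groups(layout):
    print(warning)
```

In this example rg1 has only two data disks and a third of the capacity of rg0, so both the data disk check and the capacity check report a warning.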


Adding Disks to Existing RAID Groups


Add RAID groups when the applied load is stressing the drives in the current array
Add RAID groups and disks before the file system or aggregate is 80% to 90% full
Add disks in groups
Plan data expansion so that no fewer than three data disks are used for any RAID group

Monitoring Connectivity


Connectivity
Use the following to monitor connectivity:
MAC
ifconfig, ifstat, arp

TCP/IP
ifconfig, /etc/rc and /etc/hosts, ping, netstat -r

Protocols
nfsstat, cifs stat, nbtstat

Performance Measures


Measuring NFS Performance


options nfs.per_client_stats.enable [on|off]
Recommended to disable when not using nfsstat -l
This display shows the breakdown on this mountpoint of lookups, reads, writes, and all operations. The average deviation and the settings for retransmissions of each type are also displayed. The output includes the server name and address, mount flags, current read and write sizes, retransmission count, and timers used for dynamic retransmission.

Data ONTAP NFS Output - Command: nfsstat -l


/n/homesystem from homesystem.corp.com:/home
Flags: vers=2,proto=udp,auth=unix,hard,intr,dynamic,rsize=8192,wsize=8192,retrans=5
Lookups: sttr=7(17ms), dev=4(20ms), cur=2(40ms)
Reads: sttr=12(30ms), dev=4(20ms), cur=3(40ms)
Writes: sttr=21(52ms), dev=5(25ms), cur=5(100ms)
All: sttr=7(7ms), dev=4(20ms), cur=2(40ms)

Round trip response times for specific NFS operations are displayed.
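
The per-operation timer lines in this output have a regular shape (a label, then name=value(time) groups), so they can be pulled apart with a small regular expression. The Python sketch below is a hedged illustration: parse_timer_line is a hypothetical helper, and the pattern assumes every timer is printed exactly as in the sample above, with times in whole milliseconds.

```python
import re

# Matches groups such as "cur=5(100ms)": a name, a raw value, and a time in ms.
TIMER = re.compile(r"(\w+)=(\d+)\((\d+)ms\)")


def parse_timer_line(line):
    """Split one timer line into its label and a dict of (raw, ms) pairs."""
    label, _, rest = line.partition(":")
    timers = {name: (int(raw), int(ms)) for name, raw, ms in TIMER.findall(rest)}
    return label.strip(), timers


label, timers = parse_timer_line("Writes: sttr=21(52ms), dev=5(25ms), cur=5(100ms)")
print(label, timers)
```

Because the pattern matches any timer name, the same helper handles the Lookups, Reads, Writes, and All lines without changes.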


Measuring CIFS Performance


The number in parentheses is the total number of operations since smb_hist statistics were last reset. The bucket labels (0ms, 1ms, and so on) are millisecond (ms) time stamps for operations.

Analyzing smb_hist output
CIFS request time processing: (46457) - milliseconds units
  0ms    1ms    2ms    3ms    4ms    5ms    6ms    7ms
13175  17752   5111    664    451    478    570    568
<16ms  <24ms  <32ms  <40ms  <48ms  <56ms  <64ms unused
 4039   2309    569    165     61     21     10      0

Every other row displays the number of operations that took place in the interval in the row above it. In this example, 13,175 operations happened in less than 0.5 ms.

The time interval window lies halfway between the values for adjacent columns. In this example, 165 operations occurred in the 36-ms to 44-ms window.
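
One way to read such a histogram is as a cumulative distribution: what share of operations finished within a given bucket. The Python sketch below copies the bucket counts from the smb_hist example; BUCKETS and count_through are hypothetical names. Note that the buckets shown sum to 45,943, which is lower than the 46,457 in the header line, so the sketch uses the sum of the listed buckets as its denominator.

```python
# (label, count) pairs copied from the smb_hist example, in increasing time order.
BUCKETS = [
    ("0ms", 13175), ("1ms", 17752), ("2ms", 5111), ("3ms", 664),
    ("4ms", 451), ("5ms", 478), ("6ms", 570), ("7ms", 568),
    ("<16ms", 4039), ("<24ms", 2309), ("<32ms", 569), ("<40ms", 165),
    ("<48ms", 61), ("<56ms", 21), ("<64ms", 10),
]


def count_through(buckets, last_label):
    """Sum the counts of every bucket up to and including last_label."""
    total = 0
    for label, count in buckets:
        total += count
        if label == last_label:
            return total
    raise KeyError(last_label)


listed_total = sum(count for _, count in BUCKETS)
fast = count_through(BUCKETS, "7ms")
print(f"{fast} of {listed_total} listed operations ({fast / listed_total:.1%}) "
      "fall in the 0ms to 7ms buckets")
```

A growing share of operations in the higher buckets over successive samples would be a reason to look more closely at CIFS response times.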


Obtaining Statistics
The statit command:
Is an advanced-mode command used for more detailed analysis of system performance
Gathers per-second statistics averaged over the length of time it is running in the background
Shows statistics representing all physical and some logical objects on the storage system
Collects data that mostly represents rates at which events are happening

Using the statit Command


To obtain statistics using the statit command, complete the following steps:
1. To enter advanced privilege mode, enter:
priv set advanced

2. To begin collecting statistics, enter:
statit -b

3. After 30 seconds (or as long as necessary), to end statistics collection and include NFS statistics, enter:
statit -e -n

4. To return to normal admin privilege mode, enter:
priv set admin

Obtaining Statistics
The report generated is divided into the following statistics sections:
CPU
Multiprocessor
CSMP domain switches
Miscellaneous
WAFL
RAID
Network interface
Disk
Aggregate
Spares and other disks
FCP
iSCSI
Tape

CPU Statistics
CPU Statistics
506.934263 time (seconds)         100 %
275.044317 system time             54 %
 23.412966 rupt time                5 %   (7022 rupts x 0 usec/rupt)
251.466451 non-rupt system time    50 %
271.837944 idle time               44 %
439.543653 time in CP              92 %   100 %
 21.837230 rupt time in CP                  5 %   (132 rupts x 0 sec/rupt)

Multiprocessor Statistics
Multiprocessor Statistics (per second)
                         cpu0         cpu1        total
sk switches           1378.09        46.82      1424.91
hard switches         1175.27        29.15      1204.42
domain switches        103.89        16.08       119.96
CP rupts                 0.00         0.00         0.00
nonCP rupts            100.00         0.00       100.00
nonCP rupt usec          0.00         0.00         0.00
Idle               1000000.00   1000000.00   2000000.00
kahuna                   0.00         0.00         0.00
network                  0.00         0.00         0.00
storage                  0.00         0.00         0.00
exempt                   0.00         0.00         0.00
raid                     0.00         0.00         0.00
target                   0.00         0.00         0.00
netcache                 0.00         0.00         0.00
netcache2                0.00         0.00         0.00

Miscellaneous Statistics
Miscellaneous Statistics (per second)
 1893.73 hard context switches
    0.00 NFS operations
    0.00 CIFS operations
    0.00 HTTP operations
    0.00 NetCache URLs
    0.00 streaming packets
    0.00 network KB received
    0.00 network KB transmitted
   18.16 disk KB read
   61.30 disk KB written
    0.28 NVRAM KB written
    0.00 nolog KB written
    0.00 WAFL bufs given to clients
    0.00 checksum cache hits ( 0%)
    0.00 no checksum - partial buffer
    0.00 DAFS operations
    0.00 FCP operations
    0.00 iSCSI operations

WAFL Rates
WAFL Statistics (per second)
    5.96 name cache hits ( 62%)
    3.69 name cache misses ( 38%)
   19.30 inode cache hits ( 100%)
    0.00 inode cache misses ( 0%)
   55.06 buf cache hits ( 100%)
    0.00 buf cache misses ( 0%)
    0.00 blocks read
    0.00 blocks read-ahead
    0.00 chains read-ahead
    0.00 blocks speculative read-ahead
    5.11 blocks written
    0.57 stripes written
    0.00 blocks over-written
    0.28 wafl_timer generated CP
    0.00 snapshot generated CP
    0.00 wafl_avail_bufs generated CP
    0.00 dirty_blk_cnt generated CP
    0.00 full NV-log generated CP
    0.00 back-to-back CP
    0.00 flush generated CP
    0.00 sync generated CP
    0.00 wafl_avail_vbufs generated CP
   55.06 non-restart messages
    0.00 IOWAIT suspends
  604852 buffers

Network Interface Statistics


Network Interface Statistics (per second)
iface  side    bytes  packets  multicasts  errors  collisions
e0     recv   171.69     2.55        0.00    0.00        0.00
       xmit   115.22     1.42        0.00    0.00        0.00
e9     recv     0.00     0.00        0.00    0.00        0.00
       xmit     0.00     0.00        0.00    0.00        0.00
e6     recv     0.00     0.00        0.00    0.00        0.00
       xmit     0.00     0.00        0.00    0.00        0.00
vh     recv     0.00     0.00        0.00    0.00        0.00
       xmit     0.00     0.00        0.00    0.00        0.00

Disk Statistics
Disk Statistics (per second)
ut% is the percent of time the disk was busy.
xfers is the number of data transfer commands issued per second.
xfers = ureads + writes + cpreads + greads + gwrites
chain is the average number of 4K blocks per command.
usecs is the average disk round trip time per 4K block.

disk     ut%  xfers  ureads--chain-usecs  writes--chain-usecs  cpreads-chain-usecs
/vol0/plex0/rg0:
8a.16      5   3.69    0.57   1.00 94500  ...
8a.21      4   3.12    0.57   1.00 39500  ...
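
The xfers relationship can be used to sanity-check a row or to infer how much traffic lies in the columns that the example elides. The Python sketch below is illustrative: total_xfers and non_user_read_xfers are hypothetical helpers, and the split of the remainder into writes (with cpreads, greads, and gwrites set to zero) is an assumption, since the sample rows are truncated after the ureads columns.

```python
def total_xfers(ureads, writes, cpreads, greads, gwrites):
    """xfers = ureads + writes + cpreads + greads + gwrites (all per second)."""
    return ureads + writes + cpreads + greads + gwrites


def non_user_read_xfers(xfers, ureads):
    """Transfers per second that are not user reads, rounded to two decimals."""
    return round(xfers - ureads, 2)


# Values for disk 8a.16 from the sample: xfers 3.69, ureads 0.57.
remainder = non_user_read_xfers(3.69, 0.57)
print("non-user-read transfers/sec for 8a.16:", remainder)

# Assumed split: the whole remainder is writes, the other categories are zero.
print("reconstructed xfers:", round(total_xfers(0.57, remainder, 0.0, 0.0, 0.0), 2))
```

A large gap between xfers and ureads on a lightly used system usually points at consistency point writes rather than client reads, which is why the per-category columns are worth reading alongside ut%.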


Aggregate, Spares, and Disk Statistics


Aggregate statistics:
Minimum  0  0.00  0.00  0.00  0.00  0.00  0.00
Mean     1  0.28  0.00  0.28  0.00  0.00  0.00
Maximum  5  3.69  0.57  3.12  0.00  0.00  0.00

Spares and other disks:
8b.16  2  1.70  1.70 1.00 10167  0.00 .... .  0.00 .... .  0.00 .... .  0.00 ..
8b.17  0  0.00  0.00 .... .  0.00 .... .  0.00 .... .  0.00 .... .  0.00 .... .
8b.18  0  0.00  0.00 .... .  0.00 .... .  0.00 .... .  0.00 .... .  0.00 .... .

FCP, iSCSI, and Tape Operations


FCP Statistics (per second)
    0.00 FCP Bytes recv
    0.00 FCP Bytes sent
    0.00 FCP ops

iSCSI Statistics (per second)
    0.00 iSCSI Bytes recv
    0.00 iSCSI Bytes xmit
    0.00 iSCSI ops

Interrupt Statistics (per second)
 2000.15 Clock
    3.97 Fast Enet
   47.68 FCAL
    4.54 int_22
    3.41 FCAL
 2059.75 total

Other Resources
For more information about data collection and performance, see the Fundamentals of Performance Analysis course. This advanced course shows you how to:
Analyze data using recommended methodology to correlate performance data into performance analysis information
Monitor performance using performance tools and establish a baseline of expected throughput and response times for storage systems under planned and increasing workloads
Perform capacity planning by monitoring performance and comparing baseline information over time to determine when a storage system will reach maximum capacity
Perform tuning for optimal performance for protocols such as CIFS, NFS, and SAN (including locating resources with tuning guidelines for database scenarios)
Perform bottleneck analysis

Module Summary
In this module, you should have learned to:
Use the sysstat, stats, statit, and options commands
Describe the factors that affect RAID performance
Execute commands to collect data about write throughput
Execute commands to verify the operation of hardware, software, and network components
Identify commands and options used to obtain configuration and status

Exercise
Module 17: Data Collection Tools
Estimated Time: 60 minutes

Check Your Understanding


What command(s) would you use to display disk utilization?
statit

What command(s) would you use to monitor connectivity?


ifconfig, ifstat, arp, ping, netstat

What command(s) would you use to help detect impending disk problems before they occur?
disk shm_stats